rnewson
Member

@rnewson rnewson commented Aug 24, 2025

Overview

Upgrades Nouveau to Lucene 10.

Existing indexes can still be queried and updated. New indexes will be built with Lucene 10 (CouchDB will automatically inject the version information into the design document).

A couch_scanner plugin will build Lucene 10 versions of all current indexes (sequentially) and update the design document on successful completion, making the upgrade automatic.

TODO:

  1. Additional tests.
  2. Documentation on enabling the scanner plugin.

NOTES

Lucene 10 raises the JDK requirement from 11 to 21.

Testing recommendations

TBD

Related Issues or Pull Requests

N/A

Checklist

  • Code is written and works correctly
  • Changes are covered by tests
  • Any new configurable parameters are documented in rel/overlay/etc/default.ini
  • Documentation changes were made in the src/docs folder
  • Documentation changes were backported (separated PR) to affected branches

@rnewson rnewson requested a review from nickva August 24, 2025 20:42
@rnewson rnewson force-pushed the lucene-10 branch 7 times, most recently from ffbe402 to 92b38b2 Compare August 28, 2025 09:43
@rnewson rnewson force-pushed the lucene-10 branch 2 times, most recently from 7b26d9f to 769bc65 Compare September 3, 2025 13:27
@rnewson rnewson marked this pull request as ready for review September 3, 2025 13:28
@jonasplaum
Contributor

Just to be sure: without the scanner plugin, the index version is not automatically written into the design doc, right? We have a whole set of Nouveau indexes and views in a single design doc. Adding the version would trigger a rebuild of all indexes, which we want to avoid.

@rnewson
Member Author

rnewson commented Sep 3, 2025

Avoiding query-blocking rebuilds is the highest priority of this work, so thank you for the comment.

The lucene_version field is injected on interactive edits (when you create or modify a design document). Since that edit would be a new index anyway, that it happens to be a Lucene 10 index is fine. It's one set of index builds. There might be some refinements to this before the PR is merged for the case where a design document defines multiple indexes.

The absence of the lucene_version field is taken to mean that the index, if it exists on the Nouveau server, is a Lucene 9 index. The index signature includes the Lucene version from 10 onward and omits it for version 9 (so that the sig remains unchanged on upgrade).
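To illustrate the signature rule described above, here is a minimal sketch. It is not the actual CouchDB implementation (which lives on the Erlang side); the class name, the choice of SHA-256, and the way the version is appended to the hash input are all assumptions made for the example. The only property it aims to show is that a missing version and an explicit 9 yield the same signature, while 10 yields a different one.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: none of these names come from the PR.
final class IndexSignatureSketch {
    static final int LEGACY_LUCENE_VERSION = 9;

    // A missing lucene_version is treated as a Lucene 9 index.
    static int effectiveVersion(Integer luceneVersion) {
        return luceneVersion == null ? LEGACY_LUCENE_VERSION : luceneVersion;
    }

    // The version only contributes to the hash input from 10 onward,
    // so signatures of existing Lucene 9 indexes do not change.
    static String signature(String indexDefinition, Integer luceneVersion) {
        int version = effectiveVersion(luceneVersion);
        String input = indexDefinition;
        if (version != LEGACY_LUCENE_VERSION) {
            input = input + "\n" + "lucene_version=" + version;
        }
        return sha256Hex(input);
    }

    static String sha256Hex(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every compliant Java platform must provide SHA-256.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}
```

With this shape, an index definition without the field and one with an explicit 9 map to the same on-disk index, which is what keeps an upgrade from triggering a rebuild.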

@rnewson
Member Author

rnewson commented Sep 3, 2025

I've pushed an update so that the lucene_version injection only occurs for new design documents, not updates to them.
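The injection rule can be sketched roughly as follows. The real hook is the Erlang `before_doc_update` callback; this Java version only mirrors the logic, and the class name, the `isNewDesignDoc` flag, and the map-based representation of the `nouveau` section are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the injection logic, not the PR's code.
final class LuceneVersionInjectionSketch {
    static final int CURRENT_LUCENE_VERSION = 10;

    // indexes maps index name -> index definition (the "nouveau" object
    // of a design document). Returns a copy; the input is not modified.
    static Map<String, Map<String, Object>> beforeDocUpdate(
            Map<String, Map<String, Object>> indexes, boolean isNewDesignDoc) {
        if (!isNewDesignDoc) {
            // Updates to existing design documents are left untouched,
            // so existing indexes are not rebuilt.
            return indexes;
        }
        Map<String, Map<String, Object>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Object>> entry : indexes.entrySet()) {
            Map<String, Object> definition = new LinkedHashMap<>(entry.getValue());
            // Only add the property if the author did not set one explicitly.
            definition.putIfAbsent("lucene_version", CURRENT_LUCENE_VERSION);
            result.put(entry.getKey(), definition);
        }
        return result;
    }
}
```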

1) the Dropwizard framework (https://dropwizard.io)
2) Java 11+
3) Lucene 9
2) Java 21+
Contributor

This makes sense, as that seems to be the lowest supported Java version for Lucene 10. But I think this will force us to move to Debian Trixie with the packaging if we go with using built-in Java packages. If we switch to 3rd-party ones it wouldn't matter. Switching to Trixie is probably the better option. Not a showstopper, of course, just leaving a note so we don't forget.

Member Author

Yes, that's the main upgrade pain, that Debian oldstable is too old.

Contributor

Debian Trixie is on the way, but we need to bring this in before then.

Contributor

If it's not too much to ask, please don't drop support for Debian 12. We have several customer installations where it is not possible to upgrade to Debian 13 in the near future, but we want to keep updating CouchDB. It would be no problem for us to install a 3rd-party JVM. Support for Trixie would also be nice for new systems.

Contributor

We are not dropping support for Debian 12. The only thing you need to do is install a 3rd-party JVM (Java 21+) on Debian 12.
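To confirm that whichever JVM ends up installed meets the Java 21+ requirement, a small check like the following can be run with the same `java` binary that will launch Nouveau. This is only a sketch: the class name is made up, and it relies on nothing but the standard `Runtime.version()` API (available since Java 10).

```java
// Hypothetical helper, not part of Nouveau.
final class JavaVersionCheck {
    static final int REQUIRED_FEATURE_VERSION = 21;

    static boolean isSupported(int featureVersion) {
        return featureVersion >= REQUIRED_FEATURE_VERSION;
    }

    public static void main(String[] args) {
        // feature() is the major release number, e.g. 21 for Java 21.0.4.
        int feature = Runtime.version().feature();
        if (isSupported(feature)) {
            System.out.println("Java " + feature + " is new enough for Lucene 10");
        } else {
            System.err.println("Java " + feature + " is too old; Java "
                    + REQUIRED_FEATURE_VERSION + "+ is required");
            System.exit(1);
        }
    }
}
```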

-include_lib("couch/include/couch_db.hrl").

%% New index definitions get an explicit lucene version property, if missing.
before_doc_update(
Contributor

This is a neat approach.

Wonder what would happen if users would toggle the version by hand (bump it up, down, make it a non-integer) or just remove it after it was created? Would that break the index or crash anything?

Member Author

I've tried to consider various manual shenanigans while trying to make the Java side stick to "the version of Lucene I am using now plus the one before" without hardcoding "10" and "9" everywhere. I will review that aspect again, as I think there are gaps. For example, I want it to be an error to set lucene_version to anything but 10 or 9, and I want that handled well (i.e., no stack traces, no Erlang process crashes, just an error message in the log or in the HTTP response).

Member Author

Revised commit "Allow existing legacy indexes but prevent new ones" so it only allows 9 or 10.
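The "current version plus the one before" rule might look roughly like this. It is a sketch under stated assumptions rather than the code in the commit: the class name, the error strings, and returning an `Optional` message instead of throwing are all illustrative choices. Returning a message rather than raising an exception is one way to surface a clean error without a stack trace.

```java
import java.util.Optional;

// Hypothetical validation sketch, not the PR's implementation.
final class LuceneVersionValidationSketch {
    static final int CURRENT = 10;           // assumed current Lucene major version
    static final int PREVIOUS = CURRENT - 1; // derived, so 9 is not hardcoded

    // Returns an error message if the value is unacceptable,
    // or an empty Optional if it is fine.
    static Optional<String> validate(Object raw) {
        if (raw == null) {
            // A missing field means a legacy (previous version) index.
            return Optional.empty();
        }
        if (!(raw instanceof Integer || raw instanceof Long)) {
            return Optional.of("lucene_version must be an integer");
        }
        long version = ((Number) raw).longValue();
        if (version < PREVIOUS || version > CURRENT) {
            return Optional.of("lucene_version must be " + PREVIOUS + " or " + CURRENT);
        }
        return Optional.empty();
    }
}
```

Under these assumptions, `validate("foo")` and `validate(11)` both produce a message that can be returned as a 400 response, while `validate(9)`, `validate(10)`, and a missing field are accepted.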

@rnewson rnewson force-pushed the lucene-10 branch 2 times, most recently from 91eacb4 to de25207 Compare September 9, 2025 16:51
@rnewson rnewson force-pushed the lucene-10 branch 2 times, most recently from 5a8c42d to cb4b90f Compare September 16, 2025 21:47
@nickva
Contributor

nickva commented Sep 17, 2025

I gave this a try with this test script. Without adding an explicit "lucene_version" it worked.

#!/bin/bash

set -x

http -b post $DB/_nouveau_analyze analyzer=standard text=for@bar@baz

http -q delete $DB/ndb
http -q put $DB/ndb

http -q post $DB/ndb/_bulk_docs docs:='[{"_id":"1","a":"xxx"},{"_id":"2","a":"xxx"},{"_id":"3","a":"zzz"},{"_id":"4","a":"xxx"}]'

http -b put $DB/ndb/_design/nd1 nouveau:='{"idx1": {"index":"function(doc){index(\"string\",\"a\",doc.a,{\"store\":true})}"}}'
http -b post $DB/ndb/_design/nd1/_nouveau/idx1 q="a:xxx" limit:='2'
  • I then added an explicit "lucene_version":9 and got an error:
#!/bin/bash

set -x

http -b post $DB/_nouveau_analyze analyzer=standard text=for@bar@baz

http -q delete $DB/ndb
http -q put $DB/ndb

http -q post $DB/ndb/_bulk_docs docs:='[{"_id":"1","a":"xxx"},{"_id":"2","a":"xxx"},{"_id":"3","a":"zzz"},{"_id":"4","a":"xxx"}]'

# http -b put $DB/ndb/_design/nd1 nouveau:='{"idx1": {"index":"function(doc){index(\"string\",\"a\",doc.a,{\"store\":true})}"}}'
# http -b post $DB/ndb/_design/nd1/_nouveau/idx1 q="a:xxx" limit:='2'

http -b put $DB/ndb/_design/nd2 nouveau:='{"idx": {"lucene_version":9, "index":"function(doc){index(\"string\",\"a\",doc.a,{\"store\":true})}"}}'
http -b post $DB/ndb/_design/nd2/_nouveau/idx q="a:xxx" limit:='2'
+ http -b post http://adm:[email protected]:15984/ndb/_design/nd2/_nouveau/idx q=a:xxx limit:=2
{
    "error": "internal_server_error",
    "reason": "There was an error processing your request. It has been logged (ID add3f84dfaa436f8)."
}
e955c5ee6b50659cc9f3a4717962e028201fa2d/index/9 lockFactory=org.apache.lucene.store.NativeFSLockFactory@4d6b0c30)): files: [write.lock]
! at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1085)
! at org.apache.couchdb.nouveau.core.IndexManager.load(IndexManager.java:408)
! at org.apache.couchdb.nouveau.core.IndexManager.with(IndexManager.java:140)
! at org.apache.couchdb.nouveau.resources.IndexResource.getIndexInfo(IndexResource.java:85)
! at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
! at java.base/java.lang.reflect.Method.invoke(Method.java:580)
! at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:146)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:189)
! at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:219)
! at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:93)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:478)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:400)
! at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81)
! at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:256)
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
! at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
! at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
127.0.0.1 - - [17/Sep/2025:05:20:17 +0000] "GET /index/node1%40127.0.0.1%2Fshards%2F00000000-7fffffff%2Fndb.1758086356%2Ffa5d962fee656da8f55e1e9d8e955c5ee6b50659cc9f3a4717962e028201fa2d HTTP/2.0" 500 110 "-" "-" 6
! at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
! at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:235)
! at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:684)
! at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:397)
! at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:349)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:379)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:312)
! at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205)
! at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:764)
! at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1665)
! at io.dropwizard.servlets.ThreadNameFilter.doFilter(ThreadNameFilter.java:36)
! at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:202)
! at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.handle(AllowedMethodsFilter.java:46)
! at io.dropwizard.jersey.filter.AllowedMethodsFilter.doFilter(AllowedMethodsFilter.java:40)
! at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:202)
! at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)
! at org.apache.couchdb.nouveau.core.UserAgentFilter.doFilter(UserAgentFilter.java:45)
! at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:202)
! at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)
! at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)
! at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1381)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)
! at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484)
! at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174)
! at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1303)
! at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
! at io.dropwizard.metrics.jetty11.InstrumentedHandler.handle(InstrumentedHandler.java:313)
! at io.dropwizard.jetty.RoutingHandler.handle(RoutingHandler.java:52)
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
! at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:822)
! at io.dropwizard.jetty.ZipExceptionHandlingGzipHandler.handle(ZipExceptionHandlingGzipHandler.java:26)
! at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:46)
! at org.eclipse.jetty.server.handler.StatisticsHandler.handle(StatisticsHandler.java:173)
! at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
! at org.eclipse.jetty.server.Server.handle(Server.java:563)
! at org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598)
! at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753)
! at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501)
! at org.eclipse.jetty.server.HttpChannel.run(HttpChannel.java:461)
! at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:421)
! at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:390)
! at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:277)
! at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produce(AdaptiveExecutionStrategy.java:193)
! at org.eclipse.jetty.http2.HTTP2Connection.produce(HTTP2Connection.java:208)
! at org.eclipse.jetty.http2.HTTP2Connection.onFillable(HTTP2Connection.java:155)
! at org.eclipse.jetty.http2.HTTP2Connection$FillableCallback.succeeded(HTTP2Connection.java:450)
! at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
! at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
! at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:421)
! at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:390)
! at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:277)
! at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:199)
! at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
! at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
! at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
! at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
! at java.base/java.lang.Thread.run(Thread.java:1583)
  • An explicit "lucene_version":10 worked

  • "lucene_version":11 was rejected as expected, with the validation error "luceneVersion must be less than or equal to 10"

  • "lucene_version":"foo" => 500 error:

+ http -b post http://adm:[email protected]:15984/ndb/_design/nd2/_nouveau/idx q=a:xxx limit:=2
{
    "error": "bad_request",
    "reason": "Unable to process JSON"
}
WARN  [2025-09-17 05:32:18,842] org.apache.couchdb.nouveau.core.IndexManager: I/O exception while committing [email protected]/shards/00000000-7fffffff/ndb.1758086897/b72b474eb35832d8bbe31c0798cec599c585c164b52b4ca880b59bfe87eb0f2c
! java.nio.file.NoSuchFileException: /Users/nvatama/asf-3/dev/lib/nouveau/[email protected]/shards/00000000-7fffffff/ndb.1758086897/b72b474eb35832d8bbe31c0798cec599c585c164b52b4ca880b59bfe87eb0f2c/index/10/write.lock
! at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
! at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
! at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
! at java.base/sun.nio.fs.UnixFileAttributeViews$Basic.readAttributes(UnixFileAttributeViews.java:55)
! at java.base/sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:171)
! at java.base/java.nio.file.Files.readAttributes(Files.java:1854)
! at org.apache.lucene.store.NativeFSLockFactory$NativeFSLock.ensureValid(NativeFSLockFactory.java:177)
! at org.apache.lucene.store.LockValidatingDirectoryWrapper.syncMetaData(LockValidatingDirectoryWrapper.java:61)
! at org.apache.lucene.index.SegmentInfos.prepareCommit(SegmentInfos.java:906)
! at org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:5671)
! at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3813)
! at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:4154)
! at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:4116)
! at org.apache.couchdb.nouveau.lucene.LuceneIndex.doCommit(LuceneIndex.java:170)
! at org.apache.couchdb.nouveau.core.Index.commit(Index.java:96)
! at org.apache.couchdb.nouveau.core.IndexManager.lambda$commitFun$4(IndexManager.java:347)
! at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
! at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:358)
! at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
! at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
! at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
! at com.codahale.metrics.InstrumentedThreadFactory$InstrumentedRunnable.run(InstrumentedThreadFactory.java:66)
! at java.base/java.lang.Thread.run(Thread.java:1583)

@rnewson rnewson force-pushed the lucene-10 branch 3 times, most recently from 9ad011e to 0bb8604 Compare September 26, 2025 17:47