Skip to content

JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder #1931

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 7, 2025

Conversation

SiyaoIsHiding
Copy link
Contributor

@SiyaoIsHiding SiyaoIsHiding commented May 6, 2024

Currently, the SchemaBuilder works with vector like this:

    assertThat(
            createTable("foo")
                .withPartitionKey("k", DataTypes.INT)
                .withColumn("v", new DefaultVectorType(DataTypes.FLOAT, 3)))
        .hasCql("CREATE TABLE foo (k int PRIMARY KEY,v VECTOR<FLOAT, 3>)");

Or

assertThat(createTable("foo")
            .withPartitionKey("k", DataTypes.INT)
            .withColumn("v", DataTypes.custom("org.apache.cassandra.db.marshal.VectorType(org.apache.cassandra.db.marshal.FloatType,3)")
            ))
            .hasCql("CREATE TABLE foo (k int PRIMARY KEY,v VECTOR<FLOAT, 3>)");

Please let me know if you want something like .withColumn("v", DataTypes.vector(DataTypes.FLOAT, 3)).

@absurdfarce absurdfarce self-requested a review May 29, 2024 17:37
@michaelsembwever
Copy link
Member

i can't get this to compile

[ERROR] Failed to execute goal org.revapi:revapi-maven-plugin:0.10.5:check (default) on project java-driver-query-builder: The following API problems caused the build to fail:
[ERROR] java.method.addedToInterface: method com.datastax.oss.driver.api.querybuilder.select.Select com.datastax.oss.driver.api.querybuilder.select.Select::orderBy(com.datastax.oss.driver.api.querybuilder.select.Ann): Method was added to an interface.
[ERROR]

am i doing something wrong ?

@michaelsembwever michaelsembwever self-requested a review June 11, 2024 16:48
@michaelsembwever
Copy link
Member

Is there a separate ticket for vector similarity functions ?
https://cassandra.apache.org/doc/latest/cassandra/developing/cql/functions.html#vector-similarity-functions

@@ -146,6 +146,8 @@ default Select orderBy(@NonNull String columnName, @NonNull ClusteringOrder orde
return orderBy(CqlIdentifier.fromCql(columnName), order);
}

@NonNull
Select orderBy(@NonNull Ann ann);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would consider adding here the direction (ASC, DESC) parameter. Currently we do not support DESC vector ordering, but this may be available in future and CQL syntax allows it.

Copy link
Contributor

@absurdfarce absurdfarce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @SiyaoIsHiding! This is basically what I was expecting to see with this change. We can have a conversation about the comments about the API but otherwise there's just a few things to clean up here.


public static Ann annOf(@NonNull String cqlIdentifier, @NonNull CqlVector<Number> vector) {
return new DefaultAnn(CqlIdentifier.fromCql(cqlIdentifier), vector);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These will need to be updated when the PR for JAVA-3143 is merged; the CqlVector constraint won't apply once that's in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested out on C* 5.0.1 and it says ANN only supports float.

cqlsh:default_keyspace> insert INTO vt (key, v ) VALUES ( 1, ['a','b']) ;
cqlsh:default_keyspace> select * from vt order by v ann of ['a', 'c'];
InvalidRequest: Error from server: code=2200 [Invalid query] message="ANN ordering is only supported on float vector indexes"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Yes, you're correct; Apache Cassandra 5.0.x supports vectors of any subtype as a type but the ANN index used there only supports floats.

@@ -146,6 +146,8 @@ default Select orderBy(@NonNull String columnName, @NonNull ClusteringOrder orde
return orderBy(CqlIdentifier.fromCql(columnName), order);
}

@NonNull
Select orderBy(@NonNull Ann ann);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm missing something but it seems more natural to support something like the following:

Select orderByAnnOf(CqlIdentifier columnId, CqlVector ann);
Select orderByAnnOf(String columnName, CqlVector ann);

Advantage is that with this approach you don't even need to introduce an Ann type... which kinda seems right as that type isn't really doing much for you here.

You could also perhaps add a notion of type checking the specified column to make sure it's a vector type (and to make sure it matches the type of the input CqlVector).

To the point made by @lukasz-antoniak above we could add directionality here (and throw warnings if the user tries to use a DESC order before there's server-side support for it) but I'm not sure it's worth it. There's no mention of ordering in the relevant Cassandra docs so my intuition says to just leave it out for now and add it when it becomes more of a thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree to save DESC ordering for later because:

  1. I find it weird to add a feature that does not work yet and we cannot even test
  2. Neither Bret nor I found relevant doc saying they may support DESC of vector search. @lukasz-antoniak did you find any? I find it hard to imagine in what cases we want to find the approximate farthest neighbor...?
  3. If we want to add DESC later, we just need to add another function overload Select orderByAnnOf(String columnName, CqlVector ann, ClusteringOrder order);. I assume this is not hard.

@absurdfarce
Copy link
Contributor

One other thing worth mentioning: the Cassandra impl also supports a way to get "the similarity calculation of the best scoring node closest to the query data as part of the results". Take a look at the similarity_dot_product() function (and the other choices as well) in the relevant Cassandra docs. The query builder should have support for those as well.

@SiyaoIsHiding
Copy link
Contributor Author

The revapi thing is fixed and the vector similarity function is already supported by the existing Function term. I added tests for it as examples:

public void should_generate_similarity_functions() {
Select similarity_cosine_clause =
selectFrom("cycling", "comments_vs")
.column("comment")
.function(
"similarity_cosine",
Selector.column("comment_vector"),
literal(CqlVector.newInstance(0.2, 0.15, 0.3, 0.2, 0.05)))
.orderByAnnOf("comment_vector", CqlVector.newInstance(0.1, 0.15, 0.3, 0.12, 0.05))
.limit(1);
assertThat(similarity_cosine_clause)
.hasCql(
"SELECT comment,similarity_cosine(comment_vector,[0.2, 0.15, 0.3, 0.2, 0.05]) FROM cycling.comments_vs ORDER BY comment_vector ANN OF [0.1, 0.15, 0.3, 0.12, 0.05] LIMIT 1");
Select similarity_euclidean_clause =
selectFrom("cycling", "comments_vs")
.column("comment")
.function(
"similarity_euclidean",
Selector.column("comment_vector"),
literal(CqlVector.newInstance(0.2, 0.15, 0.3, 0.2, 0.05)))
.orderByAnnOf("comment_vector", CqlVector.newInstance(0.1, 0.15, 0.3, 0.12, 0.05))
.limit(1);
assertThat(similarity_euclidean_clause)
.hasCql(
"SELECT comment,similarity_euclidean(comment_vector,[0.2, 0.15, 0.3, 0.2, 0.05]) FROM cycling.comments_vs ORDER BY comment_vector ANN OF [0.1, 0.15, 0.3, 0.12, 0.05] LIMIT 1");
Select similarity_dot_product_clause =
selectFrom("cycling", "comments_vs")
.column("comment")
.function(
"similarity_dot_product",
Selector.column("comment_vector"),
literal(CqlVector.newInstance(0.2, 0.15, 0.3, 0.2, 0.05)))
.orderByAnnOf("comment_vector", CqlVector.newInstance(0.1, 0.15, 0.3, 0.12, 0.05))
.limit(1);
assertThat(similarity_dot_product_clause)
.hasCql(
"SELECT comment,similarity_dot_product(comment_vector,[0.2, 0.15, 0.3, 0.2, 0.05]) FROM cycling.comments_vs ORDER BY comment_vector ANN OF [0.1, 0.15, 0.3, 0.12, 0.05] LIMIT 1");
}

In terms of the spring-ai downstream, as we won't actually break any API, is there anything we should test or how?

@absurdfarce
Copy link
Contributor

Good call on the similarity_* functions @SiyaoIsHiding!


/** Adds the ORDER BY ... ANN OF ... clause */
@NonNull
Select orderByAnnOf(@NonNull CqlIdentifier columnId, @NonNull CqlVector<? extends Number> ann);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm changing my answer on this: we should remove the type bound here and just make this a CqlVector<?>. My rationale goes as follows.

The fact that ANN comparisons only support float vectors actually isn't a constraint on the underlying Java type used here. In theory any Java type whose codec will generate a serialized value that can be understood as a float when the server receives the message would work here. That means we could pass in a CqlVector of floats, decimals or doubles and still have the server handle that without issue. I'll also note that there's no common supertype for Float, BigDecimal or Double (the corresponding Java types based on the docs) other than Number... and we can't use that here because it includes types other than these three. So there's no meaningful supertype we can use for a type bound at this point.

A second point: the query builder is built around the notion of generating CQL based on whatever the user passes in; there's almost no checking whether types correlate to things the user specified. So, for example, OngoingValues doesn't have any kind of type bounds around it's various value() methods. I accept that this isn't exactly the same as the float constraint we're discussing here... but it is an indication that the query builder largely doesn't concern itself with type checking with the expectation that the server will handle that when it receives the query.

Finally, I'll note that avoiding the type bound here at least exposes the idea (at an API level) of using external/custom types which serialize to CQL float values by way of their own codecs. Scala users, for example, might want to support using Spire numerics for their CQL queries. This actually won't work for now since we constrain the codec registry used to the (immutable) default which doesn't include any support for Spire types... but that's an implementation detail. By avoiding type bounds at an API level we've left that door open for such a change without having to make a (potentially breaking) API change and without sacrificing our design principles elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting back to this conversation because the current implementation will result in a compilation error when we extend vector support to arbitrary subtype.

  public Select orderByAnnOf(@NonNull CqlIdentifier columnId, @NonNull CqlVector<?> ann) {
    return withAnn(new Ann(columnId, ann));
  }

The argument we get is a CqlVector<?> , but the Ann constructor wants a CqlVector<? extends Number>. Turns out compilation error.
We either:

  1. change Ann constructor to CqlVector<?>, or
  2. change orderByAnnOf API to CqlVector<? extends Number>.

I think option 2 makes more sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@absurdfarce bumping this discussion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SiyaoIsHiding and I discussed in a meeting today. Both approaches have arguments for them but in the end we decided to go with CqlVector<?> to get this moving forward and merged.

Copy link
Contributor

@absurdfarce absurdfarce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're almost there @SiyaoIsHiding! Let's sort out the type bounds on Select.orderByAnnOf() and then I think this guy is (finally) ready to be merged!

@SiyaoIsHiding
Copy link
Contributor Author

I assume @lukasz-antoniak comment here about adding getAnn() in DefaultSelect should be addressed here. Thank you for your suggestion! It's added.

@absurdfarce
Copy link
Contributor

I'm not 💯 sure what the accessors on DefaultSelect are intended to do but I suppose it does make sense to keep it consistent and add an accessor for the Ann object.

Either way I'm satisfied with where this stands now... 👍 from me!

@absurdfarce absurdfarce requested review from tolbertam and removed request for lukasz-antoniak October 14, 2024 16:00
absurdfarce added a commit to absurdfarce/cassandra-java-driver that referenced this pull request Oct 15, 2024
@absurdfarce
Copy link
Contributor

Raised by another user in separate communication: we should add some updates to the documentation reflecting these new features. Specifically we should update relevant areas of the query_builder section of the manual.

@absurdfarce
Copy link
Contributor

@SiyaoIsHiding I'm okay with the idea behind these changes but we do need to fix a few things before this is merge-ready. Specifically:

  • The branch conflicts discussed above need to be resolved... I'm guessing those are largely from CASSJAVA-2
  • Let's also add the doc updates referenced above

Once these fixes are in (and we get a second +1) I think we're okay to move forward.

@michaelsembwever
Copy link
Member

Testing this with spring-projects/spring-ai#1817

I'm getting from this code:

SchemaBuilder.createTable(keyspace, table)
    .ifNotExists()
    .withPartitionKey(partitionKey.name, partitionKey.type)
    .withClusteringColumn(clusteringKey.name, clusteringKey.type)
    .withColumn("embedding", DataTypes.vectorOf(DataTypes.FLOAT, vectorDimension))
    .build();

this failure:

com.datastax.oss.driver.api.core.servererrors.SyntaxError: Error setting type org.apache.cassandra.db.marshal.VectorType(384): Invalid definition for comparator org.apache.cassandra.db.marshal.VectorType.
	at com.datastax.oss.driver.api.core.servererrors.SyntaxError.copy(SyntaxError.java:50)
	at com.datastax.oss.driver.internal.core.util.concurrent.CompletableFutures.getUninterruptibly(CompletableFutures.java:151)
	at com.datastax.oss.driver.internal.core.cql.CqlRequestSyncProcessor.process(CqlRequestSyncProcessor.java:55)
	at com.datastax.oss.driver.internal.core.cql.CqlRequestSyncProcessor.process(CqlRequestSyncProcessor.java:32)
	at com.datastax.oss.driver.internal.core.session.DefaultSession.execute(DefaultSession.java:234)
	at com.datastax.oss.driver.api.core.cql.SyncCqlSession.execute(SyncCqlSession.java:56)
	at org.springframework.ai.vectorstore.CassandraVectorStoreConfig.ensureTableExists(CassandraVectorStoreConfig.java:244)

michaelsembwever added a commit to michaelsembwever/spring-ai that referenced this pull request Nov 25, 2024
@SiyaoIsHiding
Copy link
Contributor Author

@michaelsembwever Can you see the debug log from this line?

logger.debug("Executing {}", createTable.asCql());

I wonder what asCql() outputs. I ran your code with substituted values

    assertThat(SchemaBuilder.createTable("ks", "mytb")
            .ifNotExists()
            .withPartitionKey("k", DataTypes.INT)
            .withClusteringColumn("v", DataTypes.TEXT)
            .withColumn("embedding", DataTypes.vectorOf(DataTypes.FLOAT, 3)))
            .hasCql("CREATE TABLE IF NOT EXISTS ks.mytb (k int,v text,embedding vector<float, 3>,PRIMARY KEY(k,v))");

And this will pass. I ran the cql statement in cqlsh, it turns out no problem, either.

@SiyaoIsHiding SiyaoIsHiding force-pushed the vector-support branch 2 times, most recently from 103b859 to d0d3461 Compare November 26, 2024 01:44
@SiyaoIsHiding
Copy link
Contributor Author

Solved merge conflict and updated doc. I just realized our CI broke. Can any one of you committers click the button to run tests for this PR please? Here: https://ci-cassandra.apache.org/blue/organizations/jenkins/cassandra-java-driver/detail/PR-1931/8/pipeline

@michaelsembwever
Copy link
Member

I just realized our CI broke. Can any one of you committers click the button to run tests for this PR please?

done: https://ci-cassandra.apache.org/job/cassandra-java-driver/job/PR-1931/9/

@michaelsembwever
Copy link
Member

michaelsembwever commented Nov 26, 2024

createTable.asCql() is

CREATE TABLE IF NOT EXISTS test_springframework.ai_vector_store (id text PRIMARY KEY,content text,embedding 'org.apache.cassandra.db.marshal.VectorType(384)',country text,year smallint,meta1 text,meta2 text)

Why am I getting 'org.apache.cassandra.db.marshal.VectorType(384)' instead of vector<float,384>

@SiyaoIsHiding
Copy link
Contributor Author

Are you sure you are using my PR version? That full class name looks like an outdated 4.x branch install before this commit: e843786

@SiyaoIsHiding
Copy link
Contributor Author

And it seems like our CI isn't configured with credentials. It met GitHub quota limit.

@michaelsembwever
Copy link
Member

Are you sure you are using my PR version? That full class name looks like an outdated 4.x branch install before this commit: e843786

My bad, my checkout was still on 0a36ff0
Thanks for being patient with me.

@SiyaoIsHiding
Copy link
Contributor Author

Thank you for testing the downstream too

@michaelsembwever
Copy link
Member

is there outstanding items or more work to do here ?

@absurdfarce
Copy link
Contributor

@michaelsembwever Short version is that this is basically done; we've essentially been waiting on a second +1 to move this one forward. Only wrinkle there is that @SiyaoIsHiding did change a couple things recently in relation to work she's been doing on the PR for JAVA-3143... but those changes should be relatively small.

@SiyaoIsHiding
Copy link
Contributor Author

No outstanding item here, this PR is pending for reviews.
@absurdfarce I've updated the PR. May you please give an approval?
@michaelsembwever It seems to me that you have some familiarity to this PR. Would you have time for a review as the second +1?

@absurdfarce
Copy link
Contributor

DS Jenkins looks good. There was one test failure which looks to be a test issue (ccm timeout on startup) so I'm calling this good. @SiyaoIsHiding can you squash this down to a single commit so we can get this in?

…uilder

patch by Jane He; reviewed by Mick Semb Wever and Bret McGuire for JAVA-3118
reference: apache#1931
@SiyaoIsHiding
Copy link
Contributor Author

Thanks! Squashed. @absurdfarce

@absurdfarce absurdfarce merged commit 7689c5a into apache:4.x Jan 7, 2025
1 check failed
dkropachev added a commit to scylladb/java-driver that referenced this pull request Mar 19, 2025
* CASSANDRA-19635: Run integration tests with C* 5.x

patch by Lukasz Antoniak; reviewed by Andy Tolbert, and Bret McGuire for CASSANDRA-19635

* CASSANDRA-19635: Configure Jenkins to run integration tests with C* 5.x

patch by Lukasz Antoniak; reviewed by Bret McGuire for CASSANDRA-19635

* update badge URL to org.apache.cassandra/java-driver-core

* Limit calls to Conversions.resolveExecutionProfile

Those repeated calls account for a non-negligible portion of my application
CPU (0.6%) and can definitly be a final field so that it gets resolved only
once per CqlRequestHandler.

patch by Benoit Tellier; reviewed by Andy Tolbert, and Bret McGuire
reference: apache#1623

* autolink JIRA tickets in commit messages

patch by Stefan Miklosovic; reviewed by Michael Semb Wever for CASSANDRA-19854

* Don't return empty routing key when partition key is unbound

DefaultBoundStatement#getRoutingKey has logic to infer the
routing key when no one has explicitly called setRoutingKey
or otherwise set the routing key on the statement.
It however doesn't check for cases where nothing has been
bound yet on the statement.
This causes more problems if the user decides to get a
BoundStatementBuilder from the PreparedStatement, set some
fields on it, and then copy it by constructing new
BoundStatementBuilder objects with the BoundStatement as a
parameter, since the empty ByteBuffer gets copied to all
bound statements, resulting in all requests being targeted
to the same Cassandra node in a token-aware load balancing
policy.

patch by Ammar Khaku; reviewed by Andy Tolbert, and Bret McGuire
reference: apache#1620

* JAVA-3167: CompletableFutures.allSuccessful() may return never completed future

patch by Lukasz Antoniak; reviewed by Andy Tolbert, and Bret McGuire for JAVA-3167

* ninja-fix Various test fixes

* Run integration tests with DSE 6.9.0

patch by Lukasz Antoniak; reviewed by Bret McGuire
reference: apache#1955

* JAVA-3117: Call CcmCustomRule#after if CcmCustomRule#before fails to allow subsequent tests to run

patch by Henry Hughes; reviewed by Alexandre Dutra and Andy Tolbert for JAVA-3117

* JAVA-3149: Support request cancellation in request throttler
patch by Lukasz Antoniak; reviewed by Andy Tolbert and Chris Lohfink for JAVA-3149

* Fix C* 3.0 tests failing on Jenkins
patch by Lukasz Antoniak; reviewed by Bret McGuire
reference: apache#1939

* Reduce lock held duration in ConcurrencyLimitingRequestThrottler

It might take some (small) time for callback handling when the
throttler request proceeds to submission.

Before this change, the throttler proceed request will happen while
holding the lock, preventing other tasks from proceeding when there is
spare capacity and even preventing tasks from enqueuing until the
callback completes.

By tracking the expected outcome, we can perform the callback outside
of the lock. This means that request registration and submission can
proceed even when a long callback is being processed.

patch by Jason Koch; Reviewed by Andy Tolbert and Chris Lohfink for CASSANDRA-19922

* Annotate BatchStatement, Statement, SimpleStatement methods with CheckReturnValue

Since the driver's default implementation is for
BatchStatement and SimpleStatement methods to be immutable,
we should annotate those methods with @CheckReturnValue.
Statement#setNowInSeconds implementations are immutable so
annotate that too.

patch by Ammar Khaku; reviewed by Andy Tolbert and Bret McGuire
reference: apache#1607

* Remove "beta" support for Java17 from docs

patch by Bret McGuire; reviewed by Andy Tolbert and Alexandre Dutra
reference: apache#1962

* Fix uncaught exception during graceful channel shutdown

after exceeding max orphan ids

patch by Christian Aistleitner; reviewed by Andy Tolbert, and Bret McGuire for apache#1938

* Build a public CI for Apache Cassandra Java Driver

 patch by Siyao (Jane) He; reviewed by Mick Semb Wever for CASSANDRA-19832

* CASSANDRA-19932: Allow to define extensions while creating table
patch by Lukasz Antoniak; reviewed by Bret McGuire and Chris Lohfink

* Fix DefaultSslEngineFactory missing null check on close

patch by Abe Ratnofsky; reviewed by Andy Tolbert and Chris Lohfink for CASSANDRA-20001

* Query builder support for NOT CQL syntax

patch by Bret McGuire; reviewed by Bret McGuire and Andy Tolbert for CASSANDRA-19930

* Fix CustomCcmRule to drop `CURRENT` flag no matter what

If super.after() throws an Exception `CURRENT` flag is never dropped
which leads next tests to fail with IllegalStateException("Attempting to use a Ccm rule while another is in use.  This is disallowed")

Patch by Dmitry Kropachev; reviewed by Andy Tolbert and Bret McGuire for JAVA-3117

* JAVA-3051: Memory leak

patch by Jane He; reviewed by Alexandre Dutra and Bret McGuire for JAVA-3051

* Automate latest Cassandra versions when running CI

 patch by Siyao (Jane) He; reviewed by Mick Semb Wever for CASSJAVA-25

* Refactor integration tests to support multiple C* distributions. Test with DataStax HCD 1.0.0

patch by Lukasz Antoniak; reviewed by Bret McGuire
reference: apache#1958

* Fix TableMetadata.describe() when containing a vector column

patch by Stefan Miklosovic; reviewed by Bret McGuire for CASSJAVA-2

* Move Apache Cassandra 5.x off of beta1 and remove some older Apache Cassandra versions.

patch by Bret McGuire; reviewed by Bret McGuire for CASSJAVA-54

* Update link to Jira to be CASSJAVA

Updating the link to Jira.

Previously we had a component in the CASSANDRA Jira project but now we have a project for each driver - in the case of Java, it's CASSJAVA.

Added CASSJAVA to .asf.yaml

patch by Jeremy Hanna; reviewed by Bret McGuire for CASSJAVA-61

* Move DataStax shaded Guava module into Java driver

patch by Lukasz Antoniak; reviewed by Alexandre Dutra and Bret McGuire for CASSJAVA-52

* JAVA-3057 Allow decoding a UDT that has more fields than expected

patch by Ammar Khaku; reviewed by Andy Tolbert and Bret McGuire
reference: apache#1635

* CASSJAVA-55 Remove setting "Host" header for metadata requests.

With some sysprops enabled this will actually be respected which completely borks Astra routing.

patch by Bret McGuire; reviewed by Alexandre Dutra and Bret McGuire for CASSJAVA-55

* JAVA-3118: Add support for vector data type in Schema Builder, QueryBuilder
patch by Jane He; reviewed by Mick Semb Wever and Bret McGuire for JAVA-3118
reference: apache#1931

* Upgrade Guava to 33.3.1-jre

patch by Lukasz Antoniak; reviewed by Alexandre Dutra and Bret McGuire for CASSJAVA-53

* Do not always cleanup Guava shaded module before packaging

* Revert "Do not always cleanup Guava shaded module before packaging"

This reverts commit 5be52ec.

* Conditionally compile shaded Guava module

* JAVA-3143: Extend driver vector support to arbitrary subtypes and fix handling of variable length types (OSS C* 5.0)

patch by Jane He; reviewed by Bret McGuire and João Reis
reference: apache#1952

* JAVA-3168 Copy node info for contact points on initial node refresh only from first match by endpoint

patch by Alex Sasnouskikh; reviewed by Andy Tolbert and Alexandre Dura for JAVA-3168

* JAVA-3055: Prevent PreparedStatement cache to be polluted if a request is cancelled.

There was a critical issue when the external code cancels a request, indeed the cached CompletableFuture will then always throw a CancellationException.
This may happens, for example, when used by reactive like Mono.zip or Mono.firstWithValue.

patch by Luc Boutier; reviewed by Alexandre Dutra and Bret McGuire
reference: apache#1757

* Expose a decorator for CqlPrepareAsyncProcessor cache rather than the ability to specify an
arbitrary cache from scratch.

Also bringing tests from apache#2003 forward
with a few minor changes due to this implementation

patch by Bret McGuire; reviewed by Bret McGuire and Andy Tolbert
reference: apache#2008

* ninja-fix Using shaded Guava classes for import in order to make OSGi class paths happy.

Major hat tip to Dmitry Konstantinov for the find here!

* Changelog updates for 4.19.0

* [maven-release-plugin] prepare release 4.19.0

* [maven-release-plugin] prepare for next development iteration

* Specify maven-clean-plugin version

Sets the version to 3.4.1 in parent pom.
Having it unspecified causes the following warning:
```
[WARNING] Some problems were encountered while building the effective model for com.scylladb:java-driver-guava-shaded:jar:4.19.0.0-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-clean-plugin is missing. @ line 97, column 15
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
```

* Install guava-shaded before running core's compile in CI

Without this module available "Build" and "Unit tests" job fail with
"package <x> does not exist" or "cannot find symbol" pointing to
`[...].shaded.guava.[...]` packages.

* Remove exception catch in `prepared_stmt_metadata_update_loopholes_test`

Since incorporating "JAVA-3057 Allow decoding a UDT that has more fields than
expected" the asuumptions of removed check are no longer valid.

* Switch shaded guava's groupId in osgi-tests

Switches from `org.apache.cassandra` to `com.scylladb`
in `BundleOptions#commonBundles()`.

* Typo: increase line's loglevel in CcmBridge

---------

Co-authored-by: Lukasz Antoniak <[email protected]>
Co-authored-by: Brad Schoening <[email protected]>
Co-authored-by: Benoit Tellier <[email protected]>
Co-authored-by: Stefan Miklosovic <[email protected]>
Co-authored-by: Ammar Khaku <[email protected]>
Co-authored-by: absurdfarce <[email protected]>
Co-authored-by: Henry Hughes <[email protected]>
Co-authored-by: Jason Koch <[email protected]>
Co-authored-by: Christian Aistleitner <[email protected]>
Co-authored-by: janehe <[email protected]>
Co-authored-by: Abe Ratnofsky <[email protected]>
Co-authored-by: absurdfarce <[email protected]>
Co-authored-by: Dmitry Kropachev <[email protected]>
Co-authored-by: janehe <[email protected]>
Co-authored-by: Jeremy Hanna <[email protected]>
Co-authored-by: SiyaoIsHiding <[email protected]>
Co-authored-by: Alex Sasnouskikh <[email protected]>
Co-authored-by: Luc Boutier <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants