diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 53857383cf2..72831ef39df 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -17,518 +17,108 @@ specific language governing permissions and limitations under the License. --> -# Contributing guidelines +# Contributing Guidelines -## Code formatting +Thank you for your interest in contributing to the Apache Cassandra Java Driver! Please review the following guidelines to help you get started. -### Java +## Table of Contents -We follow the [Google Java Style Guide](https://google.github.io/styleguide/javaguide.html). See -https://github.com/google/google-java-format for IDE plugins. The rules are not configurable. +- [Bugs, Features, and Questions](#bugs-features-and-questions) +- [Development Environment Setup](#development-environment-setup) +- [Before You Submit a Pull Request](#before-you-submit-a-pull-request) -The build will fail if the code is not formatted. To format all files from the command line, run: - -``` -mvn fmt:format -``` +## Bugs, Features, and Questions -Some aspects are not covered by the formatter: braces must be used with `if`, `else`, `for`, `do` -and `while` statements, even when the body is empty or contains only a single statement. +- To report bugs or request features, please file a ticket on our [JIRA](https://issues.apache.org/jira/projects/CASSJAVA). +- For questions, use our [user mailing list](https://groups.google.com/a/lists.datastax.com/g/java-driver-user). -### XML +## Development Environment Setup -The build will fail if XML files are not formatted correctly. Run the following command before you -commit: - -```java -mvn xml-format:xml-format -``` - -The formatter does not enforce a maximum line length, but please try to keep it below 100 characters -to keep files readable across all mediums (IDE, terminal, Github...). - -### Other text files (markdown, etc) - -Similarly, enforce a right margin of 100 characters in those files. Editors and IDEs generally have -a way to configure this (for IDEA, install the "Wrap to column" plugin). - -## Coding style -- production code - -Do not use static imports. They make things harder to understand when you look at the code -someplace where you don't have IDE support, like Github's code view. - -Avoid abbreviations in class and variable names. A good rule of thumb is that you should only use -them if you would also do so verbally, for example "id" and "config" are probably reasonable. -Single-letter variables are permissible if the variable scope is only a few lines, or for commonly -understood cases (like `i` for a loop index). - -Keep source files short. Short files are easy to understand and test. The average should probably -be around 200-300 lines. - -### Javadoc - -All types in "API" packages must be documented. For "internal" packages, documentation is optional, -but in no way discouraged: it's generally a good idea to have a class-level comment that explains -where the component fits in the architecture, and anything else that you feel is important. - -You don't need to document every parameter or return type, or even every method. Don't document -something if it is completely obvious, we don't want to end up with this: - -```java -/** - * Returns the name. - * - * @return the name - */ -String getName(); -``` - -On the other hand, there is often something useful to say about a method, so most should have at -least a one-line comment. Use common sense. - -Driver users coding in their IDE should find the right documentation at the right time. Try to -think of how they will come into contact with the class. For example, if a type is constructed with -a builder, each builder method should probably explain what the default is when you don't call it. - -Avoid using too many links, they can make comments harder to read, especially in the IDE. Link to a -type the first time it's mentioned, then use a text description ("this registry"...) or an `@code` -block. Don't link to a class in its own documentation. Don't link to types that appear right below -in the documented item's signature. - -```java -/** -* @return this {@link Builder} <-- completely unnecessary -*/ -Builder withLimit(int limit) { -``` - -### Logs - -We use SLF4J; loggers are declared like this: - -```java -private static final Logger LOG = LoggerFactory.getLogger(TheEnclosingClass.class); -``` - -Logs are intended for two personae: - -* Ops who manage the application in production. -* Developers (maybe you) who debug a particular issue. - -The first 3 log levels are for ops: - -* `ERROR`: something that renders the driver -- or a part of it -- completely unusable. An action is - required to fix it: bouncing the client, applying a patch, etc. -* `WARN`: something that the driver can recover from automatically, but indicates a configuration or - programming error that should be addressed. For example: the driver connected successfully, but - one of the contact points in the configuration was malformed; the same prepared statement is being - prepared multiple time by the application code. -* `INFO`: something that is part of the normal operation of the driver, but might be useful to know - for an operator. For example: the driver has initialized successfully and is ready to process - queries; an optional dependency was detected in the classpath and activated an enhanced feature. - -Do not log errors that are rethrown to the client (such as the error that you're going to complete a -request with). This is annoying for ops because they see a lot of stack traces that require no -actual action on their part, because they're already handled by application code. - -Similarly, do not log stack traces for non-critical errors. If you still want the option to get the -trace for debugging, see the `Loggers.warnWithException` utility. - -The last 2 levels are for developers, to help follow what the driver is doing from a "black box" -perspective (think about debugging an issue remotely, and all you have are the logs). - -* `TRACE`: anything that happens **for every user request**. Not only request handling, but all - related components (e.g. timestamp generators, policies, etc). -* `DEBUG`: everything else. For example, node state changes, control connection activity, etc. - -Note that `DEBUG` and `TRACE` can coexist within the same component, for example the LBP -initializing is a one-time event, but returning a query plan is a per-request event. - -Logs statements start with a prefix that identifies its origin, for example: - -* for components that are unique to the cluster instance, just the cluster name: `[c0]`. -* for sessions, the cluster name + a generated unique identifier: `[c0|s0]`. -* for channel pools, the session identifier + the address of the node: `[c0|s0|/127.0.0.2:9042]`. -* for channels, the identifier of the owner (session or control connection) + the Netty identifier, - which indicates the local and remote ports: - `[c0|s0|id: 0xf9ef0b15, L:/127.0.0.1:51482 - R:/127.0.0.1:9042]`. -* for request handlers, the session identifier, a unique identifier, and the index of the - speculative execution: `[c0|s0|1077199500|0]`. - -Tests run with the configuration defined in `src/test/resources/logback-test.xml`. The default level -for driver classes is `WARN`, but you can override it with a system property: `-DdriverLevel=DEBUG`. -A nice setup is to use `DEBUG` when you run from your IDE, and keep the default for the command -line. - -When you add or review new code, take a moment to run the tests in `DEBUG` mode and check if the -output looks good. - -### Don't abuse the stream API - -The `java.util.stream` API is often used (abused?) as a "functional API for collections": - -```java -List sizes = words.stream().map(String::length).collect(Collectors.toList()); -``` - -The perceived advantages of this approach over traditional for-loops are debatable: - -* readability: this is highly subjective. But consider the following: - * everyone can read for-loops, whether they are familiar with the Stream API or not. The opposite - is not true. - * the stream API does not spell out all the details: what kind of list does `Collectors.toList()` - return? Is it pre-sized? Mutable? Thread-safe? - * the stream API looks pretty on simple examples, but things can get ugly fast. Try rewriting - `NetworkTopologyReplicationStrategy` with streams. -* concision: this is irrelevant. When we look at code we care about maintainability, not how many - keystrokes the author saved. The for-loop version of the above example is just 5 lines long, and - your brain doesn't take longer to parse it. - -The bottom line: don't try to "be functional" at all cost. Plain old for-loops are often just as -simple. - -### Never assume a specific format for `toString()` - -Only use `toString()` for debug logs or exception messages, and always assume that its format is -unspecified and can change at any time. - -If you need a specific string representation for a class, make it a dedicated method with a -documented format, for example `toCqlLiteral`. Otherwise it's too easy to lose track of the intended -usage and break things: for example, someone modifies your `toString()` method to make their logs -prettier, but unintentionally breaks the script export feature that expected it to produce CQL -literals. - -`toString()` can delegate to `toCqlLiteral()` if that is appropriate for logs. - - -### Concurrency annotations - -We use the [JCIP annotations](http://jcip.net/annotations/doc/index.html) to document thread-safety -policies. - -Add them for all new code, with the exception of: - -* enums and interfaces; -* utility classes (only static methods); -* test code. - -Make sure you import the types from `net.jcip`, there are homonyms in the classpath. +### Build and IDE Configuration +- Ensure Maven is installed and you are using Java 8. +- Build the project with: + ``` + mvn clean package -DskipTests + ``` +- If using an IDE like IntelliJ and encountering issues with guava-shaded classes: + - Run: + ``` + mvn clean install -DskipTests + ``` + - If IntelliJ uses a different Maven version, use the Maven window in IntelliJ: under `Lifecycle`, click `clean` and then `install`. -### Nullability annotations +### Running the Tests -We use the [Spotbugs annotations](https://spotbugs.github.io) to document nullability of parameters, -method return types and class members. +#### Unit Tests -Please annotate any new class or interface with the appropriate annotations: `@NonNull`, `@Nullable`. Make sure you import -the types from `edu.umd.cs.findbugs.annotations`, there are homonyms in the classpath. +- Ensure you are using Java 8. +- Run: + ``` + mvn clean install -DskipTests + mvn test + ``` +#### Integration Tests + +1. Install Cassandra Cluster Manager (CCM) following its [README](https://github.com/apache/cassandra-ccm). +2. On macOS, enable loopback aliases: + ```shell + for i in {2..255}; do sudo ifconfig lo0 alias 127.0.0.$i up; done + ``` + Note: This may slow down networking. To remove the aliases after testing: + ```shell + for i in {2..255}; do sudo ifconfig lo0 -alias 127.0.0.$i up; done + ``` +3. Run integration tests: + ``` + mvn clean verify + ``` + To target a specific Cassandra version: + ``` + mvn verify -Dccm.version=3.11.0 + ``` + +### Code Formatting and License Headers + +- We follow the [Google Java Style Guide](https://google.github.io/styleguide/javaguide.html). See [google-java-format](https://github.com/google/google-java-format) for IDE plugins. +- To format code: + ``` + # Java files + mvn fmt:format + # XML files + mvn xml-format:xml-format + # License headers + mvn license:format + ``` -## Coding style -- test code +### Pre-commit Hook (Highly Recommended) -Static imports are permitted in a couple of places: -* All AssertJ methods, e.g.: - ```java - assertThat(node.getDatacenter()).isNotNull(); - fail("Expecting IllegalStateException to be thrown"); +- Make `pre-commit.sh` executable: ``` -* All Mockito methods, e.g.: - ```java - when(codecRegistry.codecFor(DataTypes.INT)).thenReturn(codec); - verify(codec).decodePrimitive(any(ByteBuffer.class), eq(ProtocolVersion.DEFAULT)); + chmod +x pre-commit.sh ``` -* All Awaitility methods, e.g.: - ```java - await().until(() -> somethingBecomesTrue()); +- Set up the pre-commit hook: ``` + ln -s ../../pre-commit.sh .git/hooks/pre-commit + ``` +- This script will format files and run unit tests before each commit. -Test methods names use lower snake case, generally start with `should`, and clearly indicate the -purpose of the test, for example: `should_fail_if_key_already_exists`. If you have trouble coming -up with a simple name, it might be a sign that your test does too much, and should be split. - -We use AssertJ (`assertThat`) for assertions. Don't use JUnit assertions (`assertEquals`, -`assertNull`, etc). - -Don't try to generify at all cost: a bit of duplication is acceptable, if that helps keep the tests -simple to understand (a newcomer should be able to understand how to fix a failing test without -having to read too much code). - -Test classes can be a bit longer, since they often enumerate similar test cases. You can also -factor some common code in a parent abstract class named with "XxxTestBase", and then split -different families of tests into separate child classes. For example, `CqlRequestHandlerTestBase`, -`CqlRequestHandlerRetryTest`, `CqlRequestHandlerSpeculativeExecutionTest`... - -### Unit tests - -They live in the same module as the code they are testing. They should be fast and not start any -external process. They usually target one specific component and mock the rest of the driver -context. - -### Integration tests - -They live in the `integration-tests` module, and exercise the whole driver stack against an external -process, which can be either one of: -* [Simulacron](https://github.com/datastax/simulacron): simulates Cassandra nodes on loopback - addresses; your test must "prime" data, i.e. tell the nodes what results to return for - pre-determined queries. - - For an example of a Simulacron-based test, see `NodeTargetingIT`. -* [CCM](https://github.com/pcmanus/ccm): launches actual Cassandra nodes locally. The `ccm` - executable must be in the path. - - You can pass a `-Dccm.version` system property to the build to target a particular Cassandra - version (it defaults to 3.11.0). `-Dccm.directory` allows you to point to a local installation - -- this can be a checkout of the Cassandra codebase, as long as it's built. See `CcmBridge` in - the driver codebase for more details. - - For an example of a CCM-based test, see `PlainTextAuthProviderIT`. - -#### Categories - -Integration tests are divided into three categories: - -##### Parallelizable tests - -These tests can be run in parallel, to speed up the build. They either use: -* dedicated Simulacron instances. These are lightweight, and Simulacron will manage the ports to - make sure that there are no collisions. -* a shared, one-node CCM cluster. Each test works in its own keyspace. - -The build runs them with a configurable degree of parallelism (currently 8). The shared CCM cluster -is initialized the first time it's used, and stopped before moving on to serial tests. Note that we -run with `parallel=classes`, which means methods within the same class never run concurrent to each -other. - -To make an integration test parallelizable, annotate it with `@Category(ParallelizableTests.class)`. -If you use CCM, it **must** be with `CcmRule`. - -For an example of a Simulacron-based parallelizable test, see `NodeTargetingIT`. For a CCM-based -test, see `DirectCompressionIT`. - -##### Serial tests - -These tests cannot run in parallel, in general because they require CCM clusters of different sizes, -or with a specific configuration (we never run more than one CCM cluster simultaneously: it would be -too resource-intensive, and too complicated to manage all the ports). - -The build runs them one by one, after the parallelizable tests. - -To make an integration test serial, do not annotate it with `@Category`. The CCM rule **must** be -`CustomCcmRule`. - -For an example, see `DefaultLoadBalancingPolicyIT`. - -Note: if multiple serial tests have a common "base" class, do not pull up `CustomCcmRule`, each -child class must have its own instance. Otherwise they share the same CCM instance, and the first -one destroys it on teardown. See `TokenITBase` for how to organize code in those cases. - -##### Isolated tests - -Not only can those tests not run in parallel, they also require specific environment tweaks, -typically system properties that need to be set before initialization. - -The build runs them one by one, *each in its own JVM fork*, after the serial tests. - -To isolate an integration test, annotate it with `@Category(IsolatedTests.class)`. The CCM rule -**must** be `CustomCcmRule`. - -For an example, see `HeapCompressionIT`. - -#### About test rules - -Do not mix `CcmRule` and `SimulacronRule` in the same test. It makes things harder to follow, and -can be inefficient (if the `SimulacronRule` is method-level, it will create a Simulacron cluster for -every test method, even those that only need CCM). - -##### Class-level rules - -Rules annotated with `@ClassRule` wrap the whole test class, and are reused across methods. Try to -use this as much as possible, as it's more efficient. The fields need to be static; also make them -final and use constant naming conventions, like `CCM_RULE`. - -When you use a server rule (`CcmRule` or `SimulacronRule`) and a `SessionRule` at the same level, -wrap them into a rule chain to ensure proper initialization order: - -```java -private static final CcmRule CCM_RULE = CcmRule.getInstance(); -private static final SessionRule SESSION_RULE = SessionRule.builder(CCM_RULE).build(); - -@ClassRule -public static final TestRule CHAIN = RuleChain.outerRule(CCM_RULE).around(SESSION_RULE); -``` - -##### Method-level rules - -Rules annotated with `@Rule` wrap each test method. Use lower-camel case for field names: - -```java -private CcmRule ccmRule = CcmRule.getInstance(); -private SessionRule sessionRule = SessionRule.builder(ccmRule).build(); - -@ClassRule -public TestRule chain = RuleChain.outerRule(ccmRule).around(sessionRule); -``` - -Only use this for: - -* CCM tests that use `@CassandraRequirement` or `@DseRequirement` restrictions at the method level - (ex: `BatchStatementIT`). -* tests where you *really* need to restart from a clean state for every method. - -##### Mixed - -It's also possible to use a `@ClassRule` for CCM / Simulacron, and a `@Rule` for the session rule. -In that case, you don't need to use a rule chain. - -## Running the tests - -### Unit tests - - mvn clean test - -This currently takes about 30 seconds. The goal is to keep it within a couple of minutes (it runs -for each commit if you enable the pre-commit hook -- see below). - -### Integration tests - - mvn clean verify - -This currently takes about 9 minutes. We don't have a hard limit, but ideally it should stay within -30 minutes to 1 hour. - -You can skip test categories individually with `-DskipParallelizableITs`, `-DskipSerialITs` and -`-DskipIsolatedITs` (`-DskipITs` still works to skip them all at once). - -### Configuring MacOS for Simulacron - -Simulacron (used in integration tests) relies on loopback aliases to simulate multiple nodes. On -Linux or Windows, you shouldn't have anything to do. On MacOS, run this script: - -``` -#!/bin/bash -for sub in {0..4}; do - echo "Opening for 127.0.$sub" - for i in {0..255}; do sudo ifconfig lo0 alias 127.0.$sub.$i up; done -done -``` - -Note that this is known to cause temporary increased CPU usage in OS X initially while mDNSResponder -acclimates itself to the presence of added IP addresses. This lasts several minutes. Also, this does -not survive reboots. - - -## License headers - -The build will fail if some license headers are missing. To update all files from the command line, -run: - -``` -mvn license:format -``` - -## Pre-commit hook (highly recommended) - -Ensure `pre-commit.sh` is executable, then run: - -``` -ln -s ../../pre-commit.sh .git/hooks/pre-commit -``` - -This will only allow commits if the tests pass. It is also a good reminder to keep the test suite -short. - -Note: the tests run on the current state of the working directory. I tried to add a `git stash` in -the script to only test what's actually being committed, but I couldn't get it to run reliably -(it's still in there but commented). Keep this in mind when you commit, and don't forget to re-add -the changes if the first attempt failed and you fixed the tests. - -## Speeding up the build for local tests - -If you need to install something in your local repository quickly, you can use the `fast` profile to -skip all "non-essential" checks (licenses, formatting, tests, etc): - -``` -mvn clean install -Pfast -``` - -You can speed things up even more by targeting specific modules with the `-pl` option: - -``` -mvn clean install -Pfast -pl core,query-builder,mapper-runtime,mapper-processor,bom -``` - -Please run the normal build at least once before you push your changes. - -## Commits - -Keep your changes **focused**. Each commit should have a single, clear purpose expressed in its -message. - -Resist the urge to "fix" cosmetic issues (add/remove blank lines, move methods, etc.) in existing -code. This adds cognitive load for reviewers, who have to figure out which changes are relevant to -the actual issue. If you see legitimate issues, like typos, address them in a separate commit (it's -fine to group multiple typo fixes in a single commit). - -Isolate trivial refactorings into separate commits. For example, a method rename that affects dozens -of call sites can be reviewed in a few seconds, but if it's part of a larger diff it gets mixed up -with more complex changes (that might affect the same lines), and reviewers have to check every -line. - -Commit message subjects start with a capital letter, use the imperative form and do **not** end -with a period: - -* correct: "Add test for CQL request handler" -* incorrect: "~~Added test for CQL request handler~~" -* incorrect: "~~New test for CQL request handler~~" - -Avoid catch-all messages like "Minor cleanup", "Various fixes", etc. They don't provide any useful -information to reviewers, and might be a sign that your commit contains unrelated changes. - -We don't enforce a particular subject line length limit, but try to keep it short. - -You can add more details after the subject line, separated by a blank line. The following pattern -(inspired by [Netty](http://netty.io/wiki/writing-a-commit-message.html)) is not mandatory, but -welcome for complex changes: - -``` -One line description of your change - -Motivation: - -Explain here the context, and why you're making that change. -What is the problem you're trying to solve. - -Modifications: - -Describe the modifications you've done. - -Result: - -After your change, what will change. -``` - -## Pull requests - -Like commits, pull requests should be focused on a single, clearly stated goal. - -Don't base a pull request onto another one, it's too complicated to follow two branches that evolve -at the same time. If a ticket depends on another, wait for the first one to be merged. +## Before You Submit a Pull Request -If you have to address feedback, avoid rewriting the history (e.g. squashing or amending commits): -this makes the reviewers' job harder, because they have to re-read the full diff and figure out -where your new changes are. Instead, push a new commit on top of the existing history; it will be -squashed later when the PR gets merged. If the history is complex, it's a good idea to indicate in -the message where the changes should be squashed: +### Pull Request Titles and Commit Messages -``` -* 20c88f4 - Address feedback (to squash with "Add metadata parsing logic") (36 minutes ago) -* 7044739 - Fix various typos in Javadocs (2 days ago) -* 574dd08 - Add metadata parsing logic (2 days ago) -``` +- Pull request titles must follow the format: `{JIRA-ticket-number}: {JIRA-ticket-title}` + Example: `CASSJAVA-40: Driver testing against Java 21` +- Before merging, squash your commits into a single commit with a message formatted as: + ``` + {JIRA-ticket-number}: {JIRA-ticket-title} + patch by {your-name}; reviewed by {approver-1} and {approver-2} for {JIRA-ticket-number} + ``` -(Note that the message refers to the other commit's subject line, not the SHA-1. This way it's still -relevant if there are intermediary rebases.) +### Coding Styles -If you need new stuff from the base branch, it's fine to rebase and force-push, as long as you don't -rewrite the history. Just give a heads up to the reviewers beforehand. Don't push a merge commit to -a pull request. +1. Javadoc: All types in API packages must be documented. For internal packages, documentation is optional. +2. Nullability: Please use the [Spotbugs annotations](https://spotbugs.github.io) to document nullability of new class or interface using the `@Nullable` and `@NonNull` annotations from the package `edu.umd.cs.findbugs.annotations`. +3. Static imports: Please only use static imports for AssertJ, Mockito, and Awaitility. +4. Concurrency annotations: Please use the [JCIP annotations](http://jcip.net/annotations/doc/index.html) to document thread-safety. \ No newline at end of file diff --git a/pre-commit.sh b/pre-commit.sh index 912564ae81e..60b115dedd4 100755 --- a/pre-commit.sh +++ b/pre-commit.sh @@ -16,16 +16,12 @@ # specific language governing permissions and limitations # under the License. -# STASH_NAME="pre-commit-$(date +%s)" -# git stash save --keep-index $STASH_NAME - -mvn clean test +mvn fmt:format +mvn xml-format:xml-format +mvn license:format +mvn clean install -DskipTests +mvn test RESULT=$? -# STASHES=$(git stash list) -# if [[ $STASHES == *$STASH_NAME* ]]; then -# git stash pop -# fi - [ $RESULT -ne 0 ] && exit 1 exit 0