diff --git a/content/develop/clients/dotnet/produsage.md b/content/develop/clients/dotnet/produsage.md index 9d3bf7be20..1dc8d5a8d0 100644 --- a/content/develop/clients/dotnet/produsage.md +++ b/content/develop/clients/dotnet/produsage.md @@ -28,6 +28,7 @@ progress in implementing the recommendations. {{< checklist-item "#event-handling" >}}Event handling{{< /checklist-item >}} {{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}} {{< checklist-item "#exception-handling" >}}Exception handling{{< /checklist-item >}} + {{< checklist-item "#retries" >}}Retries{{< /checklist-item >}} {{< /checklist >}} ## Recommendations @@ -110,3 +111,68 @@ the most common Redis exceptions: (for example, trying to access a [stream entry]({{< relref "/develop/data-types/streams#entry-ids" >}}) using an invalid ID). + +### Retries + +During the initial `ConnectionMultiplexer.Connect()` call, `NRedisStack` will +keep trying to connect if the first attempt fails. By default, it will make +three attempts, but you can configure the number of retries using the +`ConnectRetry` configuration option: + +```cs +var muxer = ConnectionMultiplexer.Connect(new ConfigurationOptions { + ConnectRetry = 5, // Retry up to five times. + . + . +}); +``` + +After the initial `Connect()` call is successful, `NRedisStack` will +automatically attempt to reconnect if the connection is lost. You can +specify a reconnection strategy with the `ReconnectRetryPolicy` configuration +option. `NRedisStack` provides two built-in classes that implement +reconnection strategies: + +- `ExponentialRetry`: (Default) Uses an + [exponential backoff](https://en.wikipedia.org/wiki/Exponential_backoff) + strategy, where you specify an increment to the delay between successive + attempts and, optionally, a maximum delay, both in milliseconds. +- `LinearRetry`: Uses a linear backoff strategy with a fixed delay between + attempts, in milliseconds. 
+ +The example below shows how to use the `ExponentialRetry` class: + +```cs +var muxer = ConnectionMultiplexer.Connect(new ConfigurationOptions { + // 500ms increment per attempt, max 2000ms. + ReconnectRetryPolicy = new ExponentialRetry(500, 2000), + . + . +}); +``` + +You can also implement your own custom retry policy by creating a class +that implements the `IReconnectRetryPolicy` interface. + +`NRedisStack` doesn't provide an automated retry mechanism for commands, but +you can implement your own retry logic in your application code. Use +a loop with a `try`/`catch` block to catch `RedisConnectionException` and +`RedisTimeoutException` exceptions and then retry the command after a +suitable delay, as shown in the example below: + +```cs +const int MAX_RETRIES = 3; + +for (int i = 0; i < MAX_RETRIES; i++) { + try { + string value = db.StringGet("foo"); + break; + } catch (RedisConnectionException) { + // Wait before retrying. + Thread.Sleep(500 * (i + 1)); + } catch (RedisTimeoutException) { + // Wait before retrying. + Thread.Sleep(500 * (i + 1)); + } +} +``` diff --git a/content/develop/clients/go/produsage.md b/content/develop/clients/go/produsage.md index 166f898470..6d3ece994d 100644 --- a/content/develop/clients/go/produsage.md +++ b/content/develop/clients/go/produsage.md @@ -28,6 +28,8 @@ progress in implementing the recommendations. {{< checklist-item "#health-checks" >}}Health checks{{< /checklist-item >}} {{< checklist-item "#error-handling" >}}Error handling{{< /checklist-item >}} {{< checklist-item "#monitor-performance-and-errors">}}Monitor performance and errors{{< /checklist-item >}} + {{< checklist-item "#retries" >}}Retries{{< /checklist-item >}} + {{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}} {{< /checklist >}} ## Recommendations @@ -68,3 +70,55 @@ you trace command execution and monitor your server's performance. You can use this information to detect problems before they are reported by users. 
See [Observability]({{< relref "/develop/clients/go#observability" >}}) for more information. + +### Retries + +`go-redis` will automatically retry failed connections and commands. By +default, the number of attempts is set to three, but you can change this +using the `MaxRetries` field of `Options` when you connect. The retry +strategy starts with a short delay between the first and second attempts, +and increases the delay with each attempt. The initial delay is set +with the `MinRetryBackoff` option (defaulting to 8 milliseconds) and the +maximum delay is set with the `MaxRetryBackoff` option (defaulting to +512 milliseconds): + +```go +client := redis.NewClient(&redis.Options{ + MinRetryBackoff: 10 * time.Millisecond, + MaxRetryBackoff: 100 * time.Millisecond, + MaxRetries: 5, +}) +``` + +You can use the observability features of `go-redis` to monitor the +number of retries and the time taken for each attempt, as noted in the +[Monitor performance and errors](#monitor-performance-and-errors) section +above. Use this data to help you decide on the best retry settings +for your application. + +### Timeouts + +`go-redis` supports timeouts for connections and commands to avoid +stalling your app if the server does not respond within a reasonable time. +The `DialTimeout` field of `Options` sets the timeout for connections, +and the `ReadTimeout` and `WriteTimeout` fields set the timeouts for +reading and writing data, respectively. The default timeout is five seconds +for connections and three seconds for reading and writing data, but you can +set your own timeouts when you connect: + +```go +client := redis.NewClient(&redis.Options{ + DialTimeout: 10 * time.Second, + ReadTimeout: 5 * time.Second, + WriteTimeout: 5 * time.Second, +}) +``` + +You can use the observability features of `go-redis` to monitor the +frequency of timeouts, as noted in the +[Monitor performance and errors](#monitor-performance-and-errors) section +above. 
Use this data to help you decide on the best timeout settings +for your application. If timeouts are set too short, then `go-redis` +might retry commands that would have succeeded if given more time. However, +if they are too long, your app might hang unnecessarily while waiting for a +response that will never arrive. diff --git a/content/develop/clients/jedis/connect.md b/content/develop/clients/jedis/connect.md index 5295e046de..66ac0bf5eb 100644 --- a/content/develop/clients/jedis/connect.md +++ b/content/develop/clients/jedis/connect.md @@ -368,3 +368,53 @@ poolConfig.setTimeBetweenEvictionRuns(Duration.ofSeconds(1)); // to prevent connection starvation JedisPooled jedis = new JedisPooled(poolConfig, "localhost", 6379); ``` + +### Retrying a command after a connection failure + +If a connection is lost before a command is completed, the command will fail with a `JedisConnectionException`. Although the connection pool manages the connections +for you, you must request a new connection from the pool to retry the command. +You would typically do this in a loop that makes several attempts to reconnect +before aborting and reporting that the error isn't transient. 
The example below +shows a retry loop that uses a simple +[exponential backoff](https://en.wikipedia.org/wiki/Exponential_backoff) +strategy: + +```java +JedisPooled jedis = new JedisPooled( + new HostAndPort("localhost", 6379), + clientConfig, + poolConfig +); + +// Set max retry attempts +final int MAX_RETRIES = 5; + +// Example of retrying a command +String key = "retry-example"; +String value = "success"; + +int attempts = 0; +boolean success = false; + +while (!success && attempts < MAX_RETRIES) { + try { + attempts++; + String result = jedis.set(key, value); + System.out.println("Command succeeded on attempt " + attempts + ": " + result); + success = true; + } catch (JedisConnectionException e) { + System.out.println("Connection failed on attempt " + attempts + ": " + e.getMessage()); + if (attempts >= MAX_RETRIES) { + System.out.println("Max retries reached. Giving up."); + throw e; + } + + // Wait before retrying + try { + Thread.sleep(500L << (attempts - 1)); // Exponential backoff: 500 ms, 1 s, 2 s, 4 s + } catch (InterruptedException ie) { + Thread.currentThread().interrupt(); + } + } +} +``` diff --git a/content/develop/clients/jedis/produsage.md b/content/develop/clients/jedis/produsage.md index d78a94193e..26250b9f53 100644 --- a/content/develop/clients/jedis/produsage.md +++ b/content/develop/clients/jedis/produsage.md @@ -26,6 +26,7 @@ progress in implementing the recommendations. {{< checklist "prodlist" >}} {{< checklist-item "#connection-pooling" >}}Connection pooling{{< /checklist-item >}} + {{< checklist-item "#connection-retries" >}}Connection retries{{< /checklist-item >}} {{< checklist-item "#client-side-caching" >}}Client-side caching{{< /checklist-item >}} {{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}} {{< checklist-item "#health-checks" >}}Health checks{{< /checklist-item >}} @@ -51,6 +52,13 @@ write your own code to cache and reuse open connections.
See [Connect with a connection pool]({{< relref "/develop/clients/jedis/connect#connect-with-a-connection-pool" >}}) to learn how to use this technique with Jedis. +### Connection retries + +If a connection is lost before a command is completed, the command will fail with a `JedisConnectionException`. However, a connection error is often transient, in which case the +command will succeed after one or more reconnection attempts. See +[Retrying a command after a connection failure]({{< relref "/develop/clients/jedis/connect#retrying-a-command-after-a-connection-failure" >}}) +for an example of a simple retry loop that can recover from a transient connection error. + ### Client-side caching [Client-side caching]({{< relref "/develop/clients/client-side-caching" >}}) diff --git a/content/develop/clients/lettuce/produsage.md b/content/develop/clients/lettuce/produsage.md index 17364283e1..f4104708ee 100644 --- a/content/develop/clients/lettuce/produsage.md +++ b/content/develop/clients/lettuce/produsage.md @@ -29,6 +29,7 @@ progress in implementing the recommendations. {{< checklist-item "#cluster-topology-refresh">}}Cluster topology refresh{{< /checklist-item >}} {{< checklist-item "#dns-cache-and-redis" >}}DNS cache and Redis{{< /checklist-item >}} {{< checklist-item "#exception-handling" >}}Exception handling{{< /checklist-item >}} + {{< checklist-item "#connection-and-execution-reliability" >}}Connection and execution reliability{{< /checklist-item >}} {{< /checklist >}} ## Recommendations @@ -189,3 +190,51 @@ See the Error handling sections of the [Lettuce async](https://redis.github.io/lettuce/user-guide/async-api/#error-handling) and [Lettuce reactive](https://redis.github.io/lettuce/user-guide/reactive-api/#error-handling) API guides to learn more about handling exceptions. + + +## Connection and execution reliability + +By default, Lettuce uses an *at-least-once* strategy for command execution. 
+It will automatically reconnect after a disconnection and resume executing +any commands that were queued when the connection was lost. If you +switch to *at-most-once* execution, Lettuce will +not reconnect after a disconnection and will discard commands +instead of queuing them. You can enable at-most-once execution by setting +`autoReconnect(false)` in the +`ClientOptions` when you create the client, as shown in the example below: + +```java +RedisURI uri = RedisURI.Builder + .redis("localhost", 6379) + .withAuthentication("default", "yourPassword") + .build(); + +RedisClient client = RedisClient.create(uri); + +client.setOptions(ClientOptions.builder() + .autoReconnect(false) + . + . + .build()); +``` + +If you need finer control over which commands you want to execute in which mode, you can +configure a *replay filter* to choose the commands that should retry after a disconnection. +The example below shows a filter that retries all commands except for +[`DECR`]({{< relref "/commands/decr" >}}) +(this command is not [idempotent](https://en.wikipedia.org/wiki/Idempotence) and +so you might need to avoid executing it more than once). Note that +replay filters are only available in Lettuce v6.6 and above. + +```java +Predicate<RedisCommand<?, ?, ?>> filter = + cmd -> cmd.getType().toString().equalsIgnoreCase("DECR"); + +client.setOptions(ClientOptions.builder() + .replayFilter(filter) + .build()); +``` + +See +[Command execution reliability](https://redis.github.io/lettuce/advanced-usage/#command-execution-reliability) +in the Lettuce reference guide for more information. diff --git a/content/develop/clients/nodejs/produsage.md b/content/develop/clients/nodejs/produsage.md index b6be0d89f9..80403c982a 100644 --- a/content/develop/clients/nodejs/produsage.md +++ b/content/develop/clients/nodejs/produsage.md @@ -27,7 +27,8 @@ progress in implementing the recommendations.
{{< checklist "nodeprodlist" >}} {{< checklist-item "#handling-errors" >}}Handling errors{{< /checklist-item >}} {{< checklist-item "#handling-reconnections" >}}Handling reconnections{{< /checklist-item >}} - {{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}} + {{< checklist-item "#connection-timeouts" >}}Connection timeouts{{< /checklist-item >}} + {{< checklist-item "#command-execution-reliability" >}}Command execution reliability{{< /checklist-item >}} {{< /checklist >}} ## Recommendations @@ -63,10 +64,12 @@ own custom strategy. See [Reconnect after disconnection]({{< relref "/develop/clients/nodejs/connect#reconnect-after-disconnection" >}}) for more information. -### Timeouts +### Connection timeouts -To set a timeout for a connection, use the `connectTimeout` option: -```typescript +To set a timeout for a connection, use the `connectTimeout` option +(the default timeout is 5 seconds): + +```js const client = createClient({ socket: { // setting a 10-second timeout @@ -74,4 +77,31 @@ const client = createClient({ } }); client.on('error', error => console.error('Redis client error:', error)); -``` \ No newline at end of file +``` + +### Command execution reliability + +By default, `node-redis` reconnects automatically when the connection is lost +(but see [Handling reconnections](#handling-reconnections), if you want to +customize this behavior). While the connection is down, any commands that you +execute will be queued and sent to the server when the connection is restored. +This might occasionally cause problems if the connection fails while a +[non-idempotent](https://en.wikipedia.org/wiki/Idempotence) command +is being executed. In this case, the command could change the data on the server +without the client removing it from the queue. When the connection is restored, +the command will be sent again, resulting in incorrect data. 
+ +If you need to avoid this situation, set the `disableOfflineQueue` option +to `true` when you create the client. This will cause the client to discard +unexecuted commands rather than queuing them: + +```js +const client = createClient({ + disableOfflineQueue: true, + . + . +}); +``` + +Use a separate connection with the queue disabled if you want to avoid queuing +only for specific commands. diff --git a/content/develop/clients/redis-py/connect.md b/content/develop/clients/redis-py/connect.md index df28635718..0d6bbb6e1a 100644 --- a/content/develop/clients/redis-py/connect.md +++ b/content/develop/clients/redis-py/connect.md @@ -242,3 +242,13 @@ r3.close() pool.close() ``` + +## Retrying connections + +A connection will sometimes fail because of a transient problem, such as a +network outage or a server that is temporarily unavailable. In these cases, +retrying the connection after a short delay will usually succeed. `redis-py` uses +a simple retry strategy by default, but there are various ways you can customize +this behavior to suit your use case. See +[Retries]({{< relref "/develop/clients/redis-py/produsage#retries" >}}) +for more information about custom retry strategies, with example code. diff --git a/content/develop/clients/redis-py/produsage.md b/content/develop/clients/redis-py/produsage.md index 6144609771..d767fbc6be 100644 --- a/content/develop/clients/redis-py/produsage.md +++ b/content/develop/clients/redis-py/produsage.md @@ -29,6 +29,7 @@ progress in implementing the recommendations. {{< checklist-item "#retries" >}}Retries{{< /checklist-item >}} {{< checklist-item "#health-checks" >}}Health checks{{< /checklist-item >}} {{< checklist-item "#exception-handling" >}}Exception handling{{< /checklist-item >}} + {{< checklist-item "#timeouts" >}}Timeouts{{< /checklist-item >}} {{< /checklist >}} ## Recommendations @@ -170,3 +171,29 @@ module. The list below describes some of the most common exceptions. 
- `WatchError`: Thrown when a [watched key]({{< relref "/develop/clients/redis-py/transpipe#watch-keys-for-changes" >}}) is modified during a transaction. + +### Timeouts + +After you issue a command or a connection attempt, the client will wait +for a response from the server. If the server doesn't respond within a +certain time limit, the client will throw a `TimeoutError`. By default, +the timeout happens after 10 seconds for both connections and commands, but you +can set your own timeouts using the `socket_connect_timeout` and `socket_timeout` parameters +when you connect: + +```py +# Set a 15-second timeout for connections and a +# 5-second timeout for commands. +r = Redis( + socket_connect_timeout=15, + socket_timeout=5, + . + . +) +``` + +Take care to set the timeouts to appropriate values for your use case. +If you use timeouts that are too short, then `redis-py` might retry +commands that would have succeeded if given more time. However, if the +timeouts are too long, your app might hang unnecessarily while waiting for a +response that will never arrive.
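The trade-off described in the Timeouts section can be made concrete with some rough arithmetic. The sketch below is not part of `redis-py`. The helper `worst_case_wait` and its doubling backoff schedule are hypothetical, chosen only to illustrate how the command timeout and the number of attempts together bound the time a caller can be blocked when every attempt times out.

```python
def worst_case_wait(socket_timeout, attempts, base_delay, max_delay):
    """Upper bound, in seconds, on how long one command can block.

    Assumes every attempt times out and that the delay between attempts
    doubles from base_delay up to max_delay. This schedule is a
    hypothetical one for illustration, not redis-py's actual default.
    """
    total = 0.0
    for attempt in range(attempts):
        # Each attempt waits for the full timeout before failing.
        total += socket_timeout
        # Back off before the next attempt (no delay after the last one).
        if attempt < attempts - 1:
            total += min(max_delay, base_delay * 2 ** attempt)
    return total


# 5-second command timeout, 4 attempts, delays of 0.5 s, 1 s, and 2 s.
print(worst_case_wait(5, 4, 0.5, 2.0))    # 23.5

# Halving the timeout tightens the bound, at the cost of giving
# slow commands less time to finish.
print(worst_case_wait(2.5, 4, 0.5, 2.0))  # 13.5
```

Under these assumptions, the timeout dominates the total: with four attempts, each extra second of `socket_timeout` adds four seconds to the worst case, so it may be worth choosing the timeout and the retry count together rather than separately.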