Skip to content

Conversation

@pav-kv
Copy link
Contributor

@pav-kv pav-kv commented May 18, 2023

This change fixes a bug in the Inflights tracker. The reset() method did not zero the bytes counter, which could result in a quota "leak" and delayed or stalled MsgApp sends.

The reset() method is used when the replication flow changes state between Probe/Replicate/Snapshot. If reset() is not called at an appropriate moment, when Inflights.Full(), the bytes counter would stay over the budget and stall the flow.

The test added in this PR failed before, and passes after the change.

This change fixes a bug in the Inflights tracker. The reset() method did not
zero the bytes counter, which could result in a quota "leak" and delayed or
stalled MsgApp sends.

The reset() method is used when the replication flow changes state between
Probe/Replicate/Snapshot. If reset() is not called at an appropriate moment,
when Inflights.Full(), the bytes counter would stay over the budget and stall
the flow.

Signed-off-by: Pavel Kalinnikov <[email protected]>
@pav-kv pav-kv requested a review from tbg May 18, 2023 09:18
Copy link
Contributor

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find!

@pav-kv
Copy link
Contributor Author

pav-kv commented May 18, 2023

@ahrtr FYI, this bug shouldn't affect etcd (I assume you don't yet use the MaxInflightBytes option).

@ahrtr
Copy link
Member

ahrtr commented May 18, 2023

@ahrtr FYI, this bug shouldn't affect etcd (I assume you don't yet use the MaxInflightBytes option).

No, etcd doesn't use it at all.

Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.
thx

@ahrtr ahrtr merged commit 4967cff into etcd-io:main May 18, 2023
@pav-kv pav-kv deleted the fix-inflight-bytes-leak branch May 18, 2023 10:58
pav-kv added a commit to cockroachdb/raft that referenced this pull request May 19, 2023
…-bytes-leak

tracker: fix byte counter leak in Inflights tracker

This change fixes a bug in the Inflights tracker. The reset() method did not
zero the bytes counter, which could result in a quota "leak" and delayed or
stalled MsgApp sends.

The reset() method is used when the replication flow changes state between
Probe/Replicate/Snapshot. If reset() is not called at an appropriate moment,
when Inflights.Full(), the bytes counter would stay over the budget and stall
the flow.

Signed-off-by: Pavel Kalinnikov <[email protected]>
pav-kv added a commit to pav-kv/cockroach that referenced this pull request May 19, 2023
This commit cherry-picks a bug fix for Inflights type into the 23.1 release.
The fix is cherry-picked onto a branch parallel to the 23.1 release, in our
fork of etcd-io/raft.

Release justification: a bug fix in range availability
Epic: none
Release note: none
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants