@erikgrinaker

Backports etcd-io#52 on behalf of @tbg.

Move from StateProbe->StateReplicate on heartbeat response when possible
See individual commits. Essentially, when a fully caught-up follower was
reported unreachable, it'd transition to StateProbe but then wouldn't
recover from that via heartbeats (once they resumed).

This caused some issues in CRDB because we rely on the reported status
to reason about the safety of leadership changes, etc.

This PR makes StateProbe resolve on its own: when the leader hears back
from the follower via a heartbeat response, it sends an empty MsgApp, and
in response to the follower's acknowledgment moves the follower back into
StateReplicate.
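
The transition described above can be sketched with a toy model. This is not the etcd-io/raft implementation: the `Leader` and `Progress` types and the `HandleHeartbeatResp` and `HandleAppResp` methods are hypothetical simplifications that assume a single follower and ignore terms, rejections, and flow control. Only the names `StateProbe`, `StateReplicate`, `MsgApp`, and `ReportUnreachable` come from the PR text.

```go
package main

import "fmt"

// ProgressState is a toy stand-in for the leader's per-follower state.
type ProgressState int

const (
	StateProbe ProgressState = iota
	StateReplicate
)

func (s ProgressState) String() string {
	if s == StateReplicate {
		return "StateReplicate"
	}
	return "StateProbe"
}

// Progress is a simplified, hypothetical view of one follower.
type Progress struct {
	State ProgressState
	Match uint64 // highest log index known to be replicated on the follower
}

// Leader is a hypothetical, heavily reduced leader.
type Leader struct {
	LastIndex uint64
	Followers map[uint64]*Progress
	Sent      []string // messages the leader would have sent, for inspection
}

// ReportUnreachable drops a replicating follower back to StateProbe.
func (l *Leader) ReportUnreachable(id uint64) {
	if pr, ok := l.Followers[id]; ok && pr.State == StateReplicate {
		pr.State = StateProbe
	}
}

// HandleHeartbeatResp sends an (empty) MsgApp if the follower is behind
// or, as assumed here to mirror the PR, if it is still being probed.
func (l *Leader) HandleHeartbeatResp(id uint64) {
	pr, ok := l.Followers[id]
	if !ok {
		return
	}
	if pr.Match < l.LastIndex || pr.State == StateProbe {
		l.Sent = append(l.Sent, "MsgApp")
	}
}

// HandleAppResp processes a successful append acknowledgment at index.
// A probed follower returns to StateReplicate even if Match did not advance.
func (l *Leader) HandleAppResp(id, index uint64) {
	pr, ok := l.Followers[id]
	if !ok || index < pr.Match {
		return
	}
	pr.Match = index
	if pr.State == StateProbe {
		pr.State = StateReplicate
	}
}

func main() {
	l := &Leader{
		LastIndex: 10,
		Followers: map[uint64]*Progress{2: {State: StateReplicate, Match: 10}},
	}
	l.ReportUnreachable(2)
	fmt.Println(l.Followers[2].State) // StateProbe
	l.HandleHeartbeatResp(2)          // leader sends an empty MsgApp
	l.HandleAppResp(2, 10)            // follower acknowledges it
	fmt.Println(l.Followers[2].State) // StateReplicate
}
```

In this sketch, a caught-up follower would stay in `StateProbe` indefinitely if the heartbeat handler only sent a `MsgApp` when `Match < LastIndex`, which is the situation the PR describes.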

Touches cockroachdb/cockroach#104304.
Touches cockroachdb/cockroach#101624.

Signed-off-by: Tobias Grieger [email protected]

tbg added 4 commits July 7, 2023 10:10
This makes `go test -rewrite ./...` stable.

Signed-off-by: Tobias Grieger <[email protected]>
It will be touched in this PR.

Signed-off-by: Tobias Grieger <[email protected]>
After a call to `ReportUnreachable`, a fully caught-up follower would end up in
StateProbe and not leave it despite responding to heartbeats. This is a bug
which is going to be fixed in a follow-up commit.

Signed-off-by: Tobias Grieger <[email protected]>
See individual commits. Essentially, when a fully caught-up follower was
reported unreachable, it'd transition to `StateProbe` but then wouldn't
recover from that via heartbeats (once they resumed).

This caused some issues in CRDB because we rely on the reported status
to reason about the safety of leadership changes, etc.

This PR makes StateProbe resolve on its own: when the leader hears back
from the follower via a heartbeat response, it sends an empty MsgApp, and
in response to the follower's acknowledgment moves the follower back into
StateReplicate.

Signed-off-by: Tobias Grieger <[email protected]>
@erikgrinaker erikgrinaker requested a review from tbg July 7, 2023 10:12
@erikgrinaker erikgrinaker self-assigned this Jul 7, 2023