
kvserver: Under race we use a different batch that could cause RevertRange to fail #60710

Description

@adityamaru

While writing tests for cluster-to-cluster streaming, we ran into a unique failure mode when running under race. In particular, once the stream ingestion job is "cut over", it attempts to RevertRange its target span to bring the cluster to a consistent state. RevertRange uses MVCCClearTimeRange, which issues a ClearRange if it is able to buffer a long enough (>64 KVs) run of keys to clear. Under race, this call uses a spanSetBatch instead of a pebbleBatch, per this logic:
https://github.com/cockroachdb/cockroach/blob/master/pkg/kv/kvserver/replica_write.go#L692
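
As a rough, self-contained sketch of that selection (the names pebbleLikeBatch, spanSetLikeBatch, raceEnabled, and newEvalBatch are illustrative stand-ins, not CockroachDB's actual types): under the race detector the raw batch gets wrapped in a span-asserting batch, so the extra validation described below only runs in race builds.

```go
package main

import "fmt"

// batch is a minimal stand-in for the engine batch RevertRange writes through.
type batch interface {
	clearRange(start, end []byte) error
}

// pebbleLikeBatch plays the role of pebbleBatch: no extra assertions.
type pebbleLikeBatch struct{}

func (pebbleLikeBatch) clearRange(start, end []byte) error { return nil }

// spanSetLikeBatch plays the role of spanSetBatch: it asserts the requested
// span before delegating to the wrapped batch.
type spanSetLikeBatch struct{ inner batch }

func (b spanSetLikeBatch) clearRange(start, end []byte) error {
	if string(start) >= string(end) {
		return fmt.Errorf("invalid span [%q, %q)", start, end)
	}
	return b.inner.clearRange(start, end)
}

// raceEnabled plays the role of util.RaceEnabled: true only in race builds.
const raceEnabled = true

func newEvalBatch() batch {
	if raceEnabled {
		return spanSetLikeBatch{inner: pebbleLikeBatch{}}
	}
	return pebbleLikeBatch{}
}

func main() {
	fmt.Println(newEvalBatch().clearRange([]byte("b"), []byte("c"))) // <nil>
	fmt.Println(newEvalBatch().clearRange([]byte("a"), []byte("a"))) // error: start not before end
}
```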

In the case of a spanSetBatch, there is an additional CheckAllowedRange() call in the stack that ensures the RevertRange request span is valid before clearing the keys:

```go
func (s *SpanSet) CheckAllowed(access SpanAccess, span roachpb.Span) error {
```

One of the checks in this method ensures that the StartKey in the RevertRange request is lexicographically before the EndKey:

```go
func (s Span) Valid() bool {
```
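
Roughly paraphrased (this is not the exact roachpb.Span.Valid implementation, just the property relied on here): when EndKey is set, the start key must sort strictly before it.

```go
package main

import (
	"bytes"
	"fmt"
)

// span mimics only the fields of roachpb.Span that matter here.
type span struct {
	key, endKey []byte
}

// valid paraphrases the property being checked: a point request has no end
// key; a ranged request must have its start key strictly before its end key.
func (s span) valid() bool {
	if len(s.endKey) == 0 {
		return true
	}
	return bytes.Compare(s.key, s.endKey) < 0
}

func main() {
	fmt.Println(span{key: []byte("a"), endKey: []byte("b")}.valid()) // true
	fmt.Println(span{key: []byte("a"), endKey: []byte("a")}.valid()) // false: start == end
}
```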

If the run of keys to be ClearRange'd by the RevertRangeRequest happens to consist of >64 versions of the same key, then the StartKey and EndKey of the resulting span are equal, thereby violating this invariant when run under race.
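
To make that concrete, here is a small illustration; mvccKey and spanOfRun are hypothetical simplifications of MVCCClearTimeRange's buffering, not the real code. When every buffered version shares the same user key, the span handed to ClearRange collapses to [k, k), which fails the check above.

```go
package main

import (
	"bytes"
	"fmt"
)

// mvccKey is a simplified versioned key: versions of the same user key differ
// only in their timestamp.
type mvccKey struct {
	key []byte
	ts  int64
}

// spanOfRun derives the ClearRange span from the first and last buffered keys
// of the run, which is the shape of the problem described above.
func spanOfRun(run []mvccKey) (start, end []byte) {
	return run[0].key, run[len(run)-1].key
}

func main() {
	// Buffer 65 (>64) versions of the single key "a".
	run := make([]mvccKey, 65)
	for i := range run {
		run[i] = mvccKey{key: []byte("a"), ts: int64(i + 1)}
	}
	start, end := spanOfRun(run)
	// start == end, so the span fails the validity check above; the ordinary
	// pebbleBatch (non-race build) never performs that check.
	fmt.Println(bytes.Compare(start, end) < 0) // false
}
```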

Jira issue: CRDB-3153

Labels

A-disaster-recovery, A-tenant-streaming (including cluster streaming), C-bug (code not up to spec/doc, specs & docs deemed correct; solution expected to change code/behavior)
