Closed
Labels
A-kv-distribution: Relating to rebalancing and leasing.
C-test-failure: Broken test (automatically or manually discovered).
P-1: Issues/test failures with a fix SLA of 1 month.
X-duplicate: Closed as a duplicate of another issue.
branch-master: Failures and bugs on the master branch.
Description
roachtest.acceptance/version-upgrade failed with artifacts on master @ e19c24fb62d24595e74c0bae0aaad0a736c2bdc7:
(mixedversion.go:547).Run: mixed-version test failure while running step 34 (restart node 1 with binary version master): COMMAND_PROBLEM: exit status 1
test artifacts and logs in: /artifacts/acceptance/version-upgrade/run_1
Parameters: ROACHTEST_arch=amd64, ROACHTEST_cloud=gce, ROACHTEST_cpu=4, ROACHTEST_encrypted=false, ROACHTEST_metamorphicBuild=false, ROACHTEST_ssd=0
[mixed-version-test/34_restart-node-1-with-binary-version-master] 23:49:16 runner.go:336: mixed-version test failure while running step 34 (restart node 1 with binary version master): COMMAND_PROBLEM: exit status 1
CRDB logs from node 1 show the following messages logged repeatedly during shutdown:
I231115 23:49:17.685544 55860 kv/kvserver/store.go:1759 ⋮ [T1,drain,n1,s1,r426/1:‹/Table/19{2-3}›] 1453 failed to transfer lease repl=(n1,s1):1 seq=26 start=1700092018.354238817,0 epo=34 pro=1700092018.357644666,0 for range r426:‹/Table/19{2-3}› [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=39] when draining: ‹no suitable transfer target found›
I231115 23:49:17.685626 55860 kv/kvserver/store.go:1760 ⋮ [T1,drain,n1,s1,r426/1:‹/Table/19{2-3}›] 1454 blocked for 93 microseconds on transfer attempt
W231115 23:49:18.556118 16550 kv/kvserver/store.go:1834 ⋮ [drain] 1455 unable to drain cleanly within 5s (cluster setting ‹server.shutdown.lease_transfer_wait›), service might briefly deteriorate if the node is terminated: waiting for 1 replicas to transfer their lease away
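To gauge how often the drain loop retried a lease transfer, messages like the ones above can be tallied per range. The following is a minimal sketch that assumes only the message format visible in this excerpt; the function name `count_failed_transfers` and the regular expression are illustrative and are not part of CockroachDB or roachtest tooling.

```python
import re
from collections import Counter

# Assumed message shape, taken from the excerpt above:
#   "... failed to transfer lease ... for range r<ID>:... when draining: <reason>"
FAILED_TRANSFER = re.compile(
    r"failed to transfer lease .* for range r(\d+):.* when draining: (.*)$"
)


def count_failed_transfers(lines):
    """Count failed drain-time lease transfers per (range ID, reason)."""
    counts = Counter()
    for line in lines:
        match = FAILED_TRANSFER.search(line)
        if match is None:
            continue
        range_id = int(match.group(1))
        # Drop the redaction markers and surrounding whitespace around the reason.
        reason = match.group(2).strip("‹› ")
        counts[(range_id, reason)] += 1
    return counts
```

Passing the lines of a node's log file to this function returns a counter keyed by range ID and failure reason, which makes it easy to see whether a single range (here r426) accounts for all of the retries.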
Warnings from the KV distribution log (cockroach-kv-distribution.teamcity-12702441-1700091323-06-n4cpu4-0001.ubuntu.2023-11-15T23_47_19Z.011989.log):
W231115 23:47:51.614476 287 13@kv/kvserver/store_rebalancer.go:323 ⋮ [T1,n1,s1,store-rebalancer,obj=‹cpu›] 157 StorePool missing descriptor for local store with ID 1, store list ‹ candidate: avg-ranges=161.50 avg-leases=108.50 avg-disk-usage=19 MiB avg-queries-per-second=10.44 avg-store-cpu-per-second=4ms›
W231115 23:47:51.614476 287 13@kv/kvserver/store_rebalancer.go:323 ⋮ [T1,n1,s1,store-rebalancer,obj=‹cpu›] 157 +‹ 4: ranges=158 leases=154 disk-usage=12 MiB queries-per-second=19.74 store-cpu-per-second=6ms io-overload=0.00›
W231115 23:47:51.614476 287 13@kv/kvserver/store_rebalancer.go:323 ⋮ [T1,n1,s1,store-rebalancer,obj=‹cpu›] 157 +‹ 2: ranges=165 leases=63 disk-usage=25 MiB queries-per-second=1.14 store-cpu-per-second=2ms io-overload=0.00›
W231115 23:47:51.614618 287 13@kv/kvserver/store_rebalancer.go:431 ⋮ [T1,n1,s1,store-rebalancer,obj=‹cpu›] 158 no rebalance context given, bailing out of rebalancing store, will try again later
W231115 23:48:39.194253 287 13@kv/kvserver/store_rebalancer.go:323 ⋮ [T1,n1,s1,store-rebalancer,obj=‹cpu›] 159 StorePool missing descriptor for local store with ID 1, store list ‹ candidate: avg-ranges=163.67 avg-leases=73.67 avg-disk-usage=26 MiB avg-queries-per-second=14.23 avg-store-cpu-per-second=8ms›
W231115 23:48:39.194253 287 13@kv/kvserver/store_rebalancer.go:323 ⋮ [T1,n1,s1,store-rebalancer,obj=‹cpu›] 159 +‹ 4: ranges=159 leases=138 disk-usage=12 MiB queries-per-second=24.22 store-cpu-per-second=6ms io-overload=0.00›
W231115 23:48:39.194253 287 13@kv/kvserver/store_rebalancer.go:323 ⋮ [T1,n1,s1,store-rebalancer,obj=‹cpu›] 159 +‹ 3: ranges=167 leases=12 disk-usage=38 MiB queries-per-second=0.83 store-cpu-per-second=7ms io-overload=0.00›
W231115 23:48:39.194253 287 13@kv/kvserver/store_rebalancer.go:323 ⋮ [T1,n1,s1,store-rebalancer,obj=‹cpu›] 159 +‹ 2: ranges=165 leases=71 disk-usage=27 MiB queries-per-second=17.65 store-cpu-per-second=12ms io-overload=0.00›
W231115 23:48:39.194389 287 13@kv/kvserver/store_rebalancer.go:431 ⋮ [T1,n1,s1,store-rebalancer,obj=‹cpu›] 160 no rebalance context given, bailing out of rebalancing store, will try again later
Jira issue: CRDB-33554