Closed
Description
Tarantool version:
| Tarantool 2.7.0-74-g6ad720d6d
| Target: Darwin-x86_64-RelWithDebInfo
| Build options: cmake . -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_BACKTRACE=ON
| Compiler: /Library/Developer/CommandLineTools/usr/bin/cc /Library/Developer/CommandLineTools/usr/bin/c++
| C_FLAGS: -Wno-unknown-pragmas -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -msse2 -std=c11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-gnu-alignof-expression -Werror
| CXX_FLAGS: -Wno-unknown-pragmas -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -msse2 -std=c++11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-invalid-offsetof -Wno-gnu-alignof-expression -Werror
OS version: OSX
Bug description: https://gitlab.com/tarantool/tarantool/-/jobs/874058059#L4993
Results file checksum: d41d8cd98f00b204e9800998ecf8427e
[067] sql-tap/index4.test.lua vinyl Test timeout of 110 secs reached [ fail ]
[067] Test failed! Output from reject file /tmp/tnt/rejects/sql-tap/index4.reject:
[067]
[067] Last 15 lines of Tarantool Log file [Instance "app_server"][/tmp/tnt/067_sql-tap/index4.test.lua:vinyl.tarantool.log]:
[067] Worker "067_sql-tap" got failed test; restarted the server
How to reproduce:
- use mac host with OSX
- create 2 similar repositories tnt and tarantool
- enter tnt repository, and patch test-run sub-directory
commit 5f55b46dc56f95ca98e0277ee8cfb75210a8bbfd (origin/avtikhon/tmp_all_flaky_fixes_min_prs)
Author: Alexander V. Tikhonov <[email protected]>
Date: Thu Dec 3 09:53:02 2020 +0000
Enable fragile list tests run in parallel
diff --git a/lib/worker.py b/lib/worker.py
index a8643fe..698e9d2 100644
--- a/lib/worker.py
+++ b/lib/worker.py
@@ -89,7 +89,7 @@ def get_task_groups():
res[key + '_fragile'] = {
'gen_worker': gen_worker,
'task_ids': fragile_task_ids,
- 'is_parallel': False,
+ 'is_parallel': suite.is_parallel(),
'show_reproduce_content': suite.show_reproduce_content(),
}
return res
- and run hard load processes in 100 loop (enough in time to get the issue):
( export PATH=$PATH:~/tnt/src ; export REPLICATION_SYNC_TIMEOUT=500 ; export TEST_TIMEOUT=510 ; export NO_OUTPUT_TIMEOUT=520 ; l=0 ; while ./test-run.py -j400 `for r in {1..100} ; do echo sql-tap/index4.test.lua ; done` `for s in {1..30} ; do echo replication/election_qsync_stress.test.lua ; done` `for r in {1..4} ; do echo vinyl/ ; done` 2>&1 ; do l=$(($l+1)) ; echo ======== $l ============= ; if [[ $l -gt 100 ]]; then break ; fi ; done ) > a.log &
# for Linux host dev1 use debian-stretch image or similar with command
rm -rf atop_all.log ; ( atop -a -R -w atop_all.log 10000 & echo $! >atop.pid & ) ; ( export PATH=$PATH:/tnt/src ; export REPLICATION_SYNC_TIMEOUT=100 ; export TEST_TIMEOUT=110 ; export NO_OUTPUT_TIMEOUT=120 ; date ; time ./test-run.py -j 200 --builddir /tnt --vardir var_hdd_vinyl `for r in {1..200} ; do echo vinyl/gh-5141-invalid-vylog-file ; done` --force 2>&1 ; sleep 1 ; kill -USR2 `cat atop.pid` ; date ) > test_atop.log &
- wait a few minutes to give the hard processes heavy load the host, check with:
iostat -c 1000
disk0 cpu load average
KB/t tps MB/s us sy id 1m 5m 15m
18.11 26 0.46 36 43 20 421.03 199.69 137.88
134.08 602 78.77 60 40 0 421.03 199.69 137.88
142.56 528 73.48 58 42 0 421.03 199.69 137.88
- enter tarantool repository, disable 'fragile' option in file 'sql-tap/suite.ini'
- check the test fails, run (test needs near to 350 secs):
(export PATH=$PATH:~/tarantool/src ; export REPLICATION_SYNC_TIMEOUT=100 ; export TEST_TIMEOUT=110 ; export NO_OUTPUT_TIMEOUT=120 ; time ./test-run.py sql-tap/index4.test.lua --conf vinyl ) > a.log 2>&1 &
- check the test passes, run (test needs near to 350 secs):
(export PATH=$PATH:~/tarantool/src ; export REPLICATION_SYNC_TIMEOUT=500 ; export TEST_TIMEOUT=510 ; export NO_OUTPUT_TIMEOUT=520 ; time ./test-run.py sql-tap/index4.test.lua --conf vinyl ) > a.log 2>&1 &
Checked test on timings with different configurations with hard load in parallel:
# of runs * conf | Real | User | Sys |
---|---|---|---|
1*memtx/vinyl | 5.16 real | 9.35 user | 0.92 sys |
10*memtx/vinyl | 17.52 real | 52.15 user | 4.44 sys |
50*memtx | 26.42 real | 79.33 user | 6.57 sys |
50*vinyl | 215.60 real | 201.73 user | 17.52 sys |
50*memtx/vinyl | 366.82 real | 278.42 user | 24.38 sys |
100*memtx | 242.83 real | 170.48 user | 14.80 sys |
100*vinyl | 690.04 real | 408.50 user | 35.93 sys |
100*memtx/vinyl | 730.13 real | 557.03 user | 48.60 sys |
Also check more inserts and found that test always passes:
diff --git a/test/sql-tap/index4.test.lua b/test/sql-tap/index4.test.lua
index 4ea713a58..c012646d7 100755
--- a/test/sql-tap/index4.test.lua
+++ b/test/sql-tap/index4.test.lua
@@ -41,6 +41,9 @@ test:do_execsql_test(
INSERT INTO t1 SELECT randomblob(102) FROM t1; -- 16384
INSERT INTO t1 SELECT randomblob(102) FROM t1; -- 32768
INSERT INTO t1 SELECT randomblob(102) FROM t1; -- 65536
+ INSERT INTO t1 SELECT randomblob(102) FROM t1; -- 131072
+ INSERT INTO t1 SELECT randomblob(102) FROM t1; -- 262144
+ INSERT INTO t1 SELECT randomblob(102) FROM t1; -- 524288
COMMIT;
]])
Checked w/o hard load, but with multi runs:
100*vinyl 931.07 real 3237.52 user 204.97 sys
Checked with hard load, but with single run:
1*vinyl 8m55.057s real 0m30.794s user 0m2.775s sys
Checked with hard load and with multi runs
100*vinyl 39m0.270s real 54m50.115s user 3m49.988s sys
Seems that the issue can be closed, due to no real exists except timeout must be tuned.
Metadata
Metadata
Assignees
Labels
No labels