Closed
Description
$ pg_probackup -V
pg_probackup 2.3.5 (Postgres Pro 9.6.15.2 enterprise)
Выполняется бэкап с помощью команды
pg_probackup backup -B /pgdump/backup --instance dummy -U backup -b FULL -j 30 --stream --delete-expired --delete-wal --log-level-console=VERBOSE
$ pg_probackup show -B /pgdump/backup/ | grep QICCT5
dummy 9.6 QICCT5 ---- FULL STREAM 145/0 1d:11h 11TB 0 2.17 1AC80/3804A610 0/0 RUNNING
$ pg_probackup show -B /pgdump/backup/ -i QICCT5 --instance dummy
#Configuration
backup-mode = FULL
stream = true
compress-alg = zlib
compress-level = 1
from-replica = true
#Compatibility
block-size = 8192
xlog-block-size = 8192
checksum-version = 1
program-version = 2.3.5
server-version = 9.6
#Result backup info
timelineid = 145
start-lsn = 1AC80/3804A610
stop-lsn = 0/0
start-time = '2020-10-17 13:22:17+03'
recovery-xid = 0
data-bytes = 11837113027057
wal-bytes = 0
uncompressed-bytes = 25733291653457
pgdata-bytes = 25733402122577
status = RUNNING
primary_conninfo = 'user=backup host=127.0.0.1 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres hostorder=sequential'
content-crc = 4047173452
Последние строки лога бэкапа
INFO: Data files are transferred, time elapsed: 18h:28m
VERBOSE: (query) SET client_min_messages = warning;
VERBOSE: (query) SET datestyle = 'ISO, DMY';
VERBOSE: (query) SELECT pg_catalog.txid_snapshot_xmax(pg_catalog.txid_current_snapshot()), current_timestamp(0)::timestamptz, pg_catalog.pg_last_xlog_replay_location(), labelfile, spcmapfile FROM pg_catalog.pg_stop_backup(false)
VERBOSE: finished segment at 1AE6C/E0000000 (timeline 145)
INFO: wait for pg_stop_backup()
INFO: pg_stop backup() successfully executed
LOG: stop_lsn: 1AE6C/DF2F1680
LOG: Looking for LSN 1AE6C/DF2F1680 in segment: 000000910001AE6C000000DF
LOG: Found WAL segment: /pgdump/backup/backups/dummy/QICCT5/database/pg_xlog/000000910001AE6C000000DF
VERBOSE: Thread [0]: Need to switch to the next WAL segment, page LSN 1AE6C/DF000000, record being read LSN 1AE6C/DF2F1680
LOG: Thread [0]: Opening WAL segment "/pgdump/backup/backups/dummy/QICCT5/database/pg_xlog/000000910001AE6C000000DF"
LOG: Found LSN: 1AE6C/DF2F1680
LOG: finished streaming WAL at 1AE6C/E0A88000 (timeline 145)
LOG: Getting the Recovery Time from WAL
VERBOSE: Thread [0]: Need to switch to the next WAL segment, page LSN 1AE6C/DF000000, record being read LSN 1AE6C/DF2F1680
LOG: Thread [0]: Opening WAL segment "/pgdump/backup/backups/dummy/QICCT5/database/pg_xlog/000000910001AE6C000000DF"
LOG: Current minRecPoint 1AE6C/DE41F668
LOG: Setting minRecPoint to 1AE6C/DF2F1680
В лог больше ничего не добавляется.
Через отладчик видно, что процесс ещё активен и выполняется сверка контрольных сумм WAL в каталоге pg_xlog бэкапа
# gdb -p 43100 --ex bt --batch
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007fb199f59b1d in read () at ../sysdeps/unix/syscall-template.S:81
81 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
#0 0x00007fb199f59b1d in read () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007fb199ef0499 in __GI__IO_file_xsgetn (fp=0x3f6b240, data=<optimized out>, n=65536) at fileops.c:1433
#2 0x00007fb199ee63d5 in __GI__IO_fread (buf=buf@entry=0xebd85e0, size=size@entry=1, count=count@entry=65536, fp=fp@entry=0x3f6b240) at iofread.c:47
#3 0x0000000000421f8a in fread (__stream=0x3f6b240, __n=65536, __size=1, __ptr=0xebd85e0) at /usr/include/bits/stdio2.h:295
#4 pgFileGetCRC (file_path=file_path@entry=0x7ffd52dcf300 "/pgdump/backup/backups/dummy/QICCT5/database/pg_xlog/000000910001ADAE00000036", use_crc32c=use_crc32c@entry=1 '\001', missing_ok=<optimized out>) at src/dir.c:300
#5 0x00000000004151fe in do_backup_instance (backup_conn=backup_conn@entry=0x67f460, nodeInfo=nodeInfo@entry=0x7ffd52dcf7b0, no_sync=no_sync@entry=0 '\000') at src/backup.c:639
#6 0x0000000000415a73 in do_backup (start_time=start_time@entry=1602930137, no_validate=<optimized out>, set_backup_params=0x0, no_sync=<optimized out>) at src/backup.c:911
#7 0x0000000000404e67 in main (argc=16, argv=0x7ffd52dcfe88) at src/pg_probackup.c:814
# gdb -p 43100 --ex bt --batch
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007fb199f59b1d in read () at ../sysdeps/unix/syscall-template.S:81
81 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
#0 0x00007fb199f59b1d in read () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007fb199ef0499 in __GI__IO_file_xsgetn (fp=0x3f6b240, data=<optimized out>, n=65536) at fileops.c:1433
#2 0x00007fb199ee63d5 in __GI__IO_fread (buf=buf@entry=0xebd85e0, size=size@entry=1, count=count@entry=65536, fp=fp@entry=0x3f6b240) at iofread.c:47
#3 0x0000000000421f8a in fread (__stream=0x3f6b240, __n=65536, __size=1, __ptr=0xebd85e0) at /usr/include/bits/stdio2.h:295
#4 pgFileGetCRC (file_path=file_path@entry=0x7ffd52dcf300 "/pgdump/backup/backups/dummy/QICCT5/database/pg_xlog/000000910001ADB500000081", use_crc32c=use_crc32c@entry=1 '\001', missing_ok=<optimized out>) at src/dir.c:300
#5 0x00000000004151fe in do_backup_instance (backup_conn=backup_conn@entry=0x67f460, nodeInfo=nodeInfo@entry=0x7ffd52dcf7b0, no_sync=no_sync@entry=0 '\000') at src/backup.c:639
#6 0x0000000000415a73 in do_backup (start_time=start_time@entry=1602930137, no_validate=<optimized out>, set_backup_params=0x0, no_sync=<optimized out>) at src/backup.c:911
#7 0x0000000000404e67 in main (argc=16, argv=0x7ffd52dcfe88) at src/pg_probackup.c:814
Как я понимаю до завершения остаётся
000000910001AE6C000000DF(stopLSN) - 000000910001ADB500000081 =~ 45k
Т.е. это очень долго, при этом логирование VERBOSE вообще не даёт понимания, что сейчас происходит И когда начался этот шаг.