[2.4] TXG timestamp DB sync if idle causes unnecessary disk access/prevent spin down #18082

@PapaNappa

System information

Type                  Version/Name
Distribution Name     Arch Linux
Distribution Version  latest
Kernel Version        6.12.63
Architecture          x64
OpenZFS Version       2.4.0

Describe the problem you're observing

Even when the pool/dataset is idle, my hard disks are not spinning down.
They should spin down after being idle for 20 minutes.

Alternatively, when I set hd-idle timeout to <10 minutes, they spin down, but will spin up exactly every 10 minutes.

This happened after upgrading to ZFS 2.4, even before upgrading the pool to the latest ZFS features.

I can make the following additional observations:

  • Checking with fatrace, I do not see any access to files on the pool.
  • Disk spin-down still does not happen when all datasets are unmounted.
  • Exporting the pool makes the HDDs spin down, and they no longer spin up.
  • Importing the pool read-only also makes the disks spin down.

When the pool and all datasets are mounted normally (rw) and I set spa_flush_txg_time to 1 week, the disks also spin down as they should.

Thus, it very much seems to be related to spa_sync_time_logger and the features implemented in #16853.

It seems like spa_sync_time_logger is called as part of spa_sync in the txg_sync_thread every 5s.

It is probably not as simple as a missing else return; after

zfs/module/zfs/spa.c

Lines 2158 to 2165 in 962e688

	if (txg > spa->spa_last_noted_txg) {
		spa->spa_last_noted_txg_time = curtime;
		spa->spa_last_noted_txg = txg;
		mutex_enter(&spa->spa_txg_log_time_lock);
		dbrrd_add(&spa->spa_txg_log_time, curtime, txg);
		mutex_exit(&spa->spa_txg_log_time_lock);
	}

(i.e. if there is no new TXG), but there are not many places where spa_flush_txg_time is used.
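To illustrate the suspected mechanism, here is a toy model in C. It is not the OpenZFS code: the struct, the function names, and the `skip_clean_flush` switch are hypothetical, and it assumes that the periodic flush is not gated on whether a new TXG was recorded, which is what this report suspects but has not confirmed. The 5 s tick and the 600 s flush interval are taken from the observations above (sync thread every 5s, spin-up every 10 minutes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not OpenZFS structures. */
typedef struct toy_logger {
	uint64_t last_noted_txg;	/* newest TXG recorded in memory */
	uint64_t last_flush_time;	/* time (s) of the last flush check */
	bool dirty;			/* log changed since the last flush */
	unsigned flushes;		/* simulated writes that reach the disk */
} toy_logger_t;

/* One pass of the (modelled) sync thread at time `now`, in seconds. */
static void
toy_logger_tick(toy_logger_t *tl, uint64_t now, uint64_t txg,
    uint64_t flush_interval, bool skip_clean_flush)
{
	/* Same shape as the quoted check: only a newer TXG is recorded. */
	if (txg > tl->last_noted_txg) {
		tl->last_noted_txg = txg;
		tl->dirty = true;
	}

	/* Flush interval not yet elapsed: nothing to do. */
	if (now - tl->last_flush_time < flush_interval)
		return;
	tl->last_flush_time = now;

	/* Hypothetical guard: skip the write if nothing was recorded. */
	if (skip_clean_flush && !tl->dirty)
		return;

	tl->flushes++;
	tl->dirty = false;
}

/* Count disk writes on an idle pool, i.e. the TXG never advances. */
static unsigned
toy_idle_flushes(uint64_t duration_s, uint64_t tick_s,
    uint64_t flush_interval, bool skip_clean_flush)
{
	assert(tick_s > 0);
	toy_logger_t tl = { .last_noted_txg = 1 };

	for (uint64_t now = tick_s; now <= duration_s; now += tick_s)
		toy_logger_tick(&tl, now, 1, flush_interval, skip_clean_flush);
	return (tl.flushes);
}
```

In this model, `toy_idle_flushes(3600, 5, 600, false)` counts one write every 10 minutes over an idle hour, while either enabling the hypothetical guard or raising the interval to one week (604800 s) yields no writes in the same hour. That matches the behaviour described above, but it only shows that the hypothesis is consistent with the symptoms, not where the write actually originates.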

Preventing disks from spinning down is not an issue in a professional setting. This is a home NAS use case.
However, this issue also indicates potentially unnecessary disk access and might thus still be worth fixing.

Describe how to reproduce the problem

  • Configure HDDs to spin down after a certain amount of inactivity.
  • Import a pool on the HDDs, but unmount all datasets to ensure there is no other activity on the disks
  • Notice that the disks are not spinning down.
    • Alternatively, if HDD idle-timeout is < 10 minutes, disks will spin up and down every 10 minutes.
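Since fatrace only shows file-level access, one way to confirm block-level activity during the idle window is to compare the per-device I/O counters that Linux exposes in /sys/block/<dev>/stat before and after waiting. The following is a minimal sketch in C, assuming the documented layout of that file (field 1 is completed read I/Os, field 5 is completed write I/Os). The helper names are hypothetical, and the device name passed to `read_blk_stat` has to be replaced with the actual disk.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Completed read and write requests of one block device. */
typedef struct blk_io {
	uint64_t reads;
	uint64_t writes;
} blk_io_t;

/*
 * Parse one line in the /sys/block/<dev>/stat layout.
 * Field 1 is read I/Os and field 5 is write I/Os; the three fields
 * in between (read merges, read sectors, read ticks) are skipped.
 */
static bool
parse_blk_stat(const char *line, blk_io_t *out)
{
	blk_io_t tmp;
	int n = sscanf(line, "%" SCNu64 " %*s %*s %*s %" SCNu64,
	    &tmp.reads, &tmp.writes);

	if (n != 2)
		return (false);
	*out = tmp;
	return (true);
}

/* Read the counters for a device name such as "sda" (example name). */
static bool
read_blk_stat(const char *dev, blk_io_t *out)
{
	char path[128], line[512];
	bool ok;

	assert(dev != NULL);
	snprintf(path, sizeof (path), "/sys/block/%s/stat", dev);
	FILE *fp = fopen(path, "r");
	if (fp == NULL)
		return (false);
	ok = fgets(line, sizeof (line), fp) != NULL &&
	    parse_blk_stat(line, out);
	fclose(fp);
	return (ok);
}

/* True if any read or write completed between the two samples. */
static bool
blk_io_changed(const blk_io_t *before, const blk_io_t *after)
{
	return (before->reads != after->reads ||
	    before->writes != after->writes);
}
```

Taking one sample with `read_blk_stat`, waiting longer than 10 minutes with all datasets unmounted, and taking a second sample should make `blk_io_changed` return true if the pool is still issuing I/O while idle.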

Include any warning/errors/backtraces from the system logs

I do not see any errors or other log messages related to ZFS or this problem.
