This is not really an answer as there is not enough context to give the exact cause, but it is a description of how I managed to track this when it happened to me.
I noticed my jbd2/md0-8
kept showing up at the top of iotop
. I looked in /sys/kernel/debug/tracing/events/jbd2
to see what options there are to determine what jbd2
was doing.
NOTE-1: To see output for debug tracing events cat /sys/kernel/debug/tracing/trace_pipe
- I had this running in terminal while enabling/disabling traces.
NOTE-2: To enable events for tracing use e.g. echo 1 > /sys/kernel/debug/tracing/events/jbd2/jbd2_run_stats/enable
. To disable echo 0 > /sys/kernel/debug/tracing/events/jbd2/jbd2_run_stats/enable
.
I started by enabling /sys/kernel/debug/tracing/events/jbd2/jbd2_run_stats/enable
- but there was nothing that seemed particularly interesting in the output for it. I tried a few other events to trace and when I enabled /sys/kernel/debug/tracing/events/jbd2/jbd2_commit_flushing/enable
I saw it was occurring every second:
# cat /sys/kernel/debug/tracing/trace_pipe
...
jbd2/md0-8-2520 [004] .... 658660.216492: jbd2_commit_flushing: dev 9,0 transaction 32856413 sync 0
jbd2/md0-8-2520 [001] .... 658661.334900: jbd2_commit_flushing: dev 9,0 transaction 32856414 sync 0
jbd2/md0-8-2520 [001] .... 658661.394113: jbd2_commit_flushing: dev 9,0 transaction 32856415 sync 0
This looked like it was related to sync(2)
/fsync(2)
/msync(2)
, so I looked for some way to link this to a process and found this:
# find /sys/kernel/debug/tracing/events/ | grep sync.*enable
...
/sys/kernel/debug/tracing/events/ext4/ext4_sync_file_enter/enable
...
When I enabled it I saw the following output:
# cat /sys/kernel/debug/tracing/trace_pipe
...
nzbget-17367 [002] .... 658693.222288: ext4_sync_file_enter: dev 9,0 ino 301924373 parent 301924357 datasync 1
jbd2/md0-8-2520 [001] .... 658693.284080: jbd2_commit_flushing: dev 9,0 transaction 32856465 sync 0
nzbget-17367 [000] .... 658693.334267: ext4_sync_file_enter: dev 9,0 ino 301924357 parent 301924353 datasync 1
jbd2/md0-8-2520 [002] .... 658693.334275: jbd2_commit_flushing: dev 9,0 transaction 32856466 sync 0
nzbget-17367 [001] .... 658694.369514: ext4_sync_file_enter: dev 9,0 ino 301924367 parent 301924357 datasync 1
jbd2/md0-8-2520 [002] .... 658694.414861: jbd2_commit_flushing: dev 9,0 transaction 32856467 sync 0
nzbget-17367 [001] .... 658694.470872: ext4_sync_file_enter: dev 9,0 ino 301924357 parent 301924353 datasync 1
jbd2/md0-8-2520 [002] .... 658694.470880: jbd2_commit_flushing: dev 9,0 transaction 32856468 sync 0
This gave me the process name/id - and after doing some more debugging of this process (nzbget
) I discovered it was doing fsync(2)
every second. After I changed its config (FlushQueue=no
, undocumented I think, found it in source) to stop it from doing this per second fsync(2)
the problem went away.
My kernel version is 4.4.6-gentoo
.I think there were some options I enabled (either manually or with make oldconfig
) at some point in kernel config to get /sys/kernel/debug
with these events - so if you don't have it maybe just look around the internet for more information on enabling it.