What is SysRq+J "Thaw frozen filesystems" and why does it fail?

5

2

Is it expected to unfreeze processes that are in uninterruptible sleep because of a filesystem issue (NFS, FUSE, bug)?

If I press SysRq+J on my Linux system, it halts the system and prints "Emergency Thaw on sda" in an endless loop, not accepting any other SysRq commands. Only a hard reboot helps then.

Vi.

Posted 2011-11-22T17:38:50.147

Reputation: 13 705

It's probably a kernel bug, because I have the same problem here. After SysRq+J, kern.log started filling with the same message, "kernel: [160192.060834] Emergency Thaw on sda4", endlessly, around 80000 times per second! After 47.5 million records and 4.5 GB of log within 12 minutes, my machine just DIED! – kenorb – 2012-09-19T09:58:41.253

Is there already a bug submitted, or should I file one? – Vi. – 2012-09-19T17:30:16.980

I found only this: https://bugzilla.kernel.org/show_bug.cgi?id=12781, so I've just created a new one here: https://bugzilla.kernel.org/show_bug.cgi?id=47741 – kenorb – 2012-09-20T07:56:33.483

Answers

2

Vi.

Posted 2011-11-22T17:38:50.147

Reputation: 13 705

Still unpatched apparently. – neverMind9 – 2018-04-21T15:35:09.270

1

This is a kernel bug, and it is still present in kernel 3.13 amd64 (from Ubuntu Trusty).

_Vi tested this in his VM by sending the key combination with VBoxManage controlvm <vm_name> keyboardputscancode 1d 38 54 24 a4 d4 b8 9d, with the following results:

3.3.6-pf-vi+  : Reproducible
3.2.0-zen-vi+ : Reproducible
3.0.4-zen-vi+ : Reproducible
2.6.37.5-zen-... : Reproducible
2.6.33-zen2-... : Reproducible
2.6.32-zen1-... : Reproducible
2.6.31-zen11-... : Not reproducible
2.6.30-zen2-... : Not reproducible
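For readers wondering where the scancode string in the VBoxManage command comes from, here is a minimal sketch. It assumes the standard PC scancode set 1, where a key release is the press code with the high bit set. The table and the function name chord_scancodes are made up for illustration and are not part of VirtualBox.

```python
# Set-1 "make" (key press) scancodes for the keys in the chord.
# Assumption: standard PC/AT scancode set 1.
MAKE_CODES = {
    "LCtrl": 0x1D,
    "LAlt": 0x38,
    "SysRq": 0x54,  # code produced by Alt+PrintScreen
    "J": 0x24,
}


def chord_scancodes(keys):
    """Press the keys in order, then release them in reverse order."""
    presses = [MAKE_CODES[k] for k in keys]
    # In set 1 the "break" (release) code is the make code with bit 7 set.
    releases = [code | 0x80 for code in reversed(presses)]
    return " ".join(f"{code:02x}" for code in presses + releases)


# Ctrl+Alt+SysRq+J, the same sequence that was passed to keyboardputscancode.
print(chord_scancodes(["LCtrl", "LAlt", "SysRq", "J"]))
```

The printed string matches the one used in the test above: four presses followed by the four matching releases.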

From Dave Chinner we can read:

The thawing of a filesystem through sysrq-j loops infinitely as it incorrectly detects a thawed filesystem as frozen and tries to unfreeze repeatedly. This is a regression caused by 4504230a71566785a05d3e6b53fa1ee071b864eb ("freeze_bdev: grab active reference to frozen superblocks") in that it no longer returned -EINVAL for superblocks that were not frozen.

Deeper problems arose on further inspection - filesystems frozen with freeze_super() could not be unfrozen by thaw_bdev() so emergency thawing didn't work on anything manually frozen, and deadlocks on sb->s_umount occur as superblocks are iterated in the emergency thaw with it already held for read.

Everywhere we freeze or thaw, we already have a superblock or can get one easily so could call freeze_super() directly. Hence we can kill the bdev level operations and move all the nesting infrastructure up into the superblock level so we have a single consistent interface.

Source: "Re: 2.6.34 echo j > /proc/sysrq-trigger causes inifniteunfreeze/Thaw event", linux-kernel mailing list archive
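The regression described in the first quoted paragraph can be illustrated with a toy model. This is not kernel code: the Superblock class, the two thaw functions and the iteration cap are all hypothetical, and the sketch only assumes what the quote says, namely that the emergency thaw keeps retrying while the thaw call reports success, and that the regressed call stopped returning -EINVAL for a superblock that is not frozen.

```python
EINVAL = 22  # errno value; the kernel returns it negated


class Superblock:
    """Hypothetical stand-in for a superblock with a nested freeze count."""

    def __init__(self, name, freeze_depth=0):
        self.name = name
        self.freeze_depth = freeze_depth


def thaw_correct(sb):
    # Pre-regression behaviour: thawing something that is not frozen fails.
    if sb.freeze_depth == 0:
        return -EINVAL
    sb.freeze_depth -= 1
    return 0


def thaw_regressed(sb):
    # Regressed behaviour per the quote: success even when nothing is frozen.
    if sb.freeze_depth > 0:
        sb.freeze_depth -= 1
    return 0


def emergency_thaw(sb, thaw, max_iterations=1000):
    """Retry thawing while it succeeds; return how many messages were logged.

    max_iterations exists only so this model terminates.
    """
    messages = 0
    while messages < max_iterations and thaw(sb) == 0:
        messages += 1  # stands in for one "Emergency Thaw on sda" log line
    return messages


print(emergency_thaw(Superblock("sda", freeze_depth=2), thaw_correct))
print(emergency_thaw(Superblock("sda"), thaw_regressed))
```

With the correct return value, the loop logs one message per freeze level and stops. With the regressed one, it stops only at the artificial cap, which corresponds to the endless "Emergency Thaw" output reported in the question.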

kenorb

Posted 2011-11-22T17:38:50.147

Reputation: 16 795