Whenever I try to set up LUKS on my CentOS install on a Supermicro A2SDi-8C-HLN4F motherboard (Intel Atom C3758 CPU), everything seems to work until the process freezes completely, either while creating the filesystem (ext4) or while mounting it (xfs). It doesn’t matter whether I use the --encrypted option in my kickstart config or set it up by hand on a physical drive (/dev/sde), a RAID device (/dev/md127), an LVM physical or logical volume, or even a file attached as a loop device (/dev/loop0). The same problem shows up in both CentOS 7.7 and 8.1.

I read somewhere that memory might be an issue, so I switched my two DIMMs and ran MemTest86+ with 0 errors.

Steps used when trying to create the filesystem on the devices above:

  • cryptsetup --force-password luksFormat <device>
  • cryptsetup luksOpen <device> <testname>
  • mkfs -t ext4 /dev/mapper/<testname>

The mkfs command invariably hangs at this point, and the process (or rather its mkfs.ext4 child) can’t be killed, even with kill -9.
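
A process that ignores kill -9 is usually sitting in uninterruptible sleep (state D in ps). As a minimal sketch, assuming procps-style ps output with the state in the second column, the following filter keeps the header plus every D-state task; filter_dstate is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Keep the header line plus every task whose STAT column (field 2)
# starts with "D", i.e. uninterruptible sleep. Reads ps output on stdin.
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Example call (not executed here):
#   ps -eo pid,stat,wchan:32,comm | filter_dstate
```

The wchan column shows the kernel function each task is sleeping in, which can be compared against the call trace in dmesg.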

After a little while, the following shows up in my dmesg:

[  492.528687]       Not tainted 4.18.0-147.5.1.el8_1.x86_64 #1
[  492.528719] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  492.528764] kworker/u17:0   D    0  4238      2 0x80000080
[  492.528777] Workqueue: kcryptd/253:3 kcryptd_crypt [dm_crypt]
[  492.528778] Call Trace:
[  492.528787]  ? __schedule+0x253/0x830
[  492.528791]  ? mempool_alloc+0x67/0x190
[  492.528793]  schedule+0x28/0x70
[  492.528795]  schedule_timeout+0x26d/0x390
[  492.528803]  ? qat_alg_sgl_to_bufl.isra.11+0x456/0x770 [intel_qat]
[  492.528807]  ? dma_direct_unmap_page+0x7a/0x80
[  492.528809]  wait_for_completion+0x11f/0x190
[  492.528811]  ? wake_up_q+0x70/0x70
[  492.528814]  crypt_convert+0xa13/0xf00 [dm_crypt]
[  492.528818]  ? bio_alloc_bioset+0xdc/0x210
[  492.528820]  ? __switch_to_asm+0x41/0x70
[  492.528822]  ? __switch_to_asm+0x35/0x70
[  492.528825]  kcryptd_crypt+0x2f3/0x3b0 [dm_crypt]
[  492.528828]  process_one_work+0x1a7/0x3b0
[  492.528831]  worker_thread+0x30/0x390
[  492.528833]  ? create_worker+0x1a0/0x1a0
[  492.528835]  kthread+0x112/0x130
[  492.528837]  ? kthread_flush_work_fn+0x10/0x10
[  492.528839]  ret_from_fork+0x35/0x40

Running the command under strace (strace mkfs ...) always stops at exactly the same point in the output:

pwrite64(3, "\3...") = 4096
pwrite64(3, "\3...") = 4096
fsync(3

I don’t know the relevance of the missing closing parenthesis on the last line, but it always stops at this exact place.

How would I go about identifying more exactly what is going on and where the problem might lie?

MWinther

1 Answer

What solved this for me was poking around in the error message. The intel_qat module turned out to be the culprit, together with its associated module qat_c3xxx. Once I blacklisted both modules, the hang stopped.

blacklist intel_qat
blacklist qat_c3xxx
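
For reference, in modprobe.d syntax the blacklist directive takes only a module name and stops alias-based autoloading, while install <module> /bin/false is a separate directive that makes any load attempt fail. A minimal sketch that writes both forms to a config fragment follows; the function name write_qat_blacklist and the file name blacklist-qat.conf are assumptions chosen for illustration.

```shell
#!/bin/sh
# Write a modprobe.d fragment that keeps the QAT modules from loading.
# $1: destination file; the default name is only a suggestion.
write_qat_blacklist() {
    dest=${1:-/etc/modprobe.d/blacklist-qat.conf}
    cat > "$dest" <<'EOF'
# Stop alias-based autoloading of the QAT drivers
blacklist intel_qat
blacklist qat_c3xxx
# Make explicit or dependency-triggered loads fail as well
install intel_qat /bin/false
install qat_c3xxx /bin/false
EOF
}

# Example call (as root, not executed here):
#   write_qat_blacklist
```

If the modules are also included in the initramfs, it may be necessary to regenerate it (for example with dracut -f on CentOS) before rebooting.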

A little later I was advised to check the BIOS. There is a setting for QAT buried deep in there, and with it disabled in the BIOS, the blacklist is no longer required.
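
One way to check that QAT is really out of the picture, whichever route is used, is to look at which crypto algorithms the kernel has registered from the intel_qat module. The sketch below parses /proc/crypto, relying on its name / driver / module field order; crypto_by_module is a hypothetical helper name.

```shell
#!/bin/sh
# Print "name driver" for every /proc/crypto entry whose module field is $1.
# $2: file to read; defaults to /proc/crypto.
crypto_by_module() {
    awk -v mod="$1" '
        $1 == "name"   { name = $3 }      # e.g. "name : xts(aes)"
        $1 == "driver" { driver = $3 }    # implementation backing that name
        $1 == "module" && $3 == mod { print name, driver }
    ' "${2:-/proc/crypto}"
}

# Example call (not executed here):
#   crypto_by_module intel_qat
```

If this prints nothing and lsmod shows no qat modules, dm-crypt should no longer be routing requests through the QAT driver.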

MWinther
  • I have the same board here, and a _very_ similar issue ... where did you find the setting in the BIOS? – rmalchow May 12 '20 at 13:31