
I ran into a problem while running an FIO stress test on a RAID0 array built from 6 SSDs with mdadm under Yocto. Here are the details:

  1. All 6 PCIe NVMe SSDs are the same vendor and model: 1.02 TB, automotive grade.
  2. FIO command used for the test: `fio --filename=/dev/md127 --direct=1 --rw=randrw --bs=64k --ioengine=libaio --iodepth=64 --runtime=43200 --numjobs=16 --time_based --group_reporting --name=randomrw --eta-newline=1`
  3. The system restarts on its own after 30 minutes of running.

I'd like to know why the system restarts on its own, seemingly at random. Is it a software issue, a software limitation, or a hardware issue? How would you suggest isolating it?

I'm going to delete the RAID0 array and try again with the same FIO parameters on a single SSD first. If the issue cannot be reproduced that way, I will run the test again on all 6 SSDs with the same parameters but without RAID.
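The single-SSD step of that plan could be scripted roughly as below. This is only a sketch: the device names in `DEVICES` are placeholders (check the real ones with `lsblk`), `build_fio_cmd` and `RUN` are names made up for this example, and the fio options are copied unchanged from the command above.

```shell
#!/bin/sh
# Sketch: run the same fio workload against one raw device at a time.
# ASSUMPTION: these device names are placeholders, not taken from the real system.
DEVICES="/dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1"

# Print the fio command line from the question for the given target.
build_fio_cmd() {
    printf 'fio --filename=%s --direct=1 --rw=randrw --bs=64k --ioengine=libaio --iodepth=64 --runtime=43200 --numjobs=16 --time_based --group_reporting --name=randomrw --eta-newline=1\n' "$1"
}

# Dry run by default: only print the commands. Set RUN=1 to execute them.
# WARNING: a randrw test on a raw device destroys the data on it.
for dev in $DEVICES; do
    cmd=$(build_fio_cmd "$dev")
    echo "$cmd"
    if [ "${RUN:-0}" = "1" ]; then
        # Unquoted on purpose so the string splits into command and arguments.
        $cmd || echo "fio failed on $dev" >&2
    fi
done
```

For the second step (all 6 SSDs without RAID), fio accepts several targets in one `--filename` value separated by `:`, so the same function can be called with a colon-joined device list.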

Thanks, Jacky

Jacky Lee

2 Answers


We found that:

  1. Both RAID0 and non-RAID modes fail with the same FIO parameters (only the `--filename` target differs).
  2. Once the issue occurs, re-running the test with the same FIO parameters hits it again immediately. Formatting the SSD lets the test start again, but it fails once more after about 30 minutes.
  3. We did not encounter the issue when the `--size` parameter was given.
  4. When the issue occurs, the SSD shows an over-current condition (acceptable: under 2 A; over-current: 5.5 A).
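To see what the drive reports in the minutes before the over-current event, one option is to log SMART data periodically while fio runs. The sketch below assumes nvme-cli is installed. `log_smart`, `monitor_smart`, the controller name and the log path are all made up for this example. Note that `nvme smart-log` reports temperature and warning fields, not current draw, so this only complements an external current measurement.

```shell
#!/bin/sh
# Sketch: periodic SMART snapshots during the stress test.
# ASSUMPTION: nvme-cli is available; device and log path are placeholders.

# Append one timestamped SMART snapshot to a log file, then flush it to
# disk so the record survives an abrupt restart.
log_smart() {
    dev=$1
    log=$2
    {
        date '+%Y-%m-%d %H:%M:%S'
        nvme smart-log "$dev"
    } >> "$log" 2>&1
    sync
}

# Take a snapshot every <interval> seconds (default 5) until interrupted.
monitor_smart() {
    dev=$1
    log=$2
    interval=${3:-5}
    while :; do
        log_smart "$dev" "$log"
        sleep "$interval"
    done
}
```

A possible use is `monitor_smart /dev/nvme0 /var/log/nvme0-smart.log 5 &` started just before fio, with the log file kept on a disk that is not under test.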
Jacky Lee
  • This doesn't constitute an answer to the question. Better [edit](https://serverfault.com/posts/1106911/edit) the question and put this information in it, and delete this "answer". // Regarding the issue: set up `netconsole` to send the kernel log in real time over the network, and/or set up a serial console to send console output to a nearby machine over a serial link. This way you'll be able to save the latest `dmesg` messages, including ones that appear immediately before the reboot. Also, check whether automatic reboot on panic is enabled (and disable it), and check the system for power issues. – Nikita Kipriyanov Aug 03 '22 at 07:30
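A rough sketch of the `netconsole` and reboot-on-panic suggestions follows. Every address, the interface name and both ports are hypothetical placeholders, `netconsole_param` and `setup_crash_capture` are names made up for this example, and it assumes the Yocto kernel provides netconsole as a loadable module.

```shell
#!/bin/sh
# Sketch only. ASSUMPTION: all values below are placeholders for illustration.
SRC_PORT=6665                 # UDP source port on the machine under test
SRC_IP=192.168.1.10           # IP of the machine under test
SRC_DEV=eth0                  # its network interface
DST_PORT=6666                 # UDP port the log receiver listens on
DST_IP=192.168.1.20           # IP of the log receiver
DST_MAC=00:11:22:33:44:55     # MAC address of the log receiver

# Build the netconsole parameter in the kernel's documented form:
#   <src-port>@<src-ip>/<dev>,<tgt-port>@<tgt-ip>/<tgt-mac>
netconsole_param() {
    printf '%s@%s/%s,%s@%s/%s\n' \
        "$SRC_PORT" "$SRC_IP" "$SRC_DEV" "$DST_PORT" "$DST_IP" "$DST_MAC"
}

# Run as root on the machine under test, before starting fio.
setup_crash_capture() {
    # Stream kernel messages over UDP to the receiver.
    modprobe netconsole "netconsole=$(netconsole_param)"
    # Show the current panic timeout; 0 means no automatic reboot after a panic.
    sysctl kernel.panic
    # Disable automatic reboot on panic so the last messages are not lost.
    sysctl -w kernel.panic=0
}
```

The receiving machine needs a UDP listener on `DST_PORT` (netcat or a syslog daemon, for example) that writes what it receives to a file. If the restart comes from a power fault rather than a kernel panic, the log may simply stop without an error, which is itself a useful hint.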

I posted the same issue on the FIO GitHub and got a reply from an FIO developer; see the screenshot below.

GitHub

Jacky Lee