
My server crashed several times with "rcu_sched self-detected stall on CPU" errors. Now, after a restart, I cannot repair the MySQL tables, and some services start while others do not.

What command do I have to issue to check and repair the filesystem? I have never used fsck. Should I restart the server in rescue mode, or does it not matter?

Filesystem      1K-blocks     Used  Available Use% Mounted on
rootfs           20026172  8145824   10840020  43% /
/dev/root        20026172  8145824   10840020  43% /
devtmpfs         16419704        0   16419704   0% /dev
tmpfs             3284032      268    3283764   1% /run
tmpfs                5120        0       5120   0% /run/lock
tmpfs             6777360        0    6777360   0% /dev/shm
/dev/md3       1902052420 19534092 1785876644   2% /home
/dev/loop0        3997376     8192    3763088   1% /tmp
user1406271
  • I suggest providing more details on the problem; depending on the reasons for "some services start, some not", the answer may change. I don't think fsck will help with a service not starting. – Daniel Mar 05 '15 at 21:57

2 Answers


shutdown has the -F option to force fsck on reboot.

# shutdown -rF now

It creates the file /forcefsck and restarts.

You can also force fsck on the next reboot by creating the file /forcefsck manually.

# touch /forcefsck
# reboot

This is useful if errors in the filesystem prevent you from using the shutdown command.
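
Before forcing a repair, it can help to see what fsck would report without changing anything. The following is a minimal sketch and an illustration only: it assumes the data array is /dev/md3 (taken from the df output in the question) and that fsck passes -n through to the filesystem-specific checker, which for ext filesystems means "open read-only and answer no to every prompt". Because checking a mounted filesystem gives unreliable results, the hypothetical helper check_readonly refuses to run on a device listed in the mounts table. That is also why the root filesystem is usually checked at boot, as described above, or from rescue mode. The names is_mounted and check_readonly are made up for this example.

```shell
#!/bin/sh
# Sketch only: is_mounted and check_readonly are hypothetical helper names.

# Return 0 if device $1 appears in the mounts table $2 (default: /proc/mounts).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Run a read-only check on device $1, but only if it is not mounted.
# $2 optionally overrides the mounts table (useful for trying the logic out).
check_readonly() {
    if is_mounted "$1" "${2:-/proc/mounts}"; then
        echo "skip: $1 is mounted; check it at boot or from rescue mode"
        return 1
    fi
    # -n: report problems but do not attempt any repair (needs root).
    fsck -n "$1"
}

# Usage (as root), assuming /dev/md3 has been unmounted first:
#   umount /home && check_readonly /dev/md3
```

While /home is still mounted from /dev/md3, `check_readonly /dev/md3` should print the skip message and return without calling fsck.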

Esa Jokinen

An RCU scheduler warning or info message is not, by itself, a problem that would cause a reboot. Double-check your log files to see whether those messages are WARN/INFO or ERR/PANIC.

Per the documentation at https://www.kernel.org/doc/Documentation/RCU/stallwarn.txt, these should be info messages.
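
One way to check this is to count the two kinds of message separately. The sketch below assumes the kernel log is available as a plain text file (the path /var/log/kern.log in the usage comment is an assumption and varies by distribution). It uses a hypothetical helper, triage_kernel_log, and the patterns are rough examples rather than a complete list of fatal kernel messages.

```shell
#!/bin/sh
# Sketch only: triage_kernel_log is a hypothetical helper name.

# Print how many lines in log file $1 look like RCU stall notices and how
# many look like fatal kernel events (BUG, Oops, panic).
triage_kernel_log() {
    log="$1"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    stalls=$(grep -c 'self-detected stall' "$log" || true)
    fatal=$(grep -c -E 'BUG:|Oops|Kernel panic' "$log" || true)
    echo "rcu_stalls=$stalls fatal=$fatal"
}

# Usage (the log path is an assumption):
#   triage_kernel_log /var/log/kern.log
```

A non-zero fatal count alongside the stall notices would suggest the stalls are a symptom of something more serious rather than the cause of the crashes.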

If you are manually rebooting the machine because of these messages, you can stop.

If you rebooted because of something else, that should be documented in the question.

If the system panicked and halted or locked up because of a kernel issue, that should also be documented in the question.

The problem with a service not starting is most likely not related to the RCU scheduler. It is also probably not related to the filesystem -- as Esa Jokinen mentioned, force a consistency check on the next reboot. However, I suspect you will still see a problem with services not starting -- check the error log for those specific services to determine why they aren't starting.

Daniel
  • Daniel, it's not only these messages. When the messages appear, the server freezes and I have to reboot it from the datacenter control panel. There are also crashed MySQL tables that I can't repair, messages like "kernel: BUG: unable to handle kernel paging request at ffff800788fd1c30", "Rebuild21 event detected on md device /dev/md3", etc. – user1406271 Mar 05 '15 at 23:29
  • Okay, did you rebuild the SQL tables? If not, why did you accept the fsck answer? If not, it sounds to me like a hardware problem, maybe bad memory? – Daniel Mar 06 '15 at 00:57
  • Daniel, I accepted the answer because I asked how to use fsck and Esa Jokinen answered that question. I asked because I was not able to repair even one MySQL table, but then I managed it somehow. The server started to crash more and more often. But thank you as well for your answer. 20 hours ago I opened a ticket at the OVH data center. Still no answer. Some people say it's a kernel issue, some say bad hardware. The server has been online for more than 5 months and I never had any problems with it. It's probably a hardware problem. Php5-php also started to crash with segfault errors. :( – user1406271 Mar 06 '15 at 03:50