Today one of my servers started sending me alerts about non-optimal RAID state. These were triggered by a very simple script run by cron -- if it detects that not all of the disks report 'Optimal' state, it sends an alert.
Now, the issue is that the RAID seems to be fine but the megacli -LDInfo -Lall -aALL command invoked by the script fails repeatedly leaving a cryptic error message in syslog:
megacli: Failed to alloc kernel SGL buffer for IOCTL
The curious thing is that the command does work sometimes and does return output, but most of the time it just returns two blank lines and the exit code:
# megacli -LDInfo -Lall -aALL
Exit Code: 0x00
The same goes for megacli with other parameters, such as megacli -AdpAllInfo -aAll. Every time the command fails, the same error appears in syslog.
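A minimal sketch of how a check like this could tell "megacli returned nothing" apart from a genuinely non-optimal array. The function name classify_raid_state and the three result labels are made up for illustration, and the parsing assumes that each logical drive is reported on a line of the form "State : Optimal"; that format may differ between megacli releases.

```shell
#!/bin/sh
# Classify `megacli -LDInfo -Lall -aALL` output read from stdin.
# Assumption: every logical drive appears on a line like "State : Optimal".
# Prints one of: OPTIMAL, DEGRADED, NO_DATA (hypothetical labels).
classify_raid_state() {
    awk '
        # Count every logical-drive state line and how many are Optimal.
        /^[[:space:]]*State[[:space:]]*:/ {
            total++
            if ($NF == "Optimal") optimal++
        }
        END {
            if (total == 0)            print "NO_DATA"   # empty or failed run
            else if (optimal == total) print "OPTIMAL"
            else                       print "DEGRADED"
        }
    '
}

# Intended use from cron (not executed here):
#   megacli -LDInfo -Lall -aALL | classify_raid_state
```

With a split like this, the NO_DATA case could raise a different alert ("could not query the controller") instead of being reported as a non-optimal RAID state.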
This has never happened before, as far as I can remember. No changes were made to the server recently. The adapter is a PERC 6/i Integrated and the server runs Debian Wheezy.
What could possibly be the issue and where do I start resolving this?
EDIT:
# megacli -v
MegaCLI SAS RAID Management Tool Ver 5.00.12 May 08, 2009
(c)Copyright 2009, LSI Corporation, All Rights Reserved.
Exit Code: 0x00
At least this command works every time without triggering the error ;) I've just realised this is an old release of megacli. Still, it shouldn't matter, since the very same setup had been working for a couple dozen months with no problems and has now suddenly decided to go wild.