My goal is to automate a backup routine on a small OpenSolaris-based NAS (OmniOS + napp-it on an HP Microserver N54L) using removable SATA disks.
Background:
I have installed one of those 5.25" -> 3.5" carrier-less HDD trays that contain a simple SATA or SAS/SATA backplane with one port, a power button and some LEDs (power and HDD activity). To back up to multiple HDDs (one each week in rotation, stored offsite), I have written a script that uses `zfs send/recv` to dump the complete main pool including all snapshots (transferring only new blocks). This script works fine when I start it manually.
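For context, an incremental replication of this kind can be sketched as below. This is not the script referred to above, only a minimal illustration under stated assumptions: the pool names `tank` and `backup` are hypothetical, recursive snapshots of the source pool are assumed to be created elsewhere, and the backup pool is assumed to be imported already.

```shell
#!/bin/sh
SRC_POOL="tank"      # hypothetical name of the main pool
DST_POOL="backup"    # hypothetical name of the pool on the removable disk

# List snapshot names (the part after '@') of a pool's root dataset, oldest first.
list_snapshots() {
    zfs list -H -d 1 -t snapshot -o name -s creation "$1" | sed 's/^[^@]*@//'
}

# Print the newest snapshot name that appears in both lists.
# $1: file with source snapshot names, oldest first, one per line
# $2: file with destination snapshot names, one per line
newest_common_snapshot() {
    common=""
    while IFS= read -r snap; do
        if grep -qxF -- "$snap" "$2"; then
            common=$snap
        fi
    done < "$1"
    printf '%s\n' "$common"
}

# Send everything newer than the last common snapshot to the backup pool.
replicate() {
    src_list=$(mktemp) && dst_list=$(mktemp) || return 1
    list_snapshots "$SRC_POOL" > "$src_list"
    list_snapshots "$DST_POOL" > "$dst_list"
    base=$(newest_common_snapshot "$src_list" "$dst_list")
    newest=$(tail -n 1 "$src_list")
    rm -f "$src_list" "$dst_list"
    [ -n "$newest" ] || return 1
    if [ -z "$base" ]; then
        # No common snapshot yet: full replication stream.
        zfs send -R "$SRC_POOL@$newest" | zfs recv -Fdu "$DST_POOL"
    elif [ "$base" != "$newest" ]; then
        # Incremental stream including all intermediate snapshots.
        zfs send -R -I "$SRC_POOL@$base" "$SRC_POOL@$newest" | zfs recv -Fdu "$DST_POOL"
    fi
}
```

Nothing runs until `replicate` is called. Keeping the snapshot comparison in its own function means it can be checked on plain text files without touching a pool.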
I'd like to automate this further, because the NAS has no VGA or serial console attached and it is tedious to insert the disk, go back to the desktop system, log in to the web interface or via SSH and start the script manually. A timed start via cron is not an option, because the backup days may vary slightly (forgotten disk, holidays, etc.). So the backup should start right after the disk is inserted.
Problem:
In the script I use `cfgadm` to connect + configure and later unconfigure + disconnect the disks. If I only insert the disk and it spins up, I have no way of knowing that the disk is there. Possible solutions I've considered already:
- Probing for a new disk and zpool every x minutes by running `cfgadm -f -c connect` and checking for error results. Not very elegant.
- Checking `/var/adm/messages` every x minutes and grepping for the device path or AHCI. Not possible, because messages are only written if the device is connected manually.
- Using `iostat -En`. It displays the disks, but I have to grep for the exact serial numbers, because it does not list port information. It also needs to be run every x minutes.
- Using `cfgadm` with SELECT syntax to filter for receptacle status. Does not work, because the insertion does not trigger anything (maybe the backplane is too cheap for that).
- Recognizing the power on/off of the enclosure. Would be okay, but I couldn't figure out how to accomplish this.
- Remapping the power button or adding another button to the machine. Could work, but I also don't know how to do this.
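To make the polling variants concrete, here is a minimal sketch that reads the receptacle and occupant state of one attachment point from `cfgadm -al` output. The port name `sata0/5` and the 60-second interval are assumptions for illustration. The loop only helps if the receptacle state actually changes on insertion, which on my hardware it did not do by itself.

```shell
#!/bin/sh
PORT="sata0/5"     # assumed attachment point of the tray
INTERVAL=60        # assumed polling interval in seconds

# Read `cfgadm -al` style output on stdin and print "receptacle occupant"
# for attachment point $1. Matches both "sata0/5" and "sata0/5::dsk/c1t5d0".
# Exits non-zero if the attachment point is not listed.
port_state() {
    awk -v ap="$1" '
        { split($1, id, "::") }
        id[1] == ap { print $3, $4; found = 1; exit }
        END { if (!found) exit 1 }
    '
}

# Block until the port reports a connected receptacle, then return.
wait_for_disk() {
    while :; do
        state=$(cfgadm -al 2>/dev/null | port_state "$PORT") || state=""
        case "$state" in
            connected*) return 0 ;;
        esac
        sleep "$INTERVAL"
    done
}
```

A caller would run `wait_for_disk` and start the backup script once it returns. The parsing is kept separate from the loop so it can be tried against saved `cfgadm` output.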
I think I would need two things:
- a reliable way to identify disk and port status in combination (so only the correct disk in the correct slot is detected)
- a way to register this detection and trigger an event (start shell script)
Is this possible? If not, what would you suggest as alternatives?
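For the first requirement, a minimal sketch of a combined slot and serial check, assuming the disk is already configured so that `cfgadm -al` lists it as `<port>::dsk/<disk>`, and that `iostat -En` prints the serial after `Serial No:`. The port name and the serial numbers are placeholders, not values from my system.

```shell
#!/bin/sh
PORT="sata0/5"                                    # assumed attachment point of the tray
EXPECTED_SERIALS="WD-EXAMPLE0001 WD-EXAMPLE0002"  # placeholder serials of the rotation disks

# Read `cfgadm -al` output on stdin and print the disk name (e.g. c1t5d0)
# configured on port $1. Exits non-zero if no disk is configured there.
disk_on_port() {
    awk -v ap="$1" '
        index($1, ap "::dsk/") == 1 {
            sub(/^.*::dsk\//, "", $1); print $1; found = 1; exit
        }
        END { if (!found) exit 1 }
    '
}

# Read `iostat -En <disk>` output on stdin and print the serial number.
serial_from_iostat() {
    sed -n 's/.*Serial No: *\([^ ]*\).*/\1/p' | head -n 1
}

# Succeed if serial $1 belongs to one of the known backup disks.
is_expected_serial() {
    for s in $EXPECTED_SERIALS; do
        [ "$s" = "$1" ] && return 0
    done
    return 1
}

# Succeed only if a known backup disk sits in the backup slot.
backup_disk_present() {
    disk=$(cfgadm -al | disk_on_port "$PORT") || return 1
    serial=$(iostat -En "$disk" | serial_from_iostat)
    is_expected_serial "$serial"
}
```

`backup_disk_present` ties both conditions together: the disk has to be in the expected port and carry an expected serial, so other disks in the machine are never mistaken for a backup target.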
Final solution (updated 2015-01-26):
For anyone with similar problems in the future:
- Enable AHCI hotswap in OmniOS as detailed in the accepted answer by gea.
- Use `syseventadm` as detailed in my own answer to trigger the backup script when the disk comes online.
- Make sure your cables, controller and disks are fault-free and play well together (I had problems with WD SE 4TB disks and the onboard AHCI SATA controller, which resulted in random `WARNING: ahci0: ahci_port_reset port 5 the device hardware has been initialized and the power-up diagnostics failed` messages in the system logs).
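As a rough illustration of the `syseventadm` step, here is a sketch of an event handler. It is not the exact registration from the answer mentioned above. The event class `EC_dev_add` with subclass `disk`, the `dev_name` attribute, the device name `c1t5d0` and all paths are assumptions that should be checked against the events your system actually emits.

```shell
#!/bin/sh
# Hypothetical handler, e.g. installed as /usr/local/bin/backup-on-insert.sh.
# Assumed one-time registration as root (the single quotes keep the shell from
# expanding the macros, which syseventd substitutes itself):
#   syseventadm add -c EC_dev_add -s disk /usr/local/bin/backup-on-insert.sh '$subclass' '$dev_name'
#   syseventadm restart

BACKUP_DISK="c1t5d0"                       # assumed device name of the backup slot
BACKUP_SCRIPT="/usr/local/bin/backup.sh"   # placeholder path to the backup script
LOCK_DIR="${LOCK_DIR:-/var/run/backup-on-insert.lock}"

# Succeed only for disk events that refer to the backup disk.
# $1 = event subclass, $2 = device name (e.g. /dev/dsk/c1t5d0s0)
event_matches() {
    [ "$1" = "disk" ] || return 1
    case "$2" in
        */"$BACKUP_DISK"|*/"$BACKUP_DISK"[sp][0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Run the backup once, even if several events arrive for the same insertion.
handle_event() {
    event_matches "$1" "$2" || return 0          # ignore unrelated events
    mkdir "$LOCK_DIR" 2>/dev/null || return 0    # a backup is already running
    "$BACKUP_SCRIPT"
    status=$?
    rmdir "$LOCK_DIR"
    return "$status"
}

# Only act when called with event arguments.
if [ "$#" -gt 0 ]; then
    handle_event "$@"
fi
```

The `mkdir` lock is there because one insertion may produce several events (for example one per slice), and `mkdir` either creates the directory or fails atomically.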