Recently one of our servers hung due to an IPMI BMC failure. It is a CentOS 6.3 OpenStack compute host serving KVM virtual machines with a qcow2 backend.
One of the running VMs was based on the EC2 Ubuntu cloud image (precise-server-cloudimg-amd64-disk1.img).
After the system reboot I found a strange thing: the SSH host keys on the VM had been recreated (13:25 is the reboot time):
root@weather:~# ll /etc/ssh/*key
-rw------- 1 root root 668 Nov 21 13:25 /etc/ssh/ssh_host_dsa_key
-rw------- 1 root root 227 Nov 21 13:25 /etc/ssh/ssh_host_ecdsa_key
-rw------- 1 root root 1679 Nov 21 13:25 /etc/ssh/ssh_host_rsa_key
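In case it is relevant, I tried to check whether an init-time service regenerated the keys at boot (Ubuntu cloud images use cloud-init, so the log paths below assume a stock precise image):

root@weather:~# grep -i ssh /var/log/cloud-init.log
root@weather:~# grep -i 'ssh-keygen\|host key' /var/log/syslog /var/log/boot.log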
I also found that some orphan inodes were deleted during the filesystem recovery process:
Nov 21 13:25:23 weather kernel: [ 0.901159] EXT4-fs (vda1): INFO: recovery required on readonly filesystem
Nov 21 13:25:23 weather kernel: [ 0.902688] EXT4-fs (vda1): write access will be enabled during recovery
Nov 21 13:25:23 weather kernel: [ 1.930773] EXT4-fs (vda1): ext4_orphan_cleanup: deleting unreferenced inode 1286
......
Nov 21 13:25:23 weather kernel: [ 1.940810] EXT4-fs (vda1): ext4_orphan_cleanup: deleting unreferenced inode 53755
Nov 21 13:25:23 weather kernel: [ 1.940815] EXT4-fs (vda1): ext4_orphan_cleanup: deleting unreferenced inode 53754
Nov 21 13:25:23 weather kernel: [ 1.940819] EXT4-fs (vda1): 8 orphan inodes deleted
My questions are: why would the SSH host keys be recreated? Could this be a result of data loss in the filesystem? And how can I prevent it in the future?
The qcow2 cache mode is set to writethrough in the libvirt VM configuration. The host filesystem is ZFS (zfsonlinux) on a hardware RAID controller with a BBU.
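For completeness, this is how those settings can be verified on the host (the domain and dataset names below are placeholders):

virsh dumpxml <domain> | grep cache
# expect something like: <driver name='qemu' type='qcow2' cache='writethrough'/>
zfs get sync <pool>/<dataset>
# with sync=standard, synchronous writes from the guest should reach stable storage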
If this is the result of a filesystem inconsistency on reboot, I am very mystified, since the SSH key files are never modified after creation and all relevant data should have been synced to stable media.