
First of all, we are not even sure this is a udev problem, but we need to start asking somewhere. We have a Hitachi Fibre Channel SAN serving volumes to a couple of machines running Ubuntu Server 12.04 amd64.

For mapping purposes we use the udev-generated /dev/disk/by-id identifiers:

...
/dev/disk/by-id/scsi-1HITACHI_750505270125
/dev/disk/by-id/scsi-1HITACHI_750505270125-part1
/dev/disk/by-id/scsi-1HITACHI_750505270126
/dev/disk/by-id/scsi-1HITACHI_750505270126-part1
...

where the last 4 digits (0125, 0126, 0127...) identify the LUNs created on the Hitachi, so we know which physical volume we're accessing.

We hit a weird problem: we had a 1.1 TB volume on LUN 0125 and split it into smaller volumes on the storage array side. After reassigning one of the new volumes to the server, the old volume size still seems to be cached (see the 1150.5 GB size):

root@server1:~# fdisk -l /dev/disk/by-id/scsi-1HITACHI_750505270125

Disk /dev/disk/by-id/scsi-1HITACHI_750505270125: 1150.5 GB, 1150514364416 bytes
255 heads, 63 sectors/track, 139875 cylinders, total 2247098368 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

                                      Device Boot      Start         End      Blocks   Id  System
/dev/disk/by-id/scsi-1HITACHI_750505270125-part1              63  1048575999   524287968+  83  Linux

The weird part is that we have the same volumes connected to a different machine. They are not active there, but they are still visible. We saw the same behaviour on that machine, but after a reboot the drives report the correct size (see the 536.9 GB size):

root@server2:~# fdisk -l /dev/disk/by-id/scsi-1HITACHI_750505270125

Disk /dev/disk/by-id/scsi-1HITACHI_750505270125: 536.9 GB, 536870912000 bytes
255 heads, 63 sectors/track, 65270 cylinders, total 1048576000 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

                                      Device Boot      Start         End      Blocks   Id  System
/dev/disk/by-id/scsi-1HITACHI_750505270125-part1              63  1048575999   524287968+  83  Linux

The funny part is that we partitioned the drive on the second server (server2), the one that sees the right size, and on the first server (server1) we can see that partition, even though the actual drive size is still the old one. We even formatted it and mounted it on server2, wrote a txt file, unmounted it, remounted it on server1 and, sure enough, we can see and access the txt file.

Looks like somewhere along the way someone is caching volume sizes?

Just in case, after detaching and reattaching the drives we re-scan the LUNs and run udevadm trigger to refresh the udev tree...

We are not really comfortable using the drives with this disparity, and if we need to reboot to get the system to show the real sizes we lose all the advantages of hotplugging. Any ideas on how this is happening, and is it safe to use those volumes without restarting?

As a side question: when we detach the drives from the Fibre Channel array and run udevadm trigger, it looks like udev only adds new drives (devices) but doesn't remove devices that are gone. Is it supposed to work that way?

NublaII

1 Answer

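There are several commands that come into play, as there are multiple layers involved.

To update / rescan

FC
To simply scan the bus, run the following (note that issue_lip triggers a loop initialization, which is a kind of bus reset and may briefly disturb I/O on that host):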

echo "1" > /sys/class/fc_host/hostXYZ/issue_lip
echo "- - -" > /sys/class/scsi_host/hostXYZ/scan
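If you do not know which hostXYZ the array sits behind, the scan can be looped over every SCSI host. This is only a rough sketch: the function name scan_all_hosts and the SYSFS_ROOT override are my own inventions (the override just lets you try the loop against a fake directory tree), and writing to the real /sys needs root.

```shell
#!/bin/sh
# Sketch: ask every SCSI host to scan all channels, targets and LUNs.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
scan_all_hosts() {
    sysfs="${SYSFS_ROOT:-/sys}"
    for scan in "$sysfs"/class/scsi_host/host*/scan; do
        [ -e "$scan" ] || continue   # the glob matched nothing
        echo "- - -" > "$scan"       # wildcard scan, same as the command above
    done
}

# Usage (as root): scan_all_hosts
```

This deliberately leaves out issue_lip, so it only picks up new devices and does not reset the FC link.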

If you know the bus/target/LUN in advance, you can scan just that address:

echo "b t l" > /sys/class/scsi_host/hostXYZ/scan

Replace b, t and l with the bus (channel), target and LUN numbers.

A SCSI-specific command to pick up the new size of a disk that has been shrunk or grown is:

echo 1 > /sys/block/sdX/device/rescan

You need to know the kernel name of the corresponding drive, e.g. sda.
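Since you address the volumes through /dev/disk/by-id, the kernel name can be looked up from the symlink instead of guessed. A minimal sketch, assuming the by-id entry is a symlink to the whole-disk node (not a -partN entry); rescan_by_id and SYSFS_ROOT are hypothetical names of mine, and SYSFS_ROOT only exists so the function can be tried against a fake tree.

```shell
#!/bin/sh
# Sketch: resolve a by-id symlink to its kernel name (sdX) and rescan it.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
rescan_by_id() {
    link="$1"
    sysfs="${SYSFS_ROOT:-/sys}"
    target=$(readlink -f "$link") || return 1   # e.g. /dev/sdc
    dev=${target##*/}                           # strip the directory part
    rescan="$sysfs/block/$dev/device/rescan"
    if [ ! -e "$rescan" ]; then
        echo "no rescan node for $dev" >&2
        return 1
    fi
    echo 1 > "$rescan"
    echo "$dev"                                 # print the name that was rescanned
}

# Usage (as root):
#   rescan_by_id /dev/disk/by-id/scsi-1HITACHI_750505270125
```

Afterwards /sys/block/sdX/size (counted in 512-byte sectors) should show the new size; for the volume in the question that would be 1048576000, the value server2 reports.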


To remove / delete a disk

(Replace sda and 0:0:0 with your own device name and address.)

First unmount everything from that disk.

Remove from the SCSI layer

echo 1 > /sys/class/block/sda/device/delete

Remove from the FC layer

echo 1 > /sys/class/fc_transport/target0\:0\:0/device/0\:0\:0\:0/delete

Now you can safely remove it from the SAN.
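The unmount check and the SCSI-layer delete can be wrapped together so the device is not dropped while something is still mounted. Again only a sketch under my own assumptions: remove_scsi_disk, SYSFS_ROOT and MOUNTS_FILE are hypothetical names, and the mount check is naive (it only looks for /dev/sdX and its partitions in the mount table, so it will not notice LVM or multipath devices stacked on top).

```shell
#!/bin/sh
# Sketch: refuse to delete a disk from the SCSI layer while it is mounted.
# SYSFS_ROOT and MOUNTS_FILE are hypothetical overrides for dry runs.
remove_scsi_disk() {
    dev="$1"
    sysfs="${SYSFS_ROOT:-/sys}"
    mounts="${MOUNTS_FILE:-/proc/mounts}"
    # Matches "/dev/sda " as well as partitions such as "/dev/sda1 ".
    if grep -q "^/dev/$dev[0-9]* " "$mounts"; then
        echo "$dev is still mounted, unmount it first" >&2
        return 1
    fi
    echo 1 > "$sysfs/class/block/$dev/device/delete"
}

# Usage (as root): remove_scsi_disk sda
```

The FC-layer delete shown above still has to be done separately with the right target address.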

cstamas