I know that partitions on my 4k drives have to be aligned to a multiple of 8 sectors, but what about md-RAID / LVM / dm-crypt? How do I tell these layers that my drive is 4k? If they don't respect the 4k sector size, the partition alignment is useless. How do I align the LVM/md/crypto layers? Thanks.
I was just thinking, "Hmm, 4 kilobytes seems awfully small for a disk drive". Perhaps it's the new 640k. – Tom O'Connor Oct 10 '10 at 09:58
@Tom. Many filesystems use a 4 KB block size, so larger sectors would result in a lot of performance-sapping read-modify-writes. Secondly, the drive towards bigger sectors was primarily to increase the ECC efficiency, and there are diminishing returns for making it even bigger. – janneb Oct 10 '10 at 10:26
@janneb You are Buzz Killington AICMFP – Tom O'Connor Oct 10 '10 at 14:16
4 Answers
Be careful! GPT labels, required for disks > 2 TiB, occupy the first 34 (512-byte) sectors (LBA 0 to 33) with the default 128-entry partition table. So if you create your first partition immediately after the label, it won't be on a 4KiB boundary.
GNU parted does not align partitions to 4KiB boundaries by default, probably because many "Advanced Format" drives falsely claim that their physical sectors, not just their logical sectors, are still only 512B.
So if you're using GNU parted, ensure that each partition starts on an LBA divisible by 8 (LBAs remain 512B, so 8*512B = 4KiB). LBAs are numbered from 0, so start the first partition at "40s", the first multiple of 8 past the label.
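The divisible-by-8 rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function names and the sample LBAs are made up for the example and assume 512B logical sectors on 4KiB physical sectors.

```python
# Minimal sketch: 512-byte logical sectors on a drive with 4 KiB physical sectors.
LOGICAL_SECTOR = 512
PHYSICAL_SECTOR = 4096
SECTORS_PER_PHYSICAL = PHYSICAL_SECTOR // LOGICAL_SECTOR  # 8


def is_aligned(start_lba: int) -> bool:
    """True if a partition starting at this LBA sits on a 4 KiB boundary."""
    return start_lba % SECTORS_PER_PHYSICAL == 0


def next_aligned_lba(first_free_lba: int) -> int:
    """Round an LBA up to the next multiple of 8 (ceiling division)."""
    return -(-first_free_lba // SECTORS_PER_PHYSICAL) * SECTORS_PER_PHYSICAL


# Sample start sectors chosen only for illustration.
for lba in (39, 40, 63, 2048):
    print(lba, is_aligned(lba), next_aligned_lba(lba))
```

Feeding the result to parted with an "s" suffix (for example "40s") keeps the unit in sectors, so nothing gets rounded behind your back.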
Also, if you use GRUB, leave room for its second stage bootstrap. With MS-DOS labels the first partition conventionally starts at sector 63, which leaves enough unused room for GRUB to stash its second stage bootstrap, but there's no unused space in a GPT label. So make a small partition 1, set its "bios_grub" flag, and then create your "real" partitions after that, making sure that each and every one begins on an LBA that's a multiple of 8.
See https://ata.wiki.kernel.org/index.php/ATA_4_KiB_sector_issues
The short version is that if you have a recent distro, it should automatically do the right thing. For older distros, it's a bit more complicated.
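If you'd rather verify than trust the distro, here is a rough Python sketch that reads a partition's start sector and the disk's reported physical block size from Linux sysfs. The function name and the `sysfs` parameter are made up for this example. Keep in mind that a drive may report 512 even when it is really 4k, so the fixed 4096 check is the more useful one.

```python
from pathlib import Path


def partition_alignment(disk: str, part: str, sysfs: Path = Path("/sys/block")) -> dict:
    """Report whether a partition starts on a 4 KiB boundary.

    `disk` and `part` are kernel names such as "sda" and "sda1".
    The sysfs "start" file is always expressed in 512-byte units.
    """
    start_lba = int((sysfs / disk / part / "start").read_text())
    reported = int((sysfs / disk / "queue" / "physical_block_size").read_text())
    offset_bytes = start_lba * 512
    return {
        "start_lba": start_lba,
        "reported_physical_block_size": reported,
        # Check against 4096 regardless of what the drive claims.
        "aligned_4k": offset_bytes % 4096 == 0,
        "aligned_to_reported": offset_bytes % reported == 0,
    }
```

Calling `partition_alignment("sda", "sda1")` on a real system returns a small dict; `aligned_4k` is the value to look at.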
For LVM you should investigate the --dataalignment option to pvcreate, or for even older distros --metadatasize.
MD, AFAIK, puts its own metadata at the end of the partitions, so it should always be aligned to the underlying partition.
For mkfs, again the filesystem should be aligned with the underlying partition. For some filesystems you can add options for stripe width and stripe size in case you're running on a RAID device, so that the filesystem can try to align stuff on RAID stripe boundaries.
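Since each layer (partition, md, LVM, dm-crypt) can add its own data offset, what matters is the cumulative offset from the start of the disk. A minimal Python sketch of that check follows; the function name and the offsets in the usage lines are hypothetical values for illustration, not the defaults of any tool.

```python
def first_misaligned_layer(layers, boundary=4096):
    """Walk a stack of layers from the disk upward.

    `layers` is a list of (name, data_offset_bytes) pairs, where each
    offset is relative to the start of the layer below it.
    Returns (name, absolute_offset) for the first layer whose data does
    not start on `boundary`, or None if every layer is aligned.
    """
    absolute = 0
    for name, offset in layers:
        absolute += offset
        if absolute % boundary != 0:
            return name, absolute
    return None


# Hypothetical stack: partition at LBA 2048, then an LVM data area 1 MiB in.
print(first_misaligned_layer([("partition", 2048 * 512), ("lvm", 1024 * 1024)]))
# Hypothetical stack: aligned partition, but an upper layer adds 1.5 KiB.
print(first_misaligned_layer([("partition", 2048 * 512), ("md", 1536)]))
```

The real offsets have to come from the tools themselves, for example the start sector of the partition and the `pe_start` field that `pvs -o +pe_start` reports for a physical volume.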
The problem is mostly about aligning the beginning of the partition with the structure of the underlying disk. To keep backwards compatibility, disks 'lie' to the BIOS/OS that they have 512B sectors, while in fact they have 4096B sectors in the case of modern hard drives, or 32-64kB units in the case of most common striping RAIDs/SSDs.
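To see why the start of the partition matters, here is a small Python sketch that counts how many 4096B physical sectors a single 4KiB filesystem block touches. The function name and parameters are made up for the example; it assumes the filesystem's blocks start at the beginning of the partition.

```python
def physical_sectors_touched(partition_start_lba: int,
                             fs_block_index: int,
                             fs_block_size: int = 4096,
                             logical: int = 512,
                             physical: int = 4096) -> int:
    """Count the physical sectors covered by one filesystem block."""
    first_byte = partition_start_lba * logical + fs_block_index * fs_block_size
    last_byte = first_byte + fs_block_size - 1
    return last_byte // physical - first_byte // physical + 1


# Old-style start at sector 63: every 4 KiB block straddles two physical sectors.
print(physical_sectors_touched(63, 0))
# Start at sector 64: every 4 KiB block maps onto exactly one physical sector.
print(physical_sectors_touched(64, 0))
```

When a block straddles two physical sectors, a write can force the drive to read both, modify them, and write both back, which is where the performance goes.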
Misaligned partitions will hurt your performance. I have done some benchmarks, only on regular partitions directly on top of the disk, without LVM. My results, measured with bonnie++, were as follows. Without proper alignment:
Sequential Output Block: 29MB/s
Sequential Output Rewrite: 20MB/s
With alignment:
Sequential Output Block: 70MB/s
Sequential Output Rewrite: 37MB/s
Check this for LVM alignment.
Most newer distributions are updated to know about the 4K thing by now. I just built a md-RAID/LVM/XFS setup on a bunch of 2TB drives with no problems. Didn't do anything special.
I don't agree... sure, everything works, but the partitions are misaligned, which hurts performance. – pQd Oct 10 '10 at 06:36
Well, this raises the question: I have several different RAIDs using drives from 500GB to 2TB, with LVM on top of them, formatted with XFS. So how exactly would everything work if the drives are not all 4k? I would love to run benchmarks, but my terra server is slower than the drives, so it would be pointless. – Porch Oct 10 '10 at 18:38