I have a PCIe SSD card that uses 8 KB block cells. It supports "Virtual Controllers" that can split the drive in half, and I then create an LVM RAID0 across the two halves with an 8 KB stripe size. On top of that I install a file system that uses an 8 KB block size. My application writes data in 8 KB blocks.

Does each added layer introduce space overhead that shifts this 8 KB block 'alignment', so that the raw data ultimately written to the SSD is (significantly?) larger?

If my application writes 8 KB of data, does the FS then write 8 KB plus its metadata, which then becomes 8 KB + FS metadata + LVM metadata, so that it ultimately comes out to 8.5 KB and screws up all that alignment?

Mxx
  • Be aware that the FusionIO ioDrive2 supports "Virtual Controllers": if you split a drive in half, you get 2x the IOPS. – Mxx Feb 15 '13 at 21:41

1 Answer

No, blocks don't get made bigger to fit metadata in. Metadata is stored either in dedicated blocks (in the case of filesystems) or in a special area (in the case of LVM and mdraid). You just need to make sure that the start of each data area is correctly aligned.
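To make the alignment point concrete, here is a rough sketch of the arithmetic, not anything taken from a real tool. The function names and the two example offsets (a 1 MiB data start, and a start at sector 63 with 512-byte sectors as seen in old partition layouts) are illustrative assumptions.

```python
CELL = 8192  # 8 KB cell size from the question


def is_aligned(data_start: int, cell: int = CELL) -> bool:
    """True if the data area starts on a cell boundary."""
    return data_start % cell == 0


def cells_touched(offset: int, length: int, cell: int = CELL) -> int:
    """Number of cell-sized units covered by a write of `length` bytes at `offset`."""
    first = offset // cell
    last = (offset + length - 1) // cell
    return last - first + 1


# Hypothetical data-area start offsets, in bytes.
aligned_start = 1024 * 1024   # 1 MiB, a multiple of 8 KB
misaligned_start = 63 * 512   # 32256 bytes, not a multiple of 8 KB

for start in (aligned_start, misaligned_start):
    # The first 8 KB block of the upper layer lands at `start` on the device.
    print(start, is_aligned(start), cells_touched(start, CELL))
```

With the aligned start, an 8 KB write stays inside one cell. With the misaligned start, the same 8 KB write straddles two cells, even though no metadata was added to the block itself.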

mdraid with the 0.90 or 1.0 metadata format places its metadata at the end of the partition and stores data right at the beginning, so it's always aligned (the 1.1 and 1.2 formats keep their metadata near the start and begin the data at an offset). LVM stores metadata at the beginning of PVs; where the data area starts is controlled by pvcreate --dataalignment, which should be set appropriately. The filesystem should have an appropriate block size and/or stride and stripe-width set.
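As a sketch of how the stride and stripe-width values are usually derived, assuming the ext-style meaning (stride is the RAID chunk size expressed in filesystem blocks, and stripe-width is stride times the number of data-bearing disks): the helper name and the example numbers below are made up for illustration, so check your filesystem's documentation before relying on them.

```python
def ext_raid_hints(chunk_bytes: int, fs_block_bytes: int, data_disks: int) -> tuple[int, int]:
    """Return (stride, stripe_width), both counted in filesystem blocks.

    Assumes ext-style semantics: stride = chunk size / filesystem block size,
    stripe_width = stride * number of data-bearing disks.
    """
    if chunk_bytes % fs_block_bytes != 0:
        raise ValueError("chunk size must be a multiple of the filesystem block size")
    stride = chunk_bytes // fs_block_bytes
    return stride, stride * data_disks


# Hypothetical layout: 8 KB chunks striped over 2 devices, 4 KB filesystem blocks.
stride, stripe_width = ext_raid_hints(8192, 4096, 2)
print(f"stride={stride} stripe-width={stripe_width}")
```

The same full-stripe size (chunk size times data disks) is a reasonable multiple to check the pvcreate --dataalignment value against.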

mgorven
  • So the data is written in its "raw" block size all the way through to the physical layer? – Mxx Feb 15 '13 at 21:44
  • Not necessarily. The point is that the block size refers to how the layer uses the lower layer, e.g. if the filesystem has an 8K block size it uses the underlying device using 8K blocks. – mgorven Feb 15 '13 at 21:46
  • In other words, 8K of application data might end up being more than 8K of "raw" data, but it will still be written to disk in 8K blocks; there will just be more than one 8K block written? – Mxx Feb 15 '13 at 21:52
  • No, 8K of application data might end up as 2 x 4K, or 16 x 512 B. – mgorven Feb 15 '13 at 23:55
  • While doing more research on this topic, I found this great post that explains the issue in good detail: http://www.mysqlperformanceblog.com/2011/06/09/aligning-io-on-a-hard-disk-raid-the-theory/ – Mxx Mar 04 '13 at 05:25