I'm new to LVM and RAID management.
I've got a Linux machine (Ubuntu 18.04) with two 4 TB HDDs joined into an LVM volume group. The PC also has two 1.8 TB SSDs. The OS is installed on a separate NVMe drive.
Recently our bosses purchased ten more 4 TB SSDs. I've installed eight of them in the PC; the other two are kept as spares (there were not enough free slots for all ten drives).
Due to warranty restrictions from the PC vendor, I cannot replace the HDDs or the 1.8 TB SSDs with 4 TB SSDs; I can only add new SSDs.
I'd like to use all these solid-state drives as one large data store with error-recovery capability.
The purpose of this storage is to hold data sets for machine learning tasks (many relatively small files accessed randomly).
From what I've read, I conclude that I should join these drives into one or several RAID5 arrays and then create an LVM volume group on top of them.
However, I cannot figure out the optimal grouping.
Things are complicated by the fact that some of these drives are already in use and their data must not be lost, and I don't have additional drives for backup.
My current configuration is as follows: five 4 TB SSDs are free and unformatted; one 4 TB SSD is almost 100% full; two 1.8 TB SSDs and two 4 TB SSDs are joined into one LVM volume group.
Here is the output of lsblk:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 89.1M 1 loop /snap/core/7917
loop1 7:1 0 89.1M 1 loop /snap/core/8039
sda 8:0 0 3.7T 0 disk
└─vg0-lv--0 253:0 0 10.8T 0 lvm /ssdata
sdb 8:16 0 3.7T 0 disk
└─vg0-lv--0 253:0 0 10.8T 0 lvm /ssdata
sdc 8:32 0 3.7T 0 disk
sdd 8:48 0 3.7T 0 disk
sde 8:64 0 1.8T 0 disk
└─vg0-lv--0 253:0 0 10.8T 0 lvm /ssdata
sdf 8:80 0 1.8T 0 disk
└─vg0-lv--0 253:0 0 10.8T 0 lvm /ssdata
sdg 8:96 0 3.7T 0 disk
└─vg1-lv--0 253:1 0 7.3T 0 lvm /home
sdh 8:112 0 3.7T 0 disk
└─vg1-lv--0 253:1 0 7.3T 0 lvm /home
sdi 8:128 0 3.7T 0 disk /mnt/SAITds
sdj 8:144 0 3.7T 0 disk
sdk 8:160 0 3.7T 0 disk
sdl 8:176 0 3.7T 0 disk
nvme0n1 259:0 0 477G 0 disk
├─nvme0n1p1 259:1 0 1M 0 part
├─nvme0n1p2 259:2 0 200G 0 part /
├─nvme0n1p3 259:3 0 150G 0 part
└─nvme0n1p4 259:4 0 127G 0 part [SWAP]
This output reflects my previous experiments: I've played with LVM grouping and extended the SSD volume group with two more 4 TB SSDs.
My questions are:
What is the best way to group these drives into RAID arrays and LVM volume groups?
Am I right that a RAID5 array is not possible on drives of different sizes?
What procedure would let me create and grow the RAID/LVM groups "online" so that the data is preserved (moving data to another directory is allowed)?
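To make the second question concrete, here is a minimal sketch of the capacity arithmetic as I currently understand it. It assumes that a RAID5 array treats every member as if it were the size of its smallest drive and spends one member's worth of space on parity. The helper name `raid5_usable_gb` is hypothetical, and the sizes are the nominal vendor figures (4000 GB and 1800 GB), not what lsblk reports.

```python
def raid5_usable_gb(member_sizes_gb):
    """Usable capacity of one RAID5 array (hypothetical helper).

    Assumption: every member contributes only as much space as the
    smallest drive, and one member's worth of space holds parity.
    """
    if len(member_sizes_gb) < 3:
        raise ValueError("RAID5 needs at least three members")
    return (len(member_sizes_gb) - 1) * min(member_sizes_gb)


# Nominal sizes from the post: eight 4 TB SSDs and two 1.8 TB SSDs.
big_ssds = [4000] * 8
small_ssds = [1800] * 2

# One array over all ten SSDs: each member is cut down to 1800 GB.
mixed_array = raid5_usable_gb(big_ssds + small_ssds)

# One array over the eight 4 TB SSDs only.
uniform_array = raid5_usable_gb(big_ssds)

print("all ten SSDs in one RAID5:", mixed_array, "GB")
print("eight 4 TB SSDs in one RAID5:", uniform_array, "GB")
```

If that assumption holds, mixing sizes in one array would be possible but wasteful, which is why I suspect the 1.8 TB drives need a separate group.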
Update: The motherboard of this PC is an ASUS Z10PE-D8 WS. According to the specifications, it supports RAID 0, 1, 5, and 10. However, I cannot figure out what this "support" means.
So I've got two options: Linux software RAID with the md utilities, or hardware RAID provided by the motherboard.