I'm trying to figure out how LVM snapshots work so I can implement them on my fileserver, but I'm having difficulty finding anything on Google that explains how they work, instead of how to use them as the basis of a backup system.

From what I've read I think it works something like this:

  • You have an LVM with a primary partition and lots and lots of unallocated freespace not in the partition
  • Then you take a snapshot and mount it on a new Logical Volume. Snapshots are supposed to contain only changes, so this first snapshot would be a whole copy, correct?
  • Then, the next day you take another snapshot (this one's partition size doesn't have to be so big) and mount it.
  • Somehow the LVM keeps track of the snapshots, and doesn't store unchanged bits on the primary volume.
  • Then you decide that you have enough snapshots and get rid of the first one. I have no idea how this works or how that would affect the next snapshot.

Can someone correct me where I'm wrong? At best I'm guessing; I can't find anything on Google.


vgdisplay

obu1:/home/jail/home/qps/backup/D# vgdisplay
  --- Volume group ---
  VG Name               fileserverLVM
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               931.51 GB
  PE Size               4.00 MB
  Total PE              238467
  Alloc PE / Size       238336 / 931.00 GB
  Free  PE / Size       131 / 524.00 MB
  VG UUID               qSGaG1-SQYO-D2bm-ohDf-d4eG-oGCY-4jOegU
Gautam Somani
Malfist

5 Answers


Why not have a look at the snapshots section of the LVM-HOWTO?

LVM snapshots are your basic "copy on write" snapshot solution. The snapshot is really nothing more than asking the LVM to give you a "pointer" to the current state of the filesystem and to write changes made after the snapshot to a designated area.

LVM snapshots "live" inside the volume group hosting the volume subject to the snapshot -- not in another volume. Your statement "...lots and lots of unallocated freespace not in the partition" makes it sound like your thinking is that the snapshots "live" outside the volume group subject to snapshot, and that's not accurate. Your volume group lives in a hard disk partition, and the volume being subject to snapshot and any snapshots you've taken live in that volume group.

The normal way that LVM snapshots are used is not for long-term storage, but rather to get a consistent "picture" of the filesystem such that a backup can be taken. Once the backup is done, the snapshot is discarded.

When you create an LVM snapshot you designate an amount of space to hold any changes made while the snapshot is active. If more changes are made than you've designated space for, the snapshot becomes unusable and must be discarded. You don't want to leave snapshots lying around because (a) they'll fill up and become unusable, and (b) the system's performance is impacted while a snapshot is active -- things get slower.
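A sketch of that snapshot-for-backup cycle in commands (the volume group vg0, volume data, snapshot size, and paths below are examples, not taken from the question):

```shell
# Example only -- all names and sizes here are made up.
# Reserve 2 GB to hold blocks that change while the snapshot exists.
lvcreate --size 2G --snapshot --name data-snap /dev/vg0/data

# Mount the frozen view read-only and back it up.
mount -o ro /dev/vg0/data-snap /mnt/snap
tar -czf /backup/data.tar.gz -C /mnt/snap .
umount /mnt/snap

# Discard the snapshot as soon as the backup is done.
lvremove -f /dev/vg0/data-snap
```

While a snapshot exists, `lvs` shows how full its change area is (the Snap%/Data% column); if it reaches 100% the snapshot is invalidated.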

Edit:

What Microsoft Volume Shadow Copy Services and LVM snapshots do aren't too tremendously different. Microsoft's solution is a bit more comprehensive (as is typically the case with Microsoft-- for better or for worse their tools and products often seek to solve pretty large problems versus focusing on one thing).

VSS is a more comprehensive solution that unifies support for hardware devices that support snapshots and software-based snapshots into a single API. Further, VSS has APIs to allow applications to be made quiescent through the snapshot APIs, whereas LVM snapshots are just concerned with snapshots-- any quiescing applications is your problem (putting databases into "backup" states, etc).

Evan Anderson
    So it's not truly modeled after Volume Shadow Copy (VSS), because that's not how VSS works? – Malfist Jul 15 '09 at 13:35
  • This makes a lot more sense. – Malfist Jul 15 '09 at 13:42
  • VSS doesn't require more than one partition. It does everything on the same partition, and you can delete old snapshots with impunity – Malfist Jul 15 '09 at 15:01
  • I think you're misunderstanding LVM snapshots somewhat. LVM snapshots create "virtual" devices that are mounted like standalone volumes, but they're not actually "partitions". LVM snapshots "live in" the volume being subject to snapshot, just like VSS snapshots. – Evan Anderson Jul 15 '09 at 15:07
  • That's odd, when I tried to create a snapshot I was told there wasn't enough unallocated space on the LVM, I've only used like ~30MB of the partition (488GB total). I tried to create a 10GB snapshot thingy – Malfist Jul 15 '09 at 16:08
  • I can't tell you why you saw that w/o seeing it. The command to create the snapshot you're describing would've been something like: lvcreate -L10240M -s -n snapshot1 /dev/volume-group-name/volume-name – Evan Anderson Jul 15 '09 at 16:28
  • that's exactly what I used – Malfist Jul 15 '09 at 17:47
  • obu1:/home/jail/home/qps/backup/D# lvcreate -L1024M -s -n snap7-15 /dev/fileserverLVM/home Insufficient free extents (131) in volume group fileserverLVM: 256 required – Malfist Jul 15 '09 at 17:49
  • Can you post the output of a "vgdisplay" somewhere? It sounds like you don't have enough free physical extents in the volume group. I'm guessing that your "home" volume is very close to the size of the fileserverLVM volume group. – Evan Anderson Jul 15 '09 at 17:53
  • Re-reading our whole discussion here, it occurs to me that we've both been a bit imprecise in our communication. You used the term "LVM" to mean "volume group", and "partition" to mean volume. In my answer (now corrected), I used the word "volume" alone when I should have said "volume group" (being precise). That's probably the source of some of the confusion here. – Evan Anderson Jul 15 '09 at 18:01
  • I appended the vgdisplay to the question – Malfist Jul 15 '09 at 18:09
  • So snapshots do live in a separate volume and not in the free space of the current volume? – Malfist Jul 15 '09 at 18:11
  • Snapshots live in a separate volume inside the same volume group. I was taking your use of the word "partition" to mean "volume group", not "volume". – Evan Anderson Jul 15 '09 at 19:00
  • FWIW, Linux LVM is actually a distant relative of HP-UX LVM. While they dont' exactly share the same code, the Linux version was "modelled" heavily on how the HP-UX version worked. Just a bit of background history to give a reference to work with http://en.wikipedia.org/wiki/Logical_Volume_Manager_(Linux) – Avery Payne Jul 15 '09 at 19:30
  • Could you please clarify where the updated data goes while a snapshot is active? To the main LV, with the snapshot storing a copy of the old block? Or to the snapshot LV, while the main LV remains untouched? – Benoît May 15 '12 at 14:59
  • @benoit the link in the answer's first line covers this. Read the note there on LVM1 read-only snapshot behaviour and I think you'll have your answer. (It's the first approach you describe, not the second.) – Peter Hansen Jan 03 '14 at 15:54

LVM snapshots are an example of a copy-on-write snapshot solution, as Evan said. How it works is a bit different from what Evan implied, but not by a whole lot.

When you have an LVM volume with no snapshots, writes to the volume happen as you'd expect. A block is changed, and that's it.

As soon as you create a snapshot, LVM creates a pool of blocks. This pool also contains a full copy of the LVM metadata of the volume. When writes happen to the main volume such as updating an inode, the block being overwritten is copied to this new pool and the new block is written to the main volume. This is the 'copy-on-write'. Because of this, the more data that gets changed between when a snapshot was taken and the current state of the main volume, the more space will get consumed by that snapshot pool.

When you mount the snapshot, the metadata written when the snapshot was taken allows the mapping of snapshot-pool blocks over changed blocks in the volume (or a higher-level snapshot). This way, when an access comes in for a specific block, LVM knows which block to access. As far as the filesystem on that volume is concerned, there are no snapshots.

James pointed out one of the faults of this system. When you have multiple snapshots of the same volume, every time you write to a block in the main volume you potentially trigger writes in every single snapshot. This is because each snapshot maintains its own pool of changed blocks. Also, for long snapshot trees, accessing a snapshot can cause quite a bit of computation on the server to figure out which exact block needs to be served for an access.

When you dispose of a snapshot, LVM just drops the snapshot pool and updates the snapshot tree as needed. If the dropped snapshot is part of a snapshot tree, some blocks will be copied to the lower-level snapshot. If it is the lowest snapshot (or the only one), the pool just gets dropped and the operation is very fast.
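The copy-on-write mechanics above can be modeled with a toy script. This is only an illustration -- the 1-byte "blocks", the file-backed "pool", and the cow_write/snap_read helpers are all invented here; real LVM does the same bookkeeping at the block-device layer:

```shell
#!/bin/sh
# Toy copy-on-write model: a 4-"block" volume plus a snapshot pool.
rm -rf /tmp/cow-demo
mkdir -p /tmp/cow-demo/pool
VOL=/tmp/cow-demo/volume
printf 'ABCD' > "$VOL"        # the "volume": four 1-byte blocks

# cow_write BLOCK BYTE: save the old block to the pool first (only
# once per block), then overwrite it on the main volume.
cow_write() {
    if [ ! -f "/tmp/cow-demo/pool/$1" ]; then
        dd if="$VOL" bs=1 skip="$1" count=1 2>/dev/null > "/tmp/cow-demo/pool/$1"
    fi
    printf '%s' "$2" | dd of="$VOL" bs=1 seek="$1" count=1 conv=notrunc 2>/dev/null
}

# snap_read BLOCK: the snapshot serves the saved copy if the block
# changed, otherwise it reads straight from the (unchanged) volume.
snap_read() {
    if [ -f "/tmp/cow-demo/pool/$1" ]; then
        cat "/tmp/cow-demo/pool/$1"
    else
        dd if="$VOL" bs=1 skip="$1" count=1 2>/dev/null
    fi
}

cow_write 1 X                 # change one block on the main volume
echo "volume now:          $(cat "$VOL")"                                              # AXCD
echo "snapshot still sees: $(snap_read 0)$(snap_read 1)$(snap_read 2)$(snap_read 3)"   # ABCD
```

Note how the pool only holds block 1, the single block that changed -- which is exactly why a mostly-idle volume needs only a small snapshot area.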


Some file-systems do offer in-filesystem snapshots, ZFS and BTRFS are but two of the better known ones. They work similarly, though the filesystem itself manages the changed/unchanged mapping. This is arguably a better way of doing it since you can fsck an entire snapshot family for consistency, which is something you can't do with straight up LVM.

sysadmin1138
  • Thanks for this detailed explanation. Sorry, I'm confused about **"As far as the filesystem on that volume is concerned, there are no snapshots."** Could you explain more about what that means? I'd appreciate any response ~ – Carr Oct 20 '16 at 06:59
  • @Carr It means that snapshots are handled outside of the filesystem entirely. Other filesystems that have snapshot capability built in, like BTRFS and XFS, do have a concept of snapshots and you shouldn't use LVM snapshots with those systems. – sysadmin1138 Oct 20 '16 at 17:02
  • @sysadmin1138 I'm curious about the builtin snapshots with XFS you mentioned, for the purpose of consistency check/repair of FS. I have a multi-TB XFS FS that went down in a dirty way and I want to check/fix it, without putting it offline (hundreds of users, can't go offline for hours). I'm thinking creating a XFS snapshot and then running fsck on it to find/fix errors while the live filesystem is kept online, and then if fix is made, swap with the live filesystem. Would an XFS snapshot be better for this purpose than an LVM snapshot? – Ján Lalinský Sep 24 '19 at 11:46

LVM snapshots are inefficient: the more snapshots there are, the slower the system will go.

I only support XFS, as it's what we use; xfs_freeze can be used to halt new access to the filesystem and create a stable image on disk.
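The freeze/snapshot/unfreeze pattern looks roughly like this (device and mount-point names are made up, and this assumes the XFS filesystem lives on an LVM volume):

```shell
# Example only: /data is an XFS filesystem on /dev/vg0/data.
xfs_freeze -f /data      # halt new writes; flush to a stable on-disk image
lvcreate --size 1G --snapshot --name data-snap /dev/vg0/data
xfs_freeze -u /data      # resume writes immediately
# back up from /dev/vg0/data-snap, then lvremove it
```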

Copy-on-write is used so that disk space is used efficiently.

You have to create the filesystem in a logical volume, leaving spare space in the volume group for the snapshots.

This is an example from the FAQ

James

You don't specify whether you are using Linux or HP-UX. In HP-UX, you create a logical volume and mount it as a snapshot of another logical volume. In Linux, you create a logical volume as a snapshot volume.

Removing a snapshot in HP-UX is done by unmounting the volume; in Linux it is done by using lvremove to remove the logical volume.

In any case, the changes are the only thing stored in your snapshot. The longer the snapshot remains available, the more changes it accumulates - and there is the chance it could fill up if not properly sized or released.

The speed of disk access on a snapshot volume is slower than it would be to a normal volume; you must take that into account.

Mei

@Evan Anderson's and @sysadmin1138's answers, while very instructive and spot-on for their time (2009), are now somewhat outdated due to the existence of two distinct LVM snapshot methods:

  • the first one (let's call it classic LVM) is the one described in the answers above. It sets apart a specific disk portion to which to-be-overwritten data are copied, meaning that multiple snapshots destroy performance (i.e.: if a single snapshot slows down the system by 3-5x, two snapshots slow it by 6-10x, three snapshots by 12-15x, and so on). This, in turn, makes them incapable of supporting a rolling-snapshot policy. Moreover, their metadata storage (plain text) was not optimized for speed. In fact, their main use was for backups: a single snapshot is taken and, after the backup, deleted;

  • the new one (called Thin LVM or lvmthin) is an entirely different beast. It depends heavily on binary, optimized metadata (a b-tree) to track space chunks quickly and efficiently. Taking a snapshot does not consume any disk space (i.e.: a snapshot size does not need to be declared and no space is set apart), except for some additional metadata space. Overwriting an already-allocated chunk can again result in a read-modify-write, but this can be entirely avoided for large writes (where "large" means larger than the thin pool's data chunks). More importantly, multiple snapshots do not copy any more data than a single snapshot, because only metadata are altered to point the various snapshots at the same data chunks. On the darker side, note that thin snapshots can "fill" the entire pool, causing all writes to stall.
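A minimal lvmthin sketch of the above (the names vg0, pool0, and data are made up):

```shell
# Create a thin pool and a thin volume inside it.
lvcreate --size 100G --thinpool pool0 vg0
lvcreate --virtualsize 80G --thin --name data vg0/pool0

# Thin snapshots need no --size: they only share the origin's chunks.
lvcreate --snapshot --name data-mon vg0/data
lvcreate --snapshot --name data-tue vg0/data

# Thin snapshots are created with activation skipped; -K activates one.
lvchange -ay -K vg0/data-mon

lvs vg0    # the Data% column shows how full the shared pool is
```

Since all volumes draw from the same pool, the pool's Data% is the number to watch: a full pool stalls every write.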

Which LVM volume type should you use? For the root filesystem, I generally use classic LVM volumes: they are rock solid and easier to recover. Moreover, a root partition often does not contain much valuable data by itself (and so a normal backup procedure suffices). On the other hand, for data volumes I typically want rolling snapshots extending some days/weeks into the past, so I use Thin LVM (or a ZFS pool, but this is another story...). For some additional context, you can read here

shodanshok
  • You make it sound like this "2nd" snapshot technology is called "lvmthin" or "thin provisioning". It's more correct to say that snapshots of a thin-provisioned LV are different and more efficient, as these snapshots share the same blocks as the thin-provisioned LV, which is not the case for snapshots of normally-provisioned LVs. – MrCalvin Jul 18 '21 at 20:50