29

I read in one of the VMware KB articles that snapshots will directly affect VM performance.

But my team keeps asking me how snapshots can affect performance.

I would like to give them a solid reason behind the statement that snapshots are performance killers.

Can anyone explain a little of the theory behind how snapshots actually affect performance? Is it just because the disk I/O rate of the hard disk would be slower?

flooose
Samselvaprabu
  • 2
    Not sure if [this is the KB article](http://kb.vmware.com/kb/1025279) you read or not. I thought I would add it as reference. – Aaron Copley Sep 20 '12 at 14:56

4 Answers

30

When you create a snapshot, the original disk image is "frozen" in a consistent state, and all writes from then on go to a new differential image. Even worse, as explained here and here, the differential image takes the form of a change log that records every change made to a file since the snapshot was taken. This means that a read may have to consult not just one file but also all of the difference data (the original data plus every change made to it). The number of files involved grows further when you cascade snapshots.

Ansgar Wiechers
  • 2
    Best explanation. You are not only doubling IOPS, but there is CPU overhead in calculating the block-level difference. – Aaron Copley Sep 20 '12 at 14:55
  • 3
    After reading this article linked by Aaron Copley (http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1025279) it seems worse than that. A snapshot is not a differential image, it's a change log, so if you write the same data to the same place 10 times, the snapshot will grow by 10 times the size of the data you wrote. A differential image should be more efficient, because it would overwrite data rewritten at the same location. – Max Sep 25 '12 at 07:18
  • @Ansgar Wiechers: Did you read Max's comment? He mentioned that because it maintains just a change log, it is worse than having a differential image. If that concept is more accurate, please edit your answer. I felt your explanation was good and I would like to mark it as the answer. – Samselvaprabu Sep 26 '12 at 05:35
  • @Samselvaprabu Updated. – Ansgar Wiechers Sep 26 '12 at 10:45
  • Is only VMWare affected by this problem? What about others, like Hyper-V? – Andrew Savinykh Nov 07 '13 at 10:43
  • 1
    @zespri The problem affects all virtualization platforms using this kind of snapshot technology, including Hyper-V. – Ansgar Wiechers Nov 08 '13 at 13:37
  • @AnsgarWiechers Just to clarify, though... VirtualBox uses a differencing image, right? I'm pretty sure it's not a changelog. Of course, if you're really worried, you always have the option of making a full clone in VirtualBox. – Parthian Shot Mar 14 '16 at 14:10
  • 1
    This is plain wrong. See @Falcon Momot's comment for the correct answer. Even the linked article just states the disk "can run out of space", which is clearly the case when the space left before taking a snapshot is smaller than the snapshotted disk and the delta disk does not have enough space to expand. – Daniel Sep 19 '17 at 20:03
  • @Daniel Quoting from that very article: *"The snapshot file is only a change log of the original virtual disk"*. Quoting from KB article 1015180 (added to the answer): *"From the original parent disk, each child constitutes a redo log pointing back from the present state of the virtual disk, one step at a time, to the original"*. If you have other information contradicting VMware's knowledge base: please provide evidence. – Ansgar Wiechers Sep 20 '17 at 09:22
  • 2
    @AnsgarWiechers This answer is definitely wrong. The linked article (https://kb.vmware.com/s/article/1015180) states: "The child disk, which is created with a snapshot, is a sparse disk. Sparse disks employ the copy-on-write (COW) mechanism, in which the virtual disk contains no data in places, until copied there by a write." Followed by: "If a virtual machine is running off of a snapshot, it is making changes to a child or sparse disk. The more write operations made to this disk, the larger it grows, to an upper limit of the size of the base disk plus a small amount of overhead." – Steve365 Jan 03 '18 at 14:51
6

When you create a snapshot of a VM, a delta disk is created and the operating system writes to this file instead of the original VMDK. This file is called VM_Name-Delta.VMDK, but if the system needs to refer to data from before the snapshot, it refers back to VM_Name.VMDK, which increases the I/O of that operation. If you take multiple snapshots, the VM runs from the delta file of the most recent snapshot, not the original VMDK, which increases I/O further.

Example.

OS ---> Snapshot (File A Created) ---> (Snapshot File B Created)

If I need to refer to File A, the system will look through 3 VMDKs to find it.
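The lookup through a chain of disks can be illustrated with a toy model. This is only a sketch under stated assumptions, not how VMware stores or indexes its disks: each disk is modelled as a plain dictionary from block number to data, and `read_block`, `base`, `delta_1` and `delta_2` are hypothetical names invented for the illustration.

```python
from typing import Dict, List, Optional, Tuple

# Toy model (assumption): one disk file = a mapping of block number -> data.
Layer = Dict[int, bytes]


def read_block(chain: List[Layer], block: int) -> Tuple[Optional[bytes], int]:
    """Look for a block starting at the newest layer and working back to the base.

    Returns the data found (or None) and the number of layers consulted.
    """
    consulted = 0
    for layer in reversed(chain):  # newest delta first, base disk last
        consulted += 1
        if block in layer:
            return layer[block], consulted
    return None, consulted


# Base disk plus two deltas, one per snapshot taken.
base: Layer = {0: b"boot", 1: b"old data"}
delta_1: Layer = {2: b"written after snapshot 1"}
delta_2: Layer = {3: b"written after snapshot 2"}
chain = [base, delta_1, delta_2]

old_data, old_cost = read_block(chain, 1)  # only present in the base disk
new_data, new_cost = read_block(chain, 3)  # present in the newest delta
print(old_cost, new_cost)
```

In this model, data that has not changed since before the first snapshot is the most expensive to read, because every delta has to be checked before the base disk is reached, while recently written data is found in the first layer.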

Also, if you include the memory state of the VM when taking the snapshot, this again creates a delta file, which refers to the original memory files if needed.

A file is also created that lists all the files created during the snapshot process.

Zapto
4

As far as I can tell, VMware uses copy-on-write logic to implement its snapshots. Therefore, once you create one, every operation done on your VM (e.g. almost everything at runtime) causes a little bit of the VM to be copied, until the whole thing is essentially cloned.

Another performance issue with this is that reads have to cascade to the original copy if the working copy doesn't yet have the data (because nothing has changed to cause a copy).
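A minimal sketch of this copy-on-write behaviour, assuming a block-granular model: the class `SparseChild` and its methods are hypothetical names made up for illustration and do not correspond to any VMware API or on-disk format.

```python
from typing import Dict, List


class SparseChild:
    """Toy copy-on-write child disk layered on top of a read-only base disk."""

    def __init__(self, base: List[bytes]) -> None:
        self.base = base                    # frozen base disk, never modified
        self.blocks: Dict[int, bytes] = {}  # blocks allocated in the child so far

    def write(self, block: int, data: bytes) -> None:
        # All writes land in the child; rewriting a block reuses its slot.
        self.blocks[block] = data

    def read(self, block: int) -> bytes:
        # Serve from the child if the block was written since the snapshot,
        # otherwise cascade to the base disk.
        if block in self.blocks:
            return self.blocks[block]
        return self.base[block]

    def allocated(self) -> int:
        return len(self.blocks)


base_disk = [b"b0", b"b1", b"b2", b"b3"]
child = SparseChild(base_disk)

for i in range(10):                         # rewrite the same block ten times
    child.write(1, b"v%d" % i)
after_rewrites = child.allocated()

for n in range(len(base_disk)):             # eventually touch every block
    child.write(n, b"new")
after_full_write = child.allocated()
print(after_rewrites, after_full_write)
```

Under this assumption the child grows only when a block is written for the first time, so once every block has been touched it is about as large as the base disk, which is the "essentially cloned" end state. Any block not yet written is still read from the base.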

If you want to have the snapshots as a backup but can't tolerate a small performance decrease, consider cloning the VM instead.

Falcon Momot
-2

From "High co-stop (%CSTP) values seen during virtual machine snapshot activities":

As the size and number of snapshots on a virtual machine increase, so does the number of storage command operations within vmkernel. For each storage command issued by the virtual machine guest OS, multiple storage command operations may be necessary to traverse the entire snapshot chain to read the most appropriate block of data.