Network tape restore is faster than disk to disk copy

Question

How can this be?

Running cp or rsync (with -W --inplace) takes two hours for 93 GB; a tape restore over the dedicated backup network takes 41 minutes. The tape restore runs at 50 MB/s; disk to disk was measured and calculated to be 16 MB/s tops, and 2 MB/s if the CPU is busy.
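
As a rough sanity check on those figures, the average throughput implied by the wall-clock times can be worked out directly. This is a minimal sketch, assuming binary units (1 GB = 1024 MB) and that the full 93 GB moved in each case; the function name `average_mb_per_s` is made up for illustration.

```python
def average_mb_per_s(size_gb, minutes):
    """Average throughput in MB/s for size_gb gigabytes moved in the given minutes."""
    # Assumption: 1 GB = 1024 MB (binary units).
    return size_gb * 1024 / (minutes * 60)


tape = average_mb_per_s(93, 41)    # tape restore over the backup network
disk = average_mb_per_s(93, 120)   # cp or rsync between the two volumes

print(f"tape restore: {tape:.1f} MB/s")
print(f"disk to disk: {disk:.1f} MB/s")
print(f"ratio: {tape / disk:.1f}x")
```

Under those assumptions the averages come out near 39 MB/s and 13 MB/s, roughly a threefold gap. Both are somewhat below the peak rates quoted above, which is what one would expect when comparing averages with peaks.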

The restore software is Veritas NetBackup; the disks are on an EMC Symmetrix array over Fibre Channel. The box is an HP rx6600 (Itanium) with 16 GB of RAM running HP-UX 11i v2. All the disks are on one fiber card, listed as:

HP AD194-60001 PCI/PCI-X Fibre Channel 2-port 4Gb FC/2-port 1000B-T Combo Adapter (FC Port 1)

The disks are also all using Veritas Volume Manager (instead of HP LVM).

Update: It occurs to me that this is not just a straight disk-to-disk copy; in reality, it is a snapshot to disk copy. Could reading the snapshot be slowing things down that much? The snapshot is an HP VxFS snapshot (not a vxsnap); perhaps the interaction between the snapshot and VxVM is causing speed degradation?

Update: Using fstyp -v, it appears that the block size (f_bsize) is 8192; the default UNIX block size is 512 (or 8192/16). When testing with dd, I used a block size of 1024k (or 1048576, or 8192*128).
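
One way to test the block-size theory in isolation is to copy the same file with several buffer sizes and compare the throughput. The following is a minimal sketch in Python, not what cp, tar, or NetBackup actually do; the function names `copy_with_block_size` and `compare_block_sizes` are hypothetical, and the default sizes are simply the three values mentioned above (512, 8192, and 1048576 bytes).

```python
import os
import time


def copy_with_block_size(src, dst, block_size):
    """Copy src to dst with fixed-size reads; return (bytes_copied, seconds)."""
    copied = 0
    start = time.monotonic()
    # buffering=0 gives raw file objects, so each read() asks the OS for block_size bytes.
    with open(src, "rb", buffering=0) as fin, open(dst, "wb", buffering=0) as fout:
        while True:
            chunk = fin.read(block_size)
            if not chunk:
                break
            # A raw write may be partial, so loop until the whole chunk is written.
            view = memoryview(chunk)
            while view:
                written = fout.write(view)
                view = view[written:]
            copied += len(chunk)
        # Flush to disk so the timing includes the write, not just the cache.
        os.fsync(fout.fileno())
    return copied, time.monotonic() - start


def compare_block_sizes(src, dst, sizes=(512, 8192, 1048576)):
    """Return {block_size: MB/s} for one copy of src to dst per block size."""
    results = {}
    for size in sizes:
        copied, seconds = copy_with_block_size(src, dst, size)
        results[size] = copied / (1024 * 1024) / max(seconds, 1e-9)
    return results
```

Calling `compare_block_sizes("/path/to/source", "/path/to/target")` with real paths returns one throughput figure per block size. The test file should be much larger than RAM, or the read side should be uncached; otherwise the first run warms the cache and later runs look faster than they are.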

I really wonder if it is the block size. I read over at PerlMonks that the Perl module File::Copy is faster than cp; that is intriguing: I wonder.

If NetBackup is using tar, then it is not using cp: that might explain the speed increase as well.

Update: It appears that reading from the snapshot is almost twice as slow as reading from the actual device. Running cp is slow, as is tar writing to the command line. Using tar is slightly better (when using a file) but is limited to 8 GB files (the file in question is 96 GB or so). Using Perl's File::Copy with a non-snapshot volume seems to be one of the fastest ways to go.

I'm going to try that and will report here what I get.

score 2 · Answer 1 · answered Jul 10 '09 at 11:52

Another question is whether you're I/O bound inside the FC network. Ask the SAN guys to demonstrate (graphs are good) the actual spare bandwidth available (and, if the FC switches are the Cisco ones, how they're avoiding the bandwidth issues inside the switch).

score 1 · Answer 2 · answered Jul 09 '09 at 23:16

Are you limited by reading from, and writing to, the same disk in the array?

crb

Don't think so. Have to talk to the SAN fellas tomorrow. The copy is between two different volumes on two different disk groups. – Mei Jul 09 '09 at 23:33
I had the same thought but didn't post an answer. – 3dinfluence Jul 10 '09 at 00:02
What does a simple dd show for read and write performance? What is the filesystem like for metadata performance (say, /usr/sbin/bonnie++ -d . -s 0 -f -n 8)? – James Jul 10 '09 at 06:24

score 1 · Answer 3 · answered Jul 10 '09 at 13:46

If your tape is also on the SAN, then it's possible that the xfer is being handed off and going straight from tape to disk, while a copy is being required to be passed through the host doing the copy, and is therefore slower.

pjz

Nope: no direct tape-to-SAN. – Mei Jul 10 '09 at 16:00

score 1 · Answer 4 · answered Jul 15 '09 at 03:32

To ensure your test is like for like, try doing the disk copy via tar (NetBackup uses tar to read from tape):

$ tar cf - oldstuff | (cd newdir && tar xf -)

If all of your disks are on the same fibre card, you could theoretically be IO bound on that one card, but I doubt it.

The VxFS snapshot could be adding overhead, especially if the original source file system is busy with writes at the time. VxFS does copy on write, so if the original disk is receiving writes, the snapshot disks will be busy receiving the original disk data.

If the original file system is idle, you can rule out the VxFS being a factor.
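
The copy-on-write cost described above can be pictured with a toy model. This is only an illustrative sketch in Python, not VxFS's actual implementation: the class `SnapshotModel`, its method names, and the cost of two extra I/Os per first write to a block are all assumptions made for the example.

```python
class SnapshotModel:
    """Toy copy-on-write snapshot that counts the extra I/O caused by source writes."""

    def __init__(self, blocks):
        self.source = list(blocks)  # blocks of the live file system
        self.saved = {}             # block index -> original data kept for the snapshot
        self.extra_ios = 0          # I/Os done only to preserve the snapshot

    def write_source(self, index, data):
        # On the first write to a block since the snapshot, save the old contents.
        if index not in self.saved:
            self.saved[index] = self.source[index]
            self.extra_ios += 2     # assumed cost: read old block, write it to snapshot storage
        self.source[index] = data

    def read_snapshot(self, index):
        # Changed blocks come from the saved copy, unchanged ones from the live source.
        return self.saved.get(index, self.source[index])


model = SnapshotModel(["a", "b", "c"])
model.write_source(1, "B")    # first write to block 1: old data is saved first
model.write_source(1, "BB")   # later writes to the same block cost nothing extra
print(model.read_snapshot(1), model.source[1], model.extra_ios)
```

In this model an idle source adds no extra I/O at all, which matches the point that an idle original file system takes the snapshot out of the picture.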

cjs

Turns out there are two fiber cards, and neither is anywhere near pegged. I wonder about the use of tar: might be an idea. – Mei Jul 16 '09 at 21:29
The tar utility uses a block size of 512 by default and a block size of 1 if the output is stdout (as in your example). I think cp may use a block size of 1 (though I don't remember where I saw that). The block size of tar can be adjusted by using the -b option to tar. – Mei Jul 16 '09 at 21:56
If NetBackup uses HP-UX tar, I'd be surprised: found this in the man page for tar: "Because of industry standards and interoperability goals, tar does not support the archival of files of size 8GB or larger [...]" (under WARNINGS). – Mei Jul 16 '09 at 22:14

score -1 · Answer 5 · answered Jul 17 '09 at 01:34

If the disks are connected to different buses on your motherboard, the data may be copied across three or more internal buses, and the latency is killing your I/O for the disk to disk copy. In this case it is entirely possible that the network tape drive has an inherently higher bandwidth path to the target disk than the source disk does.

Jeff Leonard

You didn't read closely enough: both disks are over dual fiber-channel links. – Mei Jul 17 '09 at 17:45
