Why am I seeing such low SMB transfer throughput?



Ok, there's a bit more to the story than the title implies.

Background and Environment: I'm copying several TB from an older Ubuntu server to a newer Windows 2012 server over SMB. (Technically, it's commodity hardware, but they're servers around here.) Everybody is on a gigabit LAN, and the older Ubuntu box has a bonded interface. I believe the Ubuntu server has two Rosewill PCI-e 1x ethernet cards and the Windows server has one reasonably nice PCI Intel ethernet card.

The destination computer (the Windows server) is running a Storage Pool with parity over 4x 2TB drives. It is running Microsoft's new ReFS. The source computer (the Ubuntu server) is running a software RAID mirror. It is running good ol' EXT4.

The two servers are running through a single gigabit switch. I have experimented with breaking the bonding on the source (Ubuntu) computer without any improvement.

Problem: I have no trouble transferring at reasonable speeds from other computers to the Windows server. Other computers can hold 50-80MB/s without much difficulty, but transferring from that Ubuntu server tops out at no more than 20MB/s. 4+TB at 20MB/s takes a long time (something like 2.3 days), and I'm wondering what I can do to figure out where the bottleneck is.
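For reference, the 2.3-day estimate can be reproduced with a quick calculation. This is only a sketch: it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a perfectly steady rate, and the function name is made up for illustration.

```python
def transfer_days(size_tb: float, rate_mb_s: float) -> float:
    """Days needed to move size_tb terabytes at rate_mb_s megabytes per second.

    Assumes decimal units and a constant transfer rate.
    """
    seconds = (size_tb * 1e12) / (rate_mb_s * 1e6)
    return seconds / 86400  # 86400 seconds per day


# Compare the observed 20 MB/s with the 50-80 MB/s other machines reach.
for rate in (20, 50, 80):
    print(f"{rate} MB/s: {transfer_days(4, rate):.1f} days")
```

At the rates the other computers manage, the same 4 TB would take well under a day.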

Symptoms: CPU on both computers is pretty minimal, and certainly not prohibitively busy. Hard drives on both computers are active but not swamped, and CPU IOwait is almost 0% on at least the Ubuntu server.

I did a Wireshark trace for 35 seconds (presumably long enough to make sure all ACKs were for new packets) and noticed that there were quite a few things I didn't expect. (1) There weren't any checksums for the ACKs (and SOME SMB packets) from Windows to Ubuntu. However, Wireshark claims that this may be due to "IP checksum offload." Ok, I have a pretty nice card in there. I suppose it is possible that the network card could do checksum calculations. Fine. Moving on... (2) "TCP ACKed unseen segment." This one I have a problem with. The ACK number is within an acceptable range from what I can tell, and there are often huge blocks of these messages. Perhaps Wireshark is just too slow?

Summary: Transfer speed sucks (20MB/s over gigabit ethernet) and I don't know why. Wireshark claims Windows is ACKing things that were never sent by Ubuntu.

Guesses: My initial guess is that the cheaper Rosewill cards are getting swamped. My second guess is that the software RAID-like layer on one end or the other is getting inundated with work.


Posted 2013-08-20T00:54:58.770

Reputation: 101

What speeds do you get copying from the Ubuntu server to one of the desktops (not Server 2012)? Perhaps WinXP or Win7? I've had big problems with packet signing and encryption with SMB with Server 2008 and up. – Dom – 2013-08-22T06:39:36.943

Update: I ended up having to reboot (thanks to a kernel panic). Unfortunately the system now has a kernel panic on every boot. I whipped out my trusty copy of Knoppix and mounted the drives, and everything is now fine and dandy. Now I'm copying over SSH and I still don't know where the bottleneck is. sshd is eating up 60% of one processor on the Knoppix side. In any case, my transfer is nearing completion.

@Dom: Now that you mention it, I don't recall putting all that data on there much faster than 30MBps in the first place. – Andy – 2013-08-23T00:02:28.297

@LorenzoVonMatterhorn, please avoid using URL shorteners. – Cristian Ciupitu – 2013-10-20T20:39:52.800

Are you sure it is not an issue with your disks? – MariusMatutiae – 2013-11-04T17:44:08.233

Windows implemented a much faster version of the SMB protocol (SMB 2) over the past 4-5 years that is much less chatty and more efficient on the wire. I don't know offhand when those changes rolled into Samba, but it sounds like the older Ubuntu has an older Samba and perhaps the Knoppix has a newer version. – uSlackr – 2013-12-13T14:22:04.090

What kernel version (uname -r) and Samba version are you using? – cybernard – 2014-03-17T17:29:18.660



Your performance gap matches what I have commonly seen when Samba is configured with read and write socket buffer sizes of 1024 bytes. That was the default for a long time; I'm not sure whether it still is.

I used to see this frequently with Linux and Mac machines. Hopefully it's not still the case.

Samba's configuration file (smb.conf) has a "socket options" parameter where you can set the read and write socket buffer sizes. I suggest you set both to 8192 bytes (8 KiB). 4 KiB and 8 KiB often perform similarly, but I haven't tested that on a gigabit link.
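In smb.conf this kind of setting is usually written in the [global] section as "socket options = SO_RCVBUF=8192 SO_SNDBUF=8192"; check the smb.conf man page for your Samba version before relying on it. To see what those two options do at the socket level, here is a minimal Python sketch. The function name is made up for illustration, and the values the operating system reports back may differ from what was requested (Linux, for example, doubles the requested size and enforces a minimum).

```python
import socket


def effective_buffer_sizes(rcvbuf: int = 8192, sndbuf: int = 8192) -> tuple:
    """Request receive/send buffer sizes on a TCP socket and return
    the sizes the operating system actually applied."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # These are the same socket-level options that Samba's
        # "socket options" parameter refers to by name.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
        return (
            s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
            s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
        )


rcv, snd = effective_buffer_sizes()
print(f"receive buffer: {rcv} bytes, send buffer: {snd} bytes")
```

Running this on the Samba host shows how the kernel treats a given request, which helps when comparing a 1024-byte setting with an 8192-byte one.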

Also, don't expect a single TCP connection to benefit from a bonded link. The traffic for one connection will almost always go through just one of the links, because otherwise you end up with a lot of out-of-order packets to deal with. Only expect a load-balancing benefit when servicing multiple clients. Even then, you should look up the different bonding modes, and know that for at least "mode 4" (IEEE 802.3ad) bonding there are basically two transmit hash modes, which determine which slave interface a packet is sent out on: layer-2 hashing (the default) and layer-3 hashing. If you send the bulk of your data via a gateway, the layer-2 hash will not distribute well, because the destination MAC address is always the gateway's. Consider using layer-3 instead.
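The difference between the two hash modes can be illustrated with a simplified sketch. This is not the Linux bonding driver's exact formula: the functions below are hypothetical, loosely modeled on an "XOR the addresses, then take the result modulo the number of slaves" scheme, and all MAC and IP addresses are made-up example values.

```python
import ipaddress


def layer2_slave(src_mac: str, dst_mac: str, n_slaves: int) -> int:
    """Simplified layer-2 policy: XOR the last byte of each MAC address,
    then take the result modulo the number of slave interfaces."""
    src = int(src_mac.split(":")[-1], 16)
    dst = int(dst_mac.split(":")[-1], 16)
    return (src ^ dst) % n_slaves


def layer3_slave(src_ip: str, dst_ip: str, n_slaves: int) -> int:
    """Simplified IP-based policy: XOR the two IPv4 addresses,
    then take the result modulo the number of slave interfaces."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % n_slaves


# Example values only: one server sending to four remote hosts
# that are all reached through the same gateway.
SERVER_MAC = "00:11:22:33:44:01"
GATEWAY_MAC = "00:11:22:33:44:fe"
SERVER_IP = "192.168.1.10"
REMOTE_IPS = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]

# Layer 2 only ever sees the gateway's MAC, so every flow picks the same slave.
l2_choices = [layer2_slave(SERVER_MAC, GATEWAY_MAC, 2) for _ in REMOTE_IPS]
# The IP-based hash sees a different destination per remote host.
l3_choices = [layer3_slave(SERVER_IP, ip, 2) for ip in REMOTE_IPS]

print("layer-2 slave per flow:", l2_choices)
print("layer-3 slave per flow:", l3_choices)
```

In either scheme a single source/destination pair always hashes to the same slave, which is why one SMB copy between two hosts cannot exceed the speed of one link.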

Cameron Kerr

Posted 2013-08-20T00:54:58.770

Reputation: 968


I once had two Ethernet cards in one Ubuntu computer, and for some reason it didn't work properly. They both seemed to vie for the same packets, so sometimes I would get a reply and sometimes I would not, depending on whether the other network card grabbed the packet. It was odd. I must have had it misconfigured somehow, but I would have thought it would have just worked. The cards had unique IP addresses, of course.

Anyway, it would be simple for you to try it with just ONE Ethernet card in the machine connected to the network just to rule that out.


Posted 2013-08-20T00:54:58.770

Reputation: 111