11

We have a pair of new diversely-routed 1Gbps Ethernet links between locations about 200 miles apart. The 'client' is a new, reasonably powerful machine (HP DL380 G6, dual E56xx Xeons, 48GB DDR3, R1 pair of 300GB 10krpm SAS disks, W2K8R2-x64) and the 'server' is a decent enough machine too (HP BL460c G6, dual E55xx Xeons, 72GB, R1 pair of 146GB 10krpm SAS disks, dual-port Emulex 4Gbps FC HBA linked to dual Cisco MDS9509s, then onto a dedicated HP EVA 8400 with 128 x 450GB 15krpm FC disks, RHEL 5.3-x64).

Using SFTP from the client we're only seeing about 40Kbps of throughput with large (>2GB) files. We've performed server-to-'other local server' tests and see around 500Mbps through the local switches (Cat 6509s); we're going to do the same on the client side, but that's a day or so away.

What other testing methods would you use to prove to the link providers that the problem is theirs?

Chopper3
  • I'd also like to know an answer to this one. We get our 100Mbit leased line installed next week sometime :) – Tom O'Connor Apr 16 '10 at 10:51
  • as user37899 says - results would be appreciated. – pQd Apr 16 '10 at 11:16
  • Any updates? I'm curious how this one turns out. – Kyle Brandt Apr 21 '10 at 17:42
  • I beat up the link providers "quite badly" (ironically they're part of the same organisation I work for!) - they've not come back to us yet. – Chopper3 Apr 21 '10 at 18:02
  • Ah okay, and by the way, if you can figure out why I get 7 votes for http://serverfault.com/questions/134467/what-is-the-real-difference-between-a-nas-and-nfs-or-why-pick-a-nas-device-ov/134470#134470 and 1 for this, I would like to know ;-) – Kyle Brandt Apr 21 '10 at 19:12

4 Answers

10

Tuning an Elephant:
This could require tuning, though as pQd says it's probably not the main issue here. This sort of link is known as a "Long Fat Pipe" or elephant (see RFC 1072). Because this is a fat gigabit pipe going over a distance (distance really means time/latency in this case), the TCP receive window needs to be large (see TCP/IP Illustrated Volume 1, TCP Extensions section, for pictures).

To figure out what the receive window needs to be, you calculate the bandwidth-delay product:

Bandwidth * Delay = Product

If there is 10ms latency, this calculator estimates you want a receive window of about 1.2 MBytes. We can do the calculation ourselves with the above formula (1Gbps times 10ms, converted to bytes; integer math, so it runs in any POSIX shell):

echo $(( (1000000000 * 10 / 1000) / 8 ))
1250000

So you might want to run a packet dump to see whether TCP window scaling (the TCP extension that allows windows larger than 64KB) is being negotiated correctly, so you can tune this once you figure out whatever the larger problem is.
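
As a minimal sketch of that check, assuming the WAN-facing interface is eth0 (an assumption) and that the transfer is SFTP on TCP port 22, capture just the connection setup and look for the wscale option in the SYN from both ends:

tcpdump -nni eth0 'tcp[tcpflags] & tcp-syn != 0 and port 22'

If either side's SYN is missing wscale, the connection can never use a window larger than 64KB.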

Window Bound:
If the problem is that you are window-size bound with no scaling in place, then regardless of the pipe size I would expect roughly the following with about 200ms of latency:

Throughput = Receive Window/Round Trip Time

So:

echo $(( 65536/.2 ))  # floating point in $(( )) needs ksh or zsh; bash is integer-only
327680 #Bytes/second

In order to get the results you are seeing you would just need to solve for latency, which would be:

RTT = RWIN/Throughput

So (For 40 kBytes/s):

echo $(( 65536.0/40000.0 ))  # again ksh/zsh for the floating point
1.6384 #Seconds of Latency

(Please check my math; these figures of course don't include protocol/header overhead.)
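
If the window does turn out to be the limit, raising it is a sysctl change on the RHEL server and a netsh check on the W2K8R2 client. A rough sketch, with illustrative values rather than tuned recommendations:

# RHEL: raise the socket buffer ceilings to ~12MB
# (tcp_rmem/tcp_wmem values are min, default and max, in bytes)
sysctl -w net.core.rmem_max=12582912
sysctl -w net.core.wmem_max=12582912
sysctl -w net.ipv4.tcp_rmem="4096 87380 12582912"
sysctl -w net.ipv4.tcp_wmem="4096 65536 12582912"

# W2K8R2: confirm receive-window auto-tuning hasn't been disabled
netsh interface tcp show global
netsh interface tcp set global autotuninglevel=normal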

Kyle Brandt
  • You know I felt a bit guilty for temporarily 'overtaking' you on rep the other week, and the reason is because of how damn good your answers are - and BOOM! you even use a shell to do your maths, not the 1.5MB Mac Calculator.app I do! :) Thank you. – Chopper3 Apr 16 '10 at 13:14
  • You have good answers too, and I like that I have someone I am close to in rep, enhances the game a little :-) Quick google query reminds me that you have answered my questions as well: http://serverfault.com/questions/107263/second-redundant-power-supply-for-a-cisco-3825-router . I just really appreciate the active users trying to make this community 'happen'. But thank you for the compliment! – Kyle Brandt Apr 16 '10 at 13:20
  • Me too, there's nothing I like more than knowing we've helped someone who felt they were on their own with a frustrating problem - apart from cheese of course. That said, I do hate it when we get badly formed questions too. Did you hear my question on SO podcast 82? Got a free SF t-shirt out of it too! – Chopper3 Apr 16 '10 at 13:23
  • I listen to most of the podcasts but missed that one, will go back and check it out (probably this weekend). – Kyle Brandt Apr 16 '10 at 13:26
  • Sorry about that pQd, I have actually always read your nick as PDQ as in PDQ Bach: http://en.wikipedia.org/wiki/P._D._Q._Bach :-) – Kyle Brandt Apr 16 '10 at 14:45
  • @kyle, you're the man! – The Unix Janitor Apr 16 '10 at 14:57
  • @Kyle Brandt - no worries... just phonetically spelled parts of my first/last name, but i tend to capitalize it as in my nick. – pQd Apr 16 '10 at 16:42
6

40kbps is very low [to the point that i would suspect faulty media converters or a duplex mismatch [but you have gigabit, so there is no place for half duplex!] etc]. there must be packet loss or very high jitter involved.

iperf is the first tool that comes to my mind to measure available throughput. run on one side:

iperf -s 

and on the other:

iperf -t 60 -c 10.11.12.13

then you can swap client/server roles, use -d for duplex, etc. run mtr between both machines before the start of the test to see what latency / packet loss you have on the unused link, and how they change during the data transfer.

you would like to see: very small jitter and no packet loss until the link is saturated at ninety-something percent of its capacity.
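
to separate a window-size limit from genuine loss, you can also force a bigger tcp window and parallel streams - a sketch using standard iperf2 flags, with the same example ip as above:

iperf -s -w 1M
iperf -c 10.11.12.13 -t 60 -w 1M -P 4

if four parallel streams together fill the pipe where a single stream cannot, that points at windowing/latency rather than the provider dropping packets.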

iperf is available for *nix and win.

mtr for *nix and win.

pQd
  • We know that the link is made up of 6 1000-base-zx links, so there's bound to be latency introduced by all that repeating, but even so I'm as surprised as you are at how low it is. Great tip on the iperf thing by the way, I'd totally forgotten it existed! – Chopper3 Apr 16 '10 at 11:03
  • please post your results! – The Unix Janitor Apr 16 '10 at 11:05
1

tracepath can show you routing problems between the two sites.

iperf, ttcp and bwping can give you useful information.
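
A quick sketch of the first of those (the target IP is illustrative):

tracepath 10.11.12.13

Watch the per-hop latency and the reported pmtu; a path MTU below 1500 or a sudden latency jump mid-path is something concrete to take back to the provider.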

do you know how this 1Gbps link is being provisioned? are you bridging or routing over this link? what is your SLA for the link? could you be being shaped by your link provider?

if you're only getting 40kb/s, then there is a serious problem. are you sure that it's not a 1Mb/s link rather than a 1Gb/s link? you'll probably find that the speed of the link is not what you think it is :-)

The Unix Janitor
  • Thanks for your answer, it's a dedicated multi-segment bridged single-mode fibre link, there's no shaping at all involved as it's just L2 all the way - oh and I do so hope it's not a 1Mbps link, not with the money it's costing :) – Chopper3 Apr 16 '10 at 11:08
  • if you're bridging to your LAN, i.e. no routing anywhere, then network broadcasts will be wasting link capacity; true, at 1Gb/s it will be a small fraction, but a misbehaving network service could flatten the link. I presume these bridges are out of your control. These switches may be overloaded, or incurring very high latency. High latency means low bandwidth. – The Unix Janitor Apr 16 '10 at 11:17
  • @user37899 - high latency does not have to mean low bandwidth, but requires tuning... anyway - how much latency can you get on 200 miles - if things are ok - no more than 3-10ms. arp [or other] broadcast at the gigabit link is probably very small fraction of the whole available capacity. – pQd Apr 16 '10 at 11:35
  • If you have network broadcasts occurring at such a level as to affect the performance of the link, then I suspect you would have had internal performance problems long before this new line came in and would have noticed as much. – joeqwerty Apr 16 '10 at 11:38
  • @pQd i was actually talking about a broadcast storm. – The Unix Janitor Apr 16 '10 at 11:39
  • @user37899: If it was a one-time broadcast storm then it's an anomaly and subsequent tests of the link would be fine. If it's an ongoing broadcast storm then, again, evidence of it would be apparent internally and would be affecting internal performance. – joeqwerty Apr 16 '10 at 12:50
0

RFC 2544 or ITU-T Y.1564 (EtherSAM)

These are the network tests that carriers run to prove out an SLA; iperf and the like are not verifiable network test methods a carrier will accept.