Solution for noisy network that corrupts downloaded files

1

The network at the company where my wife works has issues. Almost every file bigger than a few hundred kB is corrupted during download, and even web pages are sometimes garbled. We checked everything we could, and it really isn't a problem with her computer (especially because every other computer in the building has the same problem).

All kinds of files are corrupted, including documents, photos, web pages... This happens for files downloaded by any means (various browsers, wget, download managers, ...) except Dropbox. But the worst problem is that she can't install anything on her Ubuntu machine, because the packages are corrupted during download and there's a checksum mismatch every time she tries to use apt-get or even run a downloaded script.

The network admin doesn't seem interested in fixing this, and he doesn't even seem to understand what the problem is or what could be causing it (he insists that it is caused by heavy traffic on the network).

What is strange is that files downloaded by Dropbox are not corrupted. I think maybe it uses a checksum test and downloads again if the test fails.

So we wondered: is there a program that performs some kind of check during the download and downloads the corrupted parts again? Is there any way we can use this noisy network?

There's really no other option, it's the only fast network she has access to... :(

EDIT: I'm even losing ssh connections over this network. Can't manage to stay connected via ssh for more than 30 seconds... :( I get a:

Corrupted MAC on input.
Disconnecting: Packet corrupt

Rafael S. Calsaverini

Posted 2011-06-03T14:27:27.550

Reputation: 173

Reading this, it seems like it's more that the Internet connectivity where your wife works has problems. If the corruption isn't happening on internal network operations, it's not the network. – fencepost – 2011-06-03T14:51:18.943

If it is only internet traffic, the network admin may see this as a feature rather than a bug. – horatio – 2011-06-03T14:56:10.393

“Heavy traffic”? That should not be a problem, otherwise it would be like saying that if one process on a multi-tasking system is busy, then all other processes crash; it’s wrong because all it does is create a queue (for the most part; there are some situations in which a process could crash or a network connection could get dropped, resulting in an incomplete or corrupt file, but that’s when it gets bogged down and has to wait a lot). – Synetech – 2011-06-03T16:03:14.240

The TCP/IP stack should already be handling corrupt packets and causing them to be re-sent. However, since it’s a problem on all the systems, something else is wrong with the network itself. Have your wife do a test to see if she can transfer big files between other computers on the network (i.e. to a local colleague). If that works, then it’s the gateway; if not, then there may be something wrong with the cabling, or some kind of interference or other problem with the network itself. – Synetech – 2011-06-03T16:04:48.390

actually this doesn't belong here (or perhaps serverfault..) – bubu – 2011-06-03T17:56:49.843

Answers

1

TL;DR: Sounds like Internet connection oversaturation. Try off-times, torrents, download managers, anything able to redownload only damaged portions of files. Also some network tips at the end.

I'll spare you most of the scolding for what sounds like your wife making personal use of the network at her office (/me slaps you on the wrist, consider yourself scolded), but if an office is the only available high-speed connection it makes me think you're in an area with generally poor broadband availability.

If that's the case, it's quite possible that the business connection in question is a T-1 line or perhaps bundled T-1s for higher capacity (or even a satellite link). Any of those options generally give performance that is at the low end of modern broadband speeds, and it's quite possible that the external network connection is saturated. Heavily oversaturated connections can easily lose packets, at least as far as the applications expecting them are concerned, because by the time a missed packet is re-requested and delivered the application has given up expecting it.

There's not going to be much that you can do beyond using block-oriented download methods such as BitTorrent (and Dropbox) that do checksumming of each block and which can transfer only mismatched blocks to reduce network traffic. Rsync (possibly tunneled over SSH) is another method to transfer files with that kind of block-level checking. Your best bet (at least for file-based transfers, not much help for email/browsing) may be to get another account outside the network where you can download files and "stage" them either into your Dropbox account or so you can then use rsync for downloads.
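
As a rough illustration of the block-level idea described above, here is a minimal Python sketch. It assumes you already have a trusted list of per-block hashes for the file (BitTorrent ships these in the torrent metadata; how Dropbox does it internally is not something I can vouch for). The names `block_hashes`, `repair`, and the `fetch_block` callback are hypothetical, and SHA-256 is just a convenient choice for the example.

```python
import hashlib


def block_hashes(data, block_size):
    """Return the SHA-256 hex digest of each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def repair(received, expected_hashes, block_size, fetch_block, max_rounds=5):
    """Re-fetch only the blocks whose hash does not match the trusted list.

    fetch_block(index) is a hypothetical callback that downloads one block
    again; it may itself return corrupted data, hence the retry rounds.
    """
    blocks = [received[i:i + block_size]
              for i in range(0, len(received), block_size)]
    # Drop surplus data and pad with empty blocks if the file was truncated.
    blocks = blocks[:len(expected_hashes)]
    blocks += [b""] * (len(expected_hashes) - len(blocks))

    for attempt in range(max_rounds + 1):
        bad = [i for i, blk in enumerate(blocks)
               if hashlib.sha256(blk).hexdigest() != expected_hashes[i]]
        if not bad:
            return b"".join(blocks)
        if attempt == max_rounds:
            break
        for i in bad:
            blocks[i] = fetch_block(i)
    raise IOError("some blocks are still corrupted after retrying")
```

The point of the design is that one flipped byte costs one block of extra traffic rather than a whole new download, which is why this approach tolerates a bad link so much better than a plain HTTP fetch.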

If you can get the network manager interested (assuming he hasn't already done some of these things) there may be ways to reduce the demand on the network connection such as implementing a transparent caching Squid proxy, banning/blocking streaming & torrents, and adding "greylisting" or an email spam filtering system that includes it between your mail server and the Internet (assuming that email is handled in-house).

Edit: Squid proxies and greylisting can actually be surprisingly simple to set up, particularly if you're using VMware. Prebuilt VMs with Squid and filtering are available, and for greylisting ESVA (www.global-domination.org) is a decent choice, if somewhat neglected these days.

fencepost

Posted 2011-06-03T14:27:27.550

Reputation: 1 086

As a quick note, QoS (Quality of Service) management to prioritize traffic based on protocol will probably not make much difference - if the network connection is oversaturated, the QoS would be managed on the business's end of the connection, but the saturation is happening on the carrier side. It's like pushing the output of a garden hose through a straw - it doesn't matter what kind of filtering/QoS you do at the output end of the straw, that's not the bottleneck. – fencepost – 2011-06-03T15:43:40.177

The OP already said they tried wget, download-managers, etc. If resuming is not working, then it is indeed that the file is getting corrupted. In the past I would have tried cutting off some of the file at the end and resuming from there, but then I discovered that the corruption could have happened anywhere in the file, even relatively early, which makes resuming completely useless. Your suggestion of P2P is good for things that can be downloaded like that due to it being cut into pieces and each one hashed, but it won’t help with single photos, web pages, etc. like they said. – Synetech – 2011-06-03T15:59:53.333

0

TCP has a checksum built in. If a segment arrives corrupted, the receiver discards it and the sender retransmits it, so the data should normally get through intact.
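
For what it's worth, the checksum mentioned above is only a 16-bit one's-complement sum, so some corruption patterns pass it unnoticed. The following Python sketch illustrates that property; it is not a claim about what is actually happening on this network. The function name `internet_checksum` is my own, and the real TCP checksum also covers a pseudo-header and the TCP header, which this sketch leaves out.

```python
def internet_checksum(data):
    """16-bit one's-complement checksum in the style of RFC 1071.

    Simplified: covers only the bytes given, not the TCP pseudo-header.
    """
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\x56\x78"
bit_flip = b"\x12\x35\x56\x78"      # one bit changed
reordered = b"\x56\x78\x12\x34"     # two 16-bit words swapped

# A single flipped bit changes the checksum, so it is detected.
detected = internet_checksum(bit_flip) != internet_checksum(original)
# Swapped words add up to the same sum, so the change goes undetected.
missed = internet_checksum(reordered) == internet_checksum(original)
```

This is one reason application-level checks (the hashes used by apt, or the MAC that SSH complains about) can still catch corruption that TCP has accepted.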

We used to have an oversaturated Internet line all the time, but the underlying hardware always worked flawlessly.

It is hard to pinpoint what is wrong without seeing how everything is connected and set up.

Change the job :-P

David Hajes

Posted 2011-06-03T14:27:27.550

Reputation: 25

0

My guess is that some kind of file inspection is causing the corruption. Maybe there is an anti-malware gateway on the network, and perhaps this service cannot inspect SSL sessions (which would explain Dropbox's success).

I do not know whether apt-get does or can use SSL connections, but it's worth a look.
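
One way to test this hypothesis might be to fetch the same file over plain HTTP and over HTTPS and compare hashes. Below is a minimal Python sketch of that idea. The function names and the `opener` parameter are my own, it assumes the server offers the file under both schemes, and it assumes the content is static (a dynamically generated page can differ between two requests without any corruption).

```python
import hashlib
import urllib.request


def fetch_sha256(url, opener=urllib.request.urlopen, chunk_size=65536):
    """Download url and return the SHA-256 hex digest of the body."""
    digest = hashlib.sha256()
    with opener(url) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def compare_schemes(host_and_path, opener=urllib.request.urlopen):
    """Fetch the same resource over http and https and compare digests.

    Returns (http_digest, https_digest, digests_match).
    """
    plain = fetch_sha256("http://" + host_and_path, opener)
    secure = fetch_sha256("https://" + host_and_path, opener)
    return plain, secure, plain == secure
```

If the HTTPS digest consistently matches a checksum published by the site while the HTTP one does not, that would point toward something tampering with unencrypted traffic rather than random line noise. The `opener` argument exists only so the function can be exercised without a network.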

uSlackr

Posted 2011-06-03T14:27:27.550

Reputation: 8 755

Is there a way to use wget or a similar tool with SSL? If the file I'm downloading is not located on a server where I must log in through SSL, how can I "disguise" the download session as an SSL session to test this hypothesis? – Rafael S. Calsaverini – 2011-06-03T14:44:27.257

wget will use SSL if the URL is https and the other end supports it. It's always worth a try. Many sites configure SSL but don't require it. Do https web sites display OK? – uSlackr – 2011-06-03T14:55:02.067

0

You could use tcpdump or Wireshark or some other package to trace the network traffic.
(Note: 3rd party compiled package needed for Wireshark on Ubuntu).

This could help you figure out what is going on.

harrymc

Posted 2011-06-03T14:27:27.550

Reputation: 306 093