random internet requests failing; advice requested on pinpointing the exact cause


I've got a weird problem with our internet connection. At random times requests just fail, usually with an error message along the lines of "no response". Repeating the request usually produces a correct answer, though.

This happens on every computer on the local network. The computers run a variety of operating systems, mostly Windows. I'm on Linux (Ubuntu 13.04).

Pings to devices on the local network and to the "first hop outside" just keep on going without any problem.

When I run "wget" on a URL, it works most of the time, but in some cases it fails with the error below (note: the example has been anonymized):

$ wget -v http://www.example.com/ --debug
DEBUG output created by Wget 1.14 on linux-gnu.

URI encoding = ‘UTF-8’
--2013-08-08 15:25:11--  http://www.example.com/
Resolving www.example.com (www.example.com)... ext.ern.al.ip
Caching www.example.com => ext.ern.al.ip
Connecting to www.example.com (www.example.com)|ext.ern.al.ip|:80... connected.
Created socket 3.
Releasing 0x00000000007777c0 (new refcount 1).

---request begin---
GET / HTTP/1.1
User-Agent: Wget/1.14 (linux-gnu)
Accept: */*
Host: www.example.com
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... No data received.
Closed fd 3
Retrying.

--2013-08-08 15:25:12--  (try: 2)  http://www.example.com/
Found www.example.com in host_name_addresses_map (0x7777c0)
Connecting to www.example.com (www.example.com)|ext.ern.al.ip|:80... connected.
Created socket 3.
Releasing 0x00000000007777c0 (new refcount 1).

---request begin---
GET / HTTP/1.1
User-Agent: Wget/1.14 (linux-gnu)
Accept: */*
Host: www.example.com
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... No data received.
Closed fd 3
Retrying.

--2013-08-08 15:25:14--  (try: 3)  http://www.example.com/
Found www.example.com in host_name_addresses_map (0x7777c0)
Connecting to www.example.com (www.example.com)|ext.ern.al.ip|:80... connected.
Created socket 3.
Releasing 0x00000000007777c0 (new refcount 1).

---request begin---
GET / HTTP/1.1
User-Agent: Wget/1.14 (linux-gnu)
Accept: */*
Host: www.example.com
Connection: Keep-Alive

---request end---
HTTP request sent, awaiting response... 
---response begin---
HTTP/1.1 302 Moved Temporarily

This can happen once, several times, or not at all. Repeated calls to the same URL usually keep succeeding after an initial failure. This also happens in "normal" browsers, but they don't tend to give much information on what is happening.
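To put a number on how often a request fails, a loop along the following lines could be used. This is only a minimal sketch: `count_failures` is a hypothetical helper name, and the attempt count, timeout, and URL in the usage note are arbitrary assumptions rather than values from the log above.

```shell
#!/bin/sh
# Run a probe command N times and report how many attempts failed.
# Usage: count_failures N command [args...]
# (count_failures is a hypothetical helper, not an existing tool.)
count_failures() {
    attempts=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # The probe's own output is discarded; only its exit status matters.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
            # Timestamp each failure so it can be matched against device logs.
            echo "$(date '+%Y-%m-%d %H:%M:%S') attempt $((i + 1)) failed" >&2
        fi
        i=$((i + 1))
    done
    echo "$failures/$attempts failed"
}
```

It could be called as `count_failures 100 wget -q --tries=1 --timeout=10 -O /dev/null http://www.example.com/`, where `--tries=1` stops wget from hiding a failure behind its automatic retry. Running the same probe from different points in the network (for example, plugged directly into the router) might help narrow down which device is involved.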

The problem is erratic and annoying, but fortunately not bad enough to stop work from getting done. We do want to fix it, though.

I suspect a local network device (router/switch) is the problem, but it is a bit hard to find out which one. Users are continuously using the network, so extensive testing by temporarily removing or replacing hardware is problematic.

Are there any Linux tools to figure out which device (or the internet service provider) is causing the problem? I'm running Ubuntu 13.04.

user244238

Posted 2013-08-08T13:01:02.077

Reputation: 101

Try doing a continuous traceroute to the first hop outside of your network. BTW, do all URLs fail or just a specific one? – dbasnett – 2013-08-08T13:29:41.613

Is there a simple way to do a continuous traceroute under Linux (other than watch)? Also, this happens for all URLs; sometimes a webpage loads but its CSS and JS don't (though those might be on different domains). – user244238 – 2013-08-08T13:42:06.997

Watch seems like the easiest way. – dbasnett – 2013-08-08T14:39:09.677
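As a rough illustration of the continuous traceroute discussed in these comments, the loop below repeats a traceroute and keeps a timestamped log, which `watch` alone does not do. It is a sketch under assumptions: `stamp` and `trace_loop` are hypothetical helper names, and the target, interval, and log file are whatever the caller supplies.

```shell
#!/bin/sh
# Prefix each line read from stdin with the current date and time.
stamp() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
    done
}

# Repeat a traceroute until interrupted, appending timestamped output to a log.
# Arguments: target host, seconds between runs, log file.
trace_loop() {
    target=$1
    interval=$2
    logfile=$3
    while true; do
        # -n skips reverse DNS lookups so each run finishes faster.
        traceroute -n "$target" 2>&1 | stamp >> "$logfile"
        sleep "$interval"
    done
}
```

A call such as `trace_loop ext.ern.al.ip 5 trace.log` (with the real address substituted) runs until interrupted with Ctrl-C. If `mtr` is installed, `mtr --report --report-cycles 100 <host>` is another option that summarizes per-hop packet loss over many cycles.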

No answers