Connection fails after one packet, server is unreachable by some clients while reachable by others


I am clueless about this issue. I have a machine running Windows Server 2003 R2. From time to time the following issue arises:

  • The server is accessible from certain clients
  • From other clients (on the same network) it is unreachable. When I ping it, only the first packet gets a reply; the rest time out. If I keep pinging, no further packet gets a reply. However, if I try the ping again a few minutes later, the first packet gets a reply again, then silence.
  • All the while the server is readily accessible from some clients
  • The same "only one ping reply" behaviour occurs when I ping from the server: some machines can be reached, but some cannot.

It seems totally random which connections work and which do not. It looks like some kind of routing issue, but I have already tried the following, without any luck:

  • I changed the UTP cables
  • I plugged the server into a different port on the switch
  • I reinstalled the NIC driver on the server
  • I switched off the firewalls on the clients

Sometimes it helps if I disable then enable the NIC on the server, sometimes it doesn't help.

The strangest thing is that this issue is recurring: it arises, then a few days later it disappears, then it arises again after a few weeks.

With the help of the suggestions below, I traced the issue to a colleague's Android phone. Now I only want to know how the packets got sent to the phone's MAC address when the phone's IP was assigned by DHCP and was different from the server's.

Moha

Posted 2014-01-15T09:17:26.120

Reputation: 141

Could you have a duplicate IP address on the network? – Paul – 2014-01-15T09:35:35.660

No, I have only one DHCP server (a router) that hands out IPs from 192.168.1.101 to .199, two servers on .100 and .200, and a bunch of printers with static IPs from .50 to .60. No IP collision. – Moha – 2014-01-15T10:01:55.743

If it were my issue I'd be running Wireshark on the server to see whether the pings get there or not. Does the ping also fail if you do it from the server to the client that fails? – Paul – 2014-01-15T11:22:55.973

Tried Wireshark. When that first ping gets a reply, Wireshark also displays the echo request and reply. For the subsequent packets, the echo request doesn't even reach the server (Wireshark shows nothing). The same happens if I try to ping a client from the server, except that the requests are sent out but get no reply. – Moha – 2014-01-15T12:30:23.043

Ok, try that from the client side as well, to make sure the client is actually sending. Is there anything in the path between the client and server or are they on the same network? – Paul – 2014-01-15T13:17:34.757

Tried that. Echo requests are sent out, but only the first one gets a reply. The only odd thing is that the header checksum is wrong. Wireshark says: Header checksum: 0x0000 [incorrect, should be 0x9903 (may be caused by "IP checksum offload"?)] But this checksum is also wrong for the first packet (which gets a reply). Also, it's not just one client. Sometimes client A works but client B does not, sometimes vice versa. It must be some issue on the server, as there's no such error when the connection doesn't involve this server. All clients are on the same network; there's only a 24-port switch between them. – Moha – 2014-01-15T14:04:43.533

The last thing to check: in Wireshark on the client, verify that the successful and unsuccessful packets are being sent to the same MAC address (in the Ethernet section) and that this MAC address matches the NIC's MAC address. If they are, it pretty much points to a faulty NIC on the server. – Paul – 2014-01-15T21:02:28.880

Strangely enough, the issue resolved itself. Now everything is working fine and every client can access the server. But I'm sure this issue will return, as it has several times before. My next move will be a new NIC; I hope it will eliminate the problem entirely. – Moha – 2014-01-16T12:26:41.827

I think I'm now closer to finding the culprit. Indeed, the first ICMP packet that gets a reply is sent to the correct MAC, but the others are sent to a different MAC that, according to Wireshark, belongs to a Samsung device. How can I find the device that this MAC belongs to? – Moha – 2014-01-16T14:22:52.810

This is definitely a duplicate IP address issue then. You will need to get into the switch and see which port has learned that MAC address, then trace the cable coming out of that port to see what device it is. – Paul – 2014-01-16T23:17:43.553
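
The duplicate-address symptom described in this thread (one IP address answered for by two different MAC addresses) can also be spotted from the captures themselves. The following is a rough sketch, assuming you have already collected (IP, MAC) pairs by hand from Wireshark or from the ARP tables of several clients. The addresses and function names below are hypothetical placeholders, not values from this network.

```python
from collections import defaultdict


def normalize_mac(mac: str) -> str:
    """Return a MAC address as lowercase, colon-separated hex pairs."""
    hex_digits = mac.replace("-", "").replace(":", "").replace(".", "").lower()
    if len(hex_digits) != 12 or any(c not in "0123456789abcdef" for c in hex_digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(hex_digits[i:i + 2] for i in range(0, 12, 2))


def find_ip_conflicts(observations):
    """Map each IP seen with more than one MAC to its sorted list of MACs."""
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        macs_by_ip[ip].add(normalize_mac(mac))
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Hypothetical observations: the same IP shows up behind two different MACs.
observations = [
    ("192.168.1.100", "00-11-22-33-44-55"),  # first ping: expected NIC
    ("192.168.1.100", "AA:BB:CC:DD:EE:FF"),  # later pings: unexpected device
    ("192.168.1.100", "00:11:22:33:44:55"),  # same NIC, different notation
    ("192.168.1.150", "00:11:22:33:44:66"),  # unrelated client, no conflict
]

conflicts = find_ip_conflicts(observations)
for ip, macs in conflicts.items():
    print(f"{ip} is claimed by {len(macs)} MACs: {', '.join(macs)}")
```

Any IP reported here is worth chasing through the switch's MAC address table as described above. The sketch only flags the conflict; it does not tell you which of the devices is the legitimate owner.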

No answers