We look after a series of restaurants, each of which has a collection of point of sale terminals and typically a single workstation per site.

We have one site that is causing a fair amount of headache, and no amount of Google searching is answering the question.

At one particular site the network keeps crashing. By crashing I mean that all network devices stop responding: you can't see anything on the network and you cannot get on to the internet. This can happen multiple times a day, or the network can go for days working fine. Once the issue happens they can no longer process credit card or Interac transactions, which is the primary reason they would like to see it solved.

To fix the issue the staff unplug all the ports from the switch and power cycle the switch and router. (I am unclear whether this was a result of the power cycling, but yesterday we replaced a switch that had quite literally burnt up and had also taken out the router.)

After much checking we have narrowed the issue down to the point of sale terminals: if we pull the power on the terminals and then power them back up, that also seems to fix the issue.

The network is nothing fancy: a basic switch and lines running out to the registers and the point of sale terminals.

We have explored the idea of putting in a managed switch and using spanning tree, storm control, etc., on the basis that it could be a broadcast storm of some sort. I don't want to go buy a bunch of more expensive switches if it won't solve the issue, although the client is not against it. Alternatively, maybe there is a setting on the POS terminals that needs to be turned off, but we have been in contact with the provider with no success. Further to that, all of the other stores are working fine.

We have not run Wireshark or anything similar on the network yet; that could be a next step. But if someone has experienced this before or has an idea, it might save me some time.
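
If it does come to a capture, a first pass could be as simple as counting broadcast frames per source MAC address to see whether one terminal stands out. A minimal Python sketch, assuming the capture has already been exported to (timestamp, source MAC, destination MAC) records; the function names, the threshold and the sample data are all hypothetical:

```python
from collections import Counter

# Ethernet broadcast destination address.
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def broadcast_counts(frames):
    """Count broadcast frames per source MAC.

    frames: iterable of (timestamp_seconds, src_mac, dst_mac) tuples.
    """
    counts = Counter()
    for _timestamp, src, dst in frames:
        if dst.lower() == BROADCAST_MAC:
            counts[src.lower()] += 1
    return counts


def top_talkers(frames, threshold):
    """Return (mac, count) pairs with at least `threshold` broadcasts, busiest first."""
    return [(mac, n) for mac, n in broadcast_counts(frames).most_common() if n >= threshold]


# Hypothetical sample records, not taken from a real capture.
frames = [
    (0.00, "AA:BB:CC:00:00:01", "ff:ff:ff:ff:ff:ff"),
    (0.01, "AA:BB:CC:00:00:01", "FF:FF:FF:FF:FF:FF"),
    (0.02, "AA:BB:CC:00:00:02", "aa:bb:cc:00:00:09"),
    (0.03, "AA:BB:CC:00:00:01", "ff:ff:ff:ff:ff:ff"),
    (0.04, "AA:BB:CC:00:00:03", "ff:ff:ff:ff:ff:ff"),
]

print(top_talkers(frames, threshold=2))
```

This only looks at broadcast traffic, so it would not reveal a device flooding the switch with unicast frames.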

Thanks!

  • `But, if someone had experienced this before or has an idea` Put managed switches in there so you can actually monitor what's going on and fix it. It's really as simple as that. – HopelessN00b Apr 27 '16 at 16:46
  • Yes, that is an obvious solution, but I am also trying to dig deeper here and find out whether anyone has experienced what might be causing this issue. There are multiple sites, and if it starts to happen elsewhere I would like to understand the problem and the solution. So far I have not found any results searching for this kind of error. – Kris Brunsgaard Apr 27 '16 at 16:51
  • Well, digging deeper requires (at a minimum) tools for digging and/or more information. So get yourself some tools and use them to get more information. All you have now is that the network goes down, and a correlation of that event with a business-critical network-attached device. That barely qualifies as information, and is certainly insufficient to dig deeper with or determine a cause from. Seeing as you had an incident where an unmanaged switch literally crashed and burned, and took a router out with it, replacing your switches with something more suitable is a good starting point. – HopelessN00b Apr 27 '16 at 16:57
  • It's important to note, if you literally only have 1 switch, then by definition this cannot be a broadcast storm. A broadcast storm requires switches to keep forwarding broadcast packets. With only one switch (and a router which is likely just the gateway) there is no loop for the broadcast packets to follow. Of course, this assumes the router is just the gateway put in by the ISP. – Naryna Apr 27 '16 at 17:05
  • Do you know if the issue is caused by one or multiple POS terminals? Do you have these POS terminal models in other sites/subnets? Is there any access to the OS of the terminals? – Jimmy Lem Apr 27 '16 at 16:42
  • At this point we have to power off all of the terminals. We have not narrowed it down to a single terminal. These terminals are a standard issue and do exist in other sites. The networks are site specific and all on the default subnets currently. The only access we have to the terminals is the interface on the terminals as they are locked by the provider. – Kris Brunsgaard Apr 27 '16 at 16:48
  • @BrandynBaryski: I'd probably beg to differ with you regarding your definition of a broadcast storm. A malware-infected (or otherwise misbehaving) device is perfectly capable of generating enough network traffic to cause a broadcast storm. I wouldn't rule that out simply because there's only a single switch in the environment. – joeqwerty Apr 27 '16 at 17:14
  • @joeqwerty It really comes down to the different definitions then. From a networking perspective a broadcast storm happens when broadcast packets continue looping through the network. This is what STP prevents. STP is not going to prevent a malware-infected machine from sending out hundreds of thousands of broadcast packets in a one-switch topology. Of course, there's nothing saying that it can't flood the switch with unicast packets (in which case it really isn't a broadcast storm). There simply is no loop for STP to stop. Storm control, however, can work. – Naryna Apr 27 '16 at 17:20
  • @BrandynBaryski: I gotcha. I'm using the term broadcast storm a little more broadly. At any rate, I wouldn't rule out a broadcast or flooding issue in this scenario. – joeqwerty Apr 27 '16 at 17:38
  • @KrisBrunsgaard I'm actually a little more interested in what you mean by all network devices stop responding. You said there is at least one standard workstation. Does it freeze completely, or just lose network connectivity? Can you open Task Manager on it and see utilization (especially network) at this time? If it does freeze completely, can you unplug the network cable and see if it works normally? This is a cheap and dirty way of ruling out something sending out countless broadcast packets. It does not, however, rule out unicast packet flooding. – Naryna Apr 27 '16 at 17:47
  • @BrandynBaryski From what I understand it is the network connection that is lost. However I have shown up to fix it while it was still happening and was able to continue to use the computer. – Kris Brunsgaard Apr 27 '16 at 18:29

1 Answer

These are just Point Of Sale machines and should not require any fancy or expensive networking.

I suggest you go over the following areas; you might have missed something.

  1. Network Switches and Routers - a switch can go bad. Some switches develop bad ports over time, which can cause packets to be dropped or corrupted. Routers can also develop problems after a certain period of use. Devices in that state are due for replacement, but you've done this already.
  2. IP Address Allocation - be sure all machines are properly getting IPs. IP conflicts can also leave machines unable to see each other or make any network connection. Your DHCP pool might be full or the DHCP server might not be working properly. Or, if you're using static IPs, you might have used the same IP on more than one machine.
  3. Isolated Test - do local testing first. Machines should see each other on the local network. If they all can, then your local network is fine and the problem is more likely with your internet gateway. Run pings directly from your internet gateway to see if you're getting packet loss.
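
The IP conflict check in item 2 can be roughly automated once you have a list of which MAC address answered for which IP (for example, copied by hand from ARP tables on the workstation or router). A minimal Python sketch, assuming that list is already available as (IP, MAC) pairs; the function name and the sample addresses are hypothetical:

```python
from collections import defaultdict


def find_ip_conflicts(observations):
    """Return {ip: sorted list of MACs} for every IP claimed by more than one MAC.

    observations: iterable of (ip_address, mac_address) string pairs.
    """
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        # Normalise case so the same MAC written two ways is not counted twice.
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Hypothetical sample data, not taken from the site in question.
observations = [
    ("192.168.1.10", "AA:BB:CC:00:00:01"),
    ("192.168.1.11", "AA:BB:CC:00:00:02"),
    ("192.168.1.10", "AA:BB:CC:00:00:03"),
]

print(find_ip_conflicts(observations))
```

An empty result only means no conflict was visible in the data you collected; a device that was powered off at the time would not show up.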

Best of luck!

jarvis
  • Thanks for your reply. Yes, I have replaced the unit, and it had actually been replaced once before as well. I will look into the static IP vs DHCP and conflicts idea. However, given that the system randomly needs rebooting and then works for a while without issue, I am not sure how that would affect it unless something else is cycling through IP addresses. – Kris Brunsgaard Apr 28 '16 at 23:16