1

I am after some advice regarding a problem I am having with a Linux-based piece of software that balances traffic between two servers.

Basically, we have our production website and a backup system at a remote site. Production is mirrored to the backup constantly to keep the two in sync. Our domain name points at an Ubuntu 9.04 server (a clean install with nothing on it apart from the load-balancing software), which is running the latest version of Crossroads (aka XR).

XR is set up to hand all connections to the live webserver until it loses its "heartbeat" connection with that server. Once that happens, it bounces the connections to our backup system.
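
To make the intended behaviour concrete, here is a minimal Python sketch of a "first available" failover decision. It is not XR's code; `first_available` and the `is_alive` callback are hypothetical names used only for illustration, and the heartbeat itself is stubbed out as a function you pass in.

```python
from typing import Callable, Optional, Sequence


def first_available(
    backends: Sequence[str],
    is_alive: Callable[[str], bool],
) -> Optional[str]:
    """Return the first backend whose heartbeat check passes.

    Backends are listed in priority order, so the live server comes
    first and the backup is only chosen when the live one is down.
    """
    for backend in backends:
        if is_alive(backend):
            return backend
    return None  # nothing is reachable


# Example: the live server has lost its heartbeat, so the backup is picked.
chosen = first_available(["live", "backup"], lambda name: name != "live")
```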

The problem manifests itself as a lack of response from our webserver. The client passes a correctly formed XML message to a .NET web service, the service does some calculations and changes to the data and then replies with an XML response, but the client never seems to get the response.

I have been using Wireshark to investigate this problem, and it appears as though the connection gets cut off or dropped halfway through the response (I am not really sure, due to my lack of experience with Wireshark).

I have been speaking with the authors of the XR software and they cannot find any reasons or problems in the software itself that could explain this behaviour. They believe it may be something to do with the Linux distro I am using, or a kernel issue.

Can anyone help me resolve this issue? We are due to take this system live in the next few weeks and this problem is holding us back.

I have now changed over from Ubuntu to CentOS 4 and tried again. Now I am getting inconsistent replies when I watch the traffic in Wireshark: sometimes I get a fully formed XML reply from the server, and on the next try I might get only a partial reply before the Linux box sends a RST packet.
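
For anyone trying to reproduce this without reading packet captures, a rough Python sketch like the one below can flag truncated replies. It assumes the response carries a `Content-Length` header (not chunked encoding) and that the request includes `Connection: close` so the server closes the socket when it is done. The function names are made up for this example.

```python
import socket


def declared_length(head: bytes):
    """Return the Content-Length value from a raw header block, or None."""
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            return int(value.strip())
    return None


def is_complete(raw: bytes) -> bool:
    """True if the body is at least as long as Content-Length promises."""
    head, _, body = raw.partition(b"\r\n\r\n")
    expected = declared_length(head)
    return expected is not None and len(body) >= expected


def fetch_raw(host: str, port: int, request: bytes, timeout: float = 30.0):
    """Send one request, read until close. Returns (bytes, was_reset)."""
    chunks = []
    was_reset = False
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break  # orderly close by the peer
                chunks.append(data)
        except ConnectionResetError:
            was_reset = True  # the peer sent a RST mid-transfer
    return b"".join(chunks), was_reset
```

Running `fetch_raw` in a loop against the load balancer and then directly against the webserver, and counting how often `is_complete` is false or `was_reset` is true, should show whether the truncation only happens when going through the balancer.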

Murali Suriar
Kristiaan
  • I think that jumping to the conclusion that you have a kernel issue or a network transport bug, without having proved that the problem is not just a misconfiguration, is wrong. Try looking at the configuration of XR, checking DNS and routing, serving straight HTML from the server, then static XML. See if any of these fail. Enable verbose logging on the load balancer and in your application. – pauliephonic Jun 22 '09 at 09:14

2 Answers

0

Probably not the most useful answer, but have you tried another load balancer? Only suggesting this as you've had no response in 6+ days now :-)

Something like HAProxy (http://haproxy.1wt.eu) is pretty good and doesn't require a huge amount of setup to perform most tasks. Here is a sample site config:

listen corporate_web_live
    bind 1.2.3.4:80  # www.site.com
    bind 1.2.3.5:80  # www.site.net
    option httpchk HEAD /server.txt HTTP/1.0
    cookie HAPSRV insert postonly indirect
    server webapp-corp-1 10.0.0.1:80 weight 50 maxconn 150 slowstart 30s cookie WAC1 check
    server webapp-corp-2 10.0.0.2:80 weight 50 maxconn 150 slowstart 30s cookie WAC2 check
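
As a rough illustration of what the `option httpchk` line asks for, the Python sketch below sends a `HEAD` request for a path and treats a 2xx or 3xx status as healthy, which is how HAProxy judges HTTP checks by default. This is not HAProxy's implementation: `http_check` and `is_healthy` are hypothetical helpers, and `http.client` speaks HTTP/1.1 rather than the HTTP/1.0 used in the config.

```python
import http.client


def is_healthy(status: int) -> bool:
    """2xx and 3xx responses count as a passing check."""
    return 200 <= status < 400


def http_check(host: str, port: int, path: str = "/server.txt",
               timeout: float = 5.0) -> bool:
    """Send HEAD <path> to the backend and report whether it looks alive."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return is_healthy(conn.getresponse().status)
    except (OSError, http.client.HTTPException):
        return False  # refused, timed out, or malformed reply
    finally:
        conn.close()
```

Pointing `http_check` at each backend by hand (for example `http_check("10.0.0.1", 80)`) is a quick way to confirm the check file is actually being served before relying on the balancer to mark servers up or down.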

It is written by Willy Tarreau, who maintains the Linux 2.4 kernel tree, and it has been consistently benchmarked handling 10Gbps+ of throughput on relatively commonplace hardware. Handling SSL termination with Apache + mod_ssl or stunnel is also possible.

When handling HTTP traffic it can also do some very funky Layer 7 stuff, and it supports balancing other protocols too. The SMTP support is quite useful to me!

nixgeek
0

OK, this turned out to have nothing at all to do with the OS. It was an issue with the software (XR, aka Crossroads) I was using. I entered some timeout values into the XML config file (thanks to suggestions from the software's author) and this seems to have resolved the problem.

Here is an example of the entries you need to add to the config file; they are the lines between the # symbols. The timeout given is a little excessive (1 minute).

<service>
    <name>web_http</name>
    <server>
      <address>x.x.x.x:80</address>
      <type>tcp</type>
      <dispatchmode>first-available</dispatchmode>

#

      <clienttimeout>60:60</clienttimeout>
      <backendtimeout>60:60</backendtimeout>

#

    </server>
    <backend>
      <address>x.x.x.x:80</address>
    </backend>
    <backend>
      <address>x.x.x.x:80</address>
    </backend>
    <backend>
      <address>x.x.x.x:80</address>
    </backend>
</service>
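
If you want to double-check that the timeout entries actually made it into a service block, a small Python sketch along these lines can read them back. It assumes the `#` marker lines have been removed (they are only there to highlight the new entries and would not parse as XML), and `timeouts` is just a helper name chosen for this example. The tag names are the ones from the config above.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the service block above, without the "#" markers.
SAMPLE = """<service>
  <name>web_http</name>
  <server>
    <address>x.x.x.x:80</address>
    <type>tcp</type>
    <dispatchmode>first-available</dispatchmode>
    <clienttimeout>60:60</clienttimeout>
    <backendtimeout>60:60</backendtimeout>
  </server>
  <backend>
    <address>x.x.x.x:80</address>
  </backend>
</service>"""


def timeouts(service_xml: str) -> dict:
    """Return the timeout entries of a <service> block (None if missing)."""
    server = ET.fromstring(service_xml).find("server")
    tags = ("clienttimeout", "backendtimeout")
    return {tag: server.findtext(tag) for tag in tags}


found = timeouts(SAMPLE)
```

A value of `None` for either key means the entry is missing from that service, which was the situation before the fix.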
Kristiaan