
I am running memcached on my server, and when it hits 600+ req/s it becomes unstable and causes a lot of problems. When the request rate gets that high, my PHP applications are intermittently unable to connect to the memcache server. This causes slow load times, which makes nginx and php-fpm freak out, and I get a bunch of 104: Connection reset by peer errors in my nginx logs.

I would like to point out that my memcache server has 'hot objects': objects that at times receive 90% of the memcache requests. I also noticed that when so many requests hit a single object, it adds a little load time to the overall page (when it manages to load).

I would greatly appreciate any help with this problem. Thanks so much!

Aco
  • what is your -c flag set to in memcache? – Mike Jul 21 '11 at 05:52
  • 100000, more than enough? :) – Aco Jul 21 '11 at 07:09
  • Is memcache on the same box and connecting via UNIX socket? Could you drop phpMemcacheAdmin (http://code.google.com/p/phpmemcacheadmin/) on your box to see patterns? Also, for your hot objects, since you're using php-fpm, could you use APC caching for those instances? – Eric Caron Jul 21 '11 at 16:27
  • It is connecting locally to 127.0.0.1:11211. I installed phpMemcacheAdmin and I don't know what patterns I should be looking for. I see a bunch of stats, similar to memcache.php. I am aware that I can use APC to cache it locally, but I would want distributed caching in the event that I expand by adding an additional server, which is very likely to happen soon. I am really trying to figure out the root of the problem and then deal with it accordingly. – Aco Jul 22 '11 at 00:59

1 Answer


This really sounds like an issue with the networking layer. When you run into this issue, can you grab the output of netstat -ano and see how many connections are in each state? If you see a ton of connections that aren't in ESTABLISHED but rather in TIME_WAIT, FIN_WAIT, etc., you probably need to enable time wait reuse and recycling. From:

http://www.speedguide.net/articles/linux-tweaking-121

TCP_TW_REUSE This allows reusing sockets in TIME_WAIT state for new connections when it is safe from protocol viewpoint. Default value is 0 (disabled). It is generally a safer alternative to tcp_tw_recycle

echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse (boolean, default: 0)

Note: The tcp_tw_reuse setting is particularly useful in environments where numerous short connections are open and left in TIME_WAIT state, such as web servers. Reusing the sockets can be very effective in reducing server load.
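
To make the two steps in this answer a bit more concrete, here is a rough shell sketch, not a definitive recipe. The function names count_tcp_states and persist_sysctl are made up for illustration, and /etc/sysctl.conf is an assumed default location that may differ on your distribution. The first function tallies connections per TCP state from netstat output. The second writes the setting to a config file, since a value echoed into /proc is lost on reboot.

```shell
# Count TCP connections per state; reads `netstat -ant`-style lines on stdin.
# In that output the sixth column of each tcp line is the connection state.
count_tcp_states() {
    awk '$1 ~ /^tcp/ { count[$6]++ } END { for (s in count) print count[s], s }' | sort -rn
}

# Append "key = value" to a sysctl config file unless the key is already set there.
# Hypothetical helper: the file argument defaults to /etc/sysctl.conf (an assumption).
persist_sysctl() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    if ! grep -q "^[[:space:]]*$key[[:space:]]*=" "$file" 2>/dev/null; then
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Example use, as root (11211 is the memcached port from the question):
#   netstat -ant | grep ':11211' | count_tcp_states
#   persist_sysctl net.ipv4.tcp_tw_reuse 1 && sysctl -p
```

If TIME_WAIT dominates the counts for port 11211, that points at connection churn between PHP and memcached rather than at memcached itself.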

polynomial