We keep seeing the following error in the Apache error log:

[error] (103)Software caused connection abort: cache: error returned while trying to return disk cached data

This error occurs at irregular intervals, but on average about 1-2 times every 10 minutes. Over the past 2 days the site has gone down several times, possibly because of this error.

The only other error in the log is "client denied by server configuration", which has occurred about 10 times over the past 2 days.

We're using Apache/2.2.14 (Ubuntu). Top returns:

top - 15:47:19 up  4:28,  2 users,  load average: 0.36, 0.78, 1.32
Tasks:  95 total,   1 running,  94 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.2%sy,  0.0%ni, 99.2%id,  0.7%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3091660k total,   698416k used,  2393244k free,    58732k buffers
Swap:   492536k total,    31112k used,   461424k free,    52068k cached

Any ideas what might be causing this issue and how we might resolve it?

drAlberT
Rick Westera
    What does your apache configuration look like? – Jenny D Mar 26 '13 at 07:46
    Hi - the config files are here: https://www.box.com/s/mmk462x5fu7k6qfhpgon . default and mysite are linked to in sites-enabled. – Rick Westera Mar 27 '13 at 21:12
  • Could it be related to this bug (https://issues.apache.org/bugzilla/show_bug.cgi?id=50024)? Maybe you could try updating Apache itself (the latest version is now 2.2.27). It also contains several security fixes, so updating would be worthwhile even if it does not solve the problem. – Andrey Sapegin Jul 25 '14 at 10:36
    This probably won't help, but here is my advice: this config is not production ready. First, disable DNS lookups: "HostnameLookups On" is a performance killer. Trim the enabled modules as much as possible, raise the expiration of css/txt/js (W3 Total Cache has a good set of rules) and use cache-busting techniques to force new content. Revisit the mod_cache documentation (thundering herd), keep cache files away from any webroot (the disk cache holds more than rendered pages), and adjust cache expiration as needed. – zeridon Jul 28 '14 at 07:41

1 Answer

I've seen the error message:

[error] (103)Software caused connection abort: cache: error returned while trying to return disk cached data

before in two situations, both of which involved filesystem problems (as the message itself more or less implies). In the first, the partition holding the cache was full: the cache grows very quickly, and it ended up filling the partition. In the second, the filesystem itself was damaged.

The error message suggests a read problem, but the underlying cause can be a write problem too.

Root cause: your filesystem

Proposed solution: Check the integrity of the filesystem, and if that checks out, move the cache to a bigger partition or bigger disk, and you should be okay.
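
As a rough way to check whether the cache partition is running out of room, here is a minimal Python sketch. The path in CACHE_ROOT, the function name cache_partition_report and the 10% threshold are assumptions made for illustration; substitute the CacheRoot value from your own Apache configuration. It reports free space and free inodes only. It does not check filesystem integrity, which needs a filesystem tool such as fsck.

```python
import os
import shutil

# Assumed location; replace with the CacheRoot directive from your Apache config.
CACHE_ROOT = "/var/cache/apache2/mod_disk_cache"


def cache_partition_report(path, min_free_fraction=0.10):
    """Report free space and free inodes for the partition that holds `path`.

    `min_free_fraction` is an illustrative threshold, not an Apache setting.
    """
    usage = shutil.disk_usage(path)   # total, used, free (bytes)
    st = os.statvfs(path)             # inode counts (Unix only)

    free_fraction = usage.free / usage.total if usage.total else 0.0
    # Some filesystems report 0 total inodes; treat that as "unknown".
    inode_fraction = st.f_favail / st.f_files if st.f_files else None

    return {
        "free_bytes": usage.free,
        "free_fraction": free_fraction,
        "free_inode_fraction": inode_fraction,
        "low_space": free_fraction < min_free_fraction,
        "low_inodes": inode_fraction is not None
        and inode_fraction < min_free_fraction,
    }


if __name__ == "__main__":
    if os.path.isdir(CACHE_ROOT):
        for key, value in cache_partition_report(CACHE_ROOT).items():
            print(f"{key}: {value}")
    else:
        print(f"{CACHE_ROOT} does not exist; set CACHE_ROOT to your CacheRoot")
```

A disk cache made of many small files can exhaust inodes before it exhausts bytes, which is why the sketch looks at both figures.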

Joe Sniderman
  • I haven't been involved with this issue since about Apr 2013 so I have no idea whether this solves the problem. Given AlberT has opened the bounty, I'll have to let him decide. – Rick Westera Jul 30 '14 at 03:47