Occasionally the RPM database that yum relies on gets corrupted, and we see errors like this:
error: db3 error(-30974) from dbenv->failchk: DB_RUNRECOVERY: Fatal error, run database recovery
error: cannot open Packages index using db3 - (-30974)
error: cannot open Packages database in /var/lib/rpm
The workaround is to run rm -f /var/lib/rpm/__db*
and then the next "yum" command regenerates the data.
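For anyone applying the workaround across many machines, here is a minimal sketch of it wrapped in a shell function. The name clear_rpm_db_env is hypothetical (it is not part of rpm or yum), and the check for a Packages file is my own added safety assumption so that the function refuses to touch a directory that does not look like an RPM database.

```shell
#!/bin/sh
# Hypothetical helper, not part of rpm or yum.
# Removes only the Berkeley DB environment files (__db.001, __db.002, ...)
# from an RPM database directory; the package data itself is left alone.
clear_rpm_db_env() {
    dbdir="${1:-/var/lib/rpm}"

    # Assumed safety check: an RPM database directory contains a Packages file.
    if [ ! -f "$dbdir/Packages" ]; then
        echo "no Packages file in $dbdir, refusing to continue" >&2
        return 1
    fi

    # Same command as the workaround, parameterized on the directory.
    rm -f "$dbdir"/__db*
}
```

Calling clear_rpm_db_env with no argument (as root) is equivalent to the rm command above; passing a directory makes it possible to try it on a copy first.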
My question is: what is likely to be causing this? Is there some common task that ignores locks, or has some other problem, that causes this?
We have hundreds of CentOS machines and there is no pattern to which ones see this problem. It could be a "one in a million" issue that simply shows up often at large scale.
NOTE: I realize this is a very open-ended question, but if an answer identifies the cause, I will come back and rewrite the question into something more canonical that relates directly to the specific issue.