The title is a bit confusing, but I can't think of a better one.
What I have is a simple vanilla MySQL replication, with the slave occasionally failing, with this error: Error 'Can't find record in 'my_tbl'' on query. Default database: 'my_db'. Query: 'UPDATE my_tbl SET ... WHERE ...'
(columns omitted for clarity).
What I'm assuming this error means, is that the slave sql thread executed this update, and received 0 rows affected
. This was not what it expected when comparing the result of 1 rows affected
from the relay log, thus generating an error.
When running this same update transaction manually, it works. Same thing when running START SLAVE
- it just starts working, and goes back to normal.
This doesn't make sense to me at all - if all it takes is a "retry" to fix this, how could this happen in the first place? Everything is executed in a serialized fashion, and nothing else is writing to the slave mysql server.
Can someone provide an explanation?
Some technicalities - this is a mixed replication setup from 5.5.7-rc to 5.5.12.