
I have encountered an issue with a managed Google Cloud SQL instance: it unexpectedly entered maintenance mode (outside the configured maintenance window) and has remained unavailable for more than 10 hours.

Some details follow:

  • Second Generation MySQL instance, with a database of around 180 GB;

  • CPU usage is low, and the connection count is low too;

  • Connections seem to be accepted but are then dropped immediately with: ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 2

  • It is not possible to stop, restart, or edit the instance (as it is in maintenance mode). Most of the controls on the Google Cloud Console page for this SQL instance are disabled;

  • It seems the nightly backup was skipped on the night the instance went into maintenance mode;

  • The MySQL logs don't show anything suspicious, only routine entries such as page_cleaner runs (but I am not an expert at reading MySQL logs).
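
The "accepted, then dropped" symptom can be checked independently of the mysql client. In the MySQL protocol the server speaks first after the TCP connect, sending a handshake packet that starts with a 4-byte header; ERROR 2013 at 'reading initial communication packet' means the client connected but never received it. The following is a minimal sketch under that assumption, not a diagnostic tool from Google or MySQL; the function name `probe_mysql_handshake` and its return labels are hypothetical, chosen for illustration.

```python
import socket


def probe_mysql_handshake(host, port=3306, timeout=5.0):
    """Classify how an endpoint behaves before any authentication happens.

    Hypothetical helper: the labels returned here are illustrative only.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"          # nothing is listening on the port
    except OSError:
        return "unreachable"      # timeout, no route, DNS failure, ...

    with sock:
        try:
            # A MySQL server sends its greeting first; 4 bytes is the
            # packet header (3-byte payload length + 1-byte sequence id).
            data = sock.recv(4)
        except socket.timeout:
            return "no_greeting_timeout"
        except OSError:
            return "closed_before_greeting"   # e.g. connection reset

    if not data:
        # TCP connect succeeded, but the peer closed without sending
        # anything: consistent with the ERROR 2013 symptom above.
        return "closed_before_greeting"
    return "greeting_received"
```

Calling `probe_mysql_handshake("<instance-ip>")` and getting `"closed_before_greeting"` would suggest that something in front of the database accepts the TCP connection while mysqld itself is not answering, which is useful detail to include in a support ticket.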

Another suspicious thing is that the size of the database did not "get a cut" that night, as it seems to do regularly every night, probably because the cleanup of binary logs was skipped?

This Stack Overflow post could be related to my issue, but I am not sure. What I don't like is that the solution there is to drop the database and create it anew. That is not an option for us, as we have production data and a backup has already been skipped because of this issue, leaving us a day behind. - https://stackoverflow.com/questions/49424706/google-cloud-sql-instance-always-in-maintenance-status-binary-logs-issue

razzmatazz

The issue has been resolved by Google tech support. Basically, there was an unscheduled update (outside of the maintenance window) that for some reason crashed/disabled our database instance. It took a really long time and a lot of stress (as this was a production database) to resolve. – razzmatazz Dec 17 '18 at 17:11

1 Answer


The issue has been resolved by Google tech support. Basically, there was an unscheduled update (outside of the maintenance window) that for some reason crashed/disabled our database instance. It took a really long time and a lot of stress (as this was a production database) to resolve.

Posting the OP's comment as an answer to prevent the Community user from bumping the question, as the issue is resolved.

yagmoth555