I have a 3-node galera cluster with a few hundred databases servicing various clients.
I'm only using galera as an easy way to handle replication and failover. I'm not actually using multiple masters at once.
I need to run a rather expensive ALTER on some tables in each database. Typically, I would have a script that runs the upgrade on each database, one by one, and I would take each site offline only while its own database is being upgraded.
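To make the usual workflow concrete, here is a rough sketch of the per-database loop I mean. It only builds the statements; the database names, table name, and ALTER clause are placeholders, not my real schema, and the step that actually takes a site offline and runs the statement is left out.

```python
# Sketch only: builds one ALTER per database, in order.
# All names passed in are hypothetical placeholders.

def quote_ident(name):
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_upgrade_plan(databases, table, alter_clause):
    """Return a list of (database, statement) pairs, one ALTER per database."""
    plan = []
    for db in databases:
        stmt = "ALTER TABLE {}.{} {};".format(
            quote_ident(db), quote_ident(table), alter_clause
        )
        plan.append((db, stmt))
    return plan
```

Each pair would then be handled in turn: take that one site offline, run its statement, bring it back, and move on to the next database.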
But with this galera cluster, any slow DDL has resulted in a complete lock on all databases, not just the one it's run against. Basically it means whenever I need to run an upgrade, everyone goes offline for the entire time it takes me to run the upgrade for everyone.
I know there is wsrep_OSU_method, but changing this to RSU has problems of its own and I don't think it helps.
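For clarity, this is roughly what I understand switching to RSU to involve: wrapping each DDL statement in a session-level change of wsrep_OSU_method. The helper below is a hypothetical sketch that only produces the statements. It assumes the variable can be set per session on my version, and since RSU does not replicate the DDL, the same statements would have to be run on every node separately.

```python
# Sketch only: wrap_in_rsu is a hypothetical helper, not part of any library.
# Assumes wsrep_OSU_method can be changed at session scope.

def wrap_in_rsu(ddl):
    """Return the statements to run one DDL under RSU, then restore TOI."""
    return [
        "SET SESSION wsrep_OSU_method = 'RSU';",
        ddl,
        "SET SESSION wsrep_OSU_method = 'TOI';",
    ]
```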
Is there just a way to disable the lock? Or at least make it a database-level lock rather than a server-level lock?
What if I were to disable the other nodes in the cluster while running the query? Would this still result in a global lock? Like I said, I don't actually use multiple masters, so in this case having the other nodes offline for a few minutes is okay (assuming the usual automated rejoin process happens when I bring them back).