0

We recently reinstalled 3 of our mysql servers running in master slave with percona 5.7 and ubuntu 18.04 instead of oracle linux 7. These servers are currently connected to a mysql 5.6 master. And since the reinstallation we are getting way more slave lag than when we were running mysql 5.6 on the slaves (it was almost non existent).

The 5.7 servers have the same hardware just some more disks. And instead of one span of 12 disks the reinstalled servers are using 2 spans of 10disks. Both are running on an lvm.

All servers 5.6 and 5.7 are running the noop scheduler and we tested the server raid performance before installing using fio which was all fine (200k random iops). We tried to fine tune the 5.7 settings to mimic the 5.6 master as much as possible.

set global optimizer_switch='derived_merge=off,duplicateweedout=off,condition_fanout_filter=off';

We run a fairly big database 3TB on dual cpu xeon's with 512GB ram optimizing our schema is in progress but far from finished. It looks like the slaves are lagging on the sql_thread and not the io thread. Do you guys have any idea why the performance of 5.7 is worse then 5.6?

  • Have you compared the `my.cnf` files? Have you measured CPU, memory, disk while running each server? – Bert Sep 09 '20 at 14:00
  • my.cnf files are mostly identical. We needed to make some changes due to version compatibility. cpu / mem and disk usage are looking ok. – Jan Willem Mannaerts Sep 09 '20 at 14:24
  • "span"?? Raid-0 or 1 or 5 or 10? – Rick James Sep 10 '20 at 17:55
  • Using multiple threads for replication? Same database, or multiple databases? Please provide more details. – Rick James Sep 10 '20 at 17:56
  • Turn on the slowlog with long_query_time = 1 and the setting to include replicated queries. – Rick James Sep 10 '20 at 17:57
  • We are using 2 spans of 10 disks both disk spans are raid10. We have a master running mysql 5.6 with multiple databases replicating to a 5.7 slave with the same databases. The mysql slave is replicating to 2 other mysql 5.7 databases because we are planning to decommission the 5.6 server and also reinstall it. But the poor slave performance is holding us back at the moment. Alls 4 mysql servers are identically specced appart from the raid config. The 5.6 server is using 1 raid10 span of 12 disks. the 5.7 servers are using 2 raid10 spans of 10 disks. – Jan Willem Mannaerts Sep 11 '20 at 08:31
  • What we also see is that the io thread is keeping up with the master de sql threads run behind. The slaves seem to be hanging on system locks. I think this is the issue but no idea how to trouble shoot these locks. We have a few big tables without pk or uniques but this never caused a problem on mysql 5.6 – Jan Willem Mannaerts Sep 15 '20 at 08:56
  • https://pastebin.com/8Gqy76hJ the slave status. The master bin log is on par with slaves io thread bin log. – Jan Willem Mannaerts Sep 15 '20 at 09:02
  • @JanWillemMannaerts Do you think you will have time to provide the 5.7 Slaves additional information this week? Your slave is nearly 3 hours behind master and we know you want to be catching up. – Wilson Hauck Sep 19 '20 at 12:16
  • mysql slave config: https://pastebin.com/neJZuLvt – Jan Willem Mannaerts Sep 21 '20 at 07:07
  • after setting:set global sync_binlog = 0; set global innodb_flush_log_at_trx_commit = 2; on one of the slaves the performance increased by almost 400%. But what does this indicate? i know these settings are not as safe as sync_binlog = 1 and innodb_flush_log_at_trx_commit = 1; – Jan Willem Mannaerts Sep 22 '20 at 11:03
  • How much RAM is on your 5.7 server? Could you post results of TOP display page 1? You should review this url - https://dev.mysql.com/doc/refman/5.7/en/replication-options-binary-log.html - and consider going back to the 'greatest possible durability and consistency in a replication setup' . Please post to pastebin.com TEXT results of SHOW GLOBAL STATUS; and SHOW GLOBAL VARIABLES; for server workload tuning analysis. – Wilson Hauck Oct 08 '20 at 15:56
  • How much RAM is on your slave server? From the Slave server, please post to pastebin.com TEXT results of SHOW GLOBAL STATUS; after minimum 24 hours uptime and SHOW GLOBAL VARIABLES; for server workload tuning analysis – Wilson Hauck Oct 08 '20 at 16:20
  • Same as the master 512GB hardware is identical. We have no cpu spikes or abnormal memory usage. They are 24 core dual cpu setups the load is always under 6. Also running the same mysql settings, just moved away from binlog 1 and changed flush binlog settings to 2. The only downfall is that we would corrupt a slave in a power failure. Time to bring in the cavalry though hired percona to have a look in a few weeks. – Jan Willem Mannaerts Oct 09 '20 at 18:26
  • @JanWillemMannaerts In a few weeks let us know summary outcome of Percona support activities, please. – Wilson Hauck Nov 12 '20 at 15:09

0 Answers0