We rely on manual RDS PostgreSQL snapshots for our backup strategy, and we're concerned about possible downtime of our RDS instance (Single-AZ) during snapshot creation. According to AWS:
Creating this DB snapshot on a Single-AZ DB instance results in a brief I/O suspension that can last from a few seconds to a few minutes, depending on the size and class of your DB instance.
From this it is not clear how we can tell whether the DB instance's I/O is functioning normally while the snapshot is being taken. If the DB is unavailable for a short period, we'd like to stop the corresponding web server or take it out of the load balancer, so that customers don't see interrupted connections.
What we're wondering is:
Does the DB really have downtime during a snapshot? AWS only mentions "I/O suspension" and "latencies". I read somewhere that the downtime is short (from a few seconds to a minute) and occurs only during snapshot initialization. Is there a way to know when that period has passed and the DB instance is ready to serve requests again, even while the snapshot is still being created?
What is the general best practice for dealing with these I/O suspensions? Since it seems to happen with automated backups as well, does that mean the site could have downtime every day while the DB snapshot is being created?
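To make the first question concrete, here is a rough sketch of the kind of check we have in mind: poll the database with a small probe and treat it as ready once several consecutive probes finish quickly. This is only an illustration. The function names, thresholds, and the `probe` callable are our own assumptions, not anything AWS provides. In practice `probe` would be a tiny write against the instance with its own timeout (for example a short `statement_timeout`), because a suspended write may hang rather than fail.

```python
import time
from typing import Callable, Optional


def probe_latency(probe: Callable[[], None],
                  clock: Callable[[], float] = time.monotonic) -> Optional[float]:
    """Run one probe; return elapsed seconds, or None if the probe raised."""
    start = clock()
    try:
        probe()
    except Exception:
        return None
    return clock() - start


def wait_until_responsive(probe: Callable[[], None],
                          max_latency_s: float = 0.5,   # assumed threshold
                          needed_ok: int = 3,           # assumed streak length
                          interval_s: float = 1.0,
                          timeout_s: float = 300.0,
                          clock: Callable[[], float] = time.monotonic,
                          sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True once `needed_ok` consecutive probes are fast enough.

    Returns False if that does not happen within `timeout_s` seconds.
    """
    deadline = clock() + timeout_s
    ok_streak = 0
    while clock() < deadline:
        latency = probe_latency(probe, clock)
        if latency is not None and latency <= max_latency_s:
            ok_streak += 1
            if ok_streak >= needed_ok:
                return True
        else:
            # A failed or slow probe resets the streak.
            ok_streak = 0
        sleep(interval_s)
    return False
```

The idea would be to start the snapshot, take the web server out of the load balancer, and put it back once `wait_until_responsive(...)` returns `True`. Whether this is a sensible approach at all is part of what we are asking.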