We're running node.js servers on App engine flexible that are connecting to mysql using a shared private network. (app engine instances are deployed into the same vpc subnetwork that mysql lives in).
So on app engine side in app.yaml:
network:
name: projects/<projectId>/global/networks/<subnetwork>
subnetwork_name: <subnetwork>
The mysql instance is a MYSQL_8_0, db-n1-standard-1 instance with 10GB storage attached, availability zonal, auto backup enabled. The mysql database exposes a private IP and only allows SSL connections. Generally the instance is almost unused with max 6% usage and plenty of free storage left (even though Auto storage increase is enabled).
Our node servers use Sequelize to connect to mysql whith all the right SSL configs setup:
sequelizeClient = new Sequelize(MYSQL_CONF.database, MYSQL_CONF.user, dbPass, {
host: MYSQL_CONF.host,
dialect: "mysql",
dialectOptions: {
ssl: {
key: cKey,
cert: cCert,
ca: cCA,
},
},
});
MYSQL_CONF.host being the private ip used to connect for ex: 10.75.0.3.
Generally this setup works well but intermittently (happened twice) we started getting mysql connection timeouts with following message in logs:
On mysql side:
"2022-08-01T08:33:48.239519Z 1899 [Note] [MY-010914] [Server] Got timeout reading communication packets"
On node side:
"SequelizeConnectionError: connect ETIMEDOUT
at ConnectionManager.connect (/app/node_modules/sequelize/lib/dialects/mysql/connection-manager.js:102:17)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async ConnectionManager._connect (/app/node_modules/sequelize/lib/dialects/abstract/connection-manager.js:220:24)
but also:
"SequelizeConnectionError: connect EPIPE 10.75.0.3:3306
at ConnectionManager.connect (/app/node_modules/sequelize/lib/dialects/mysql/connection-manager.js:102:17)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async ConnectionManager._connect (/app/node_modules/sequelize/lib/dialects/abstract/connection-manager.js:220:24)"
and:
"SequelizeConnectionAcquireTimeoutError: Operation timeout
at ConnectionManager.getConnection (/app/node_modules/sequelize/lib/dialects/abstract/connection-manager.js:204:15)
at runMicrotasks (<anonymous>)
at runNextTicks (node:internal/process/task_queues:61:5)
at processTimers (node:internal/timers:497:9)
at async /app/node_modules/sequelize/lib/sequelize.js:301:26
at async MySQLQueryInterface.select (/app/node_modules/sequelize/lib/dialects/abstract/query-interface.js:407:12)
at async Function.findAll (/app/node_modules/sequelize/lib/model.js:1134:21)
at async Function.findOne (/app/node_modules/sequelize/lib/model.js:1228:12)
at async readUser (/app/dist/web/services/user.js:35:18)
at async getCurrentUser (/app/dist/web/controllers/user.js:49:20)"
also:
"SequelizeConnectionError: 139662677354432:error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../deps/openssl/openssl/ssl/record/ssl3_record.c:332:
at ConnectionManager.connect (/app/node_modules/sequelize/lib/dialects/mysql/connection-manager.js:102:17)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async ConnectionManager._connect (/app/node_modules/sequelize/lib/dialects/abstract/connection-manager.js:220:24)"
and:
"Error: Client network socket disconnected before secure TLS connection was established
at connResetException (node:internal/errors:691:14)
at TLSSocket.onConnectEnd (node:_tls_wrap:1585:19)
at TLSSocket.emit (node:events:402:35)
at endReadableNT (node:internal/streams/readable:1343:12)
at processTicksAndRejections (node:internal/process/task_queues:83:21)"
and:
"SequelizeConnectionRefusedError: connect ECONNREFUSED 10.75.0.3:3306
at ConnectionManager.connect (/app/node_modules/sequelize/lib/dialects/mysql/connection-manager.js:92:17)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async ConnectionManager._connect (/app/node_modules/sequelize/lib/dialects/abstract/connection-manager.js:220:24)"
This has happened once in production and lasted about 8 hours and another time in our dev environment and lasted about the same amount of time. Both envs have same infra but are on different gcp projects.
Just to make sure, the connection between our services and database is generally stable and everything works well, it's just for several hours at a time the performance degrades.
No transactions are being left uncommitted, config max connections is not even closely achieved - max open connections were 21 at the time of this issue occurring and the max amount is 4030. No MySQL configs/database flags were changed, everything is in the default state left by Cloud SQL. Generally the SQL commands are very basic and the max communication payload is never reached.
Is this just GCP degraded performance or is it a user error in using their services? Did anyone else encounter these issues?
Variables:
SHOW GLOBAL VARIABLES LIKE '%time%';
Variable_name |Value |
-------------------------------+---------+
binlog_max_flush_queue_time |0 |
connect_timeout |10 |
default_password_lifetime |0 |
delayed_insert_timeout |300 |
explicit_defaults_for_timestamp|ON |
flush_time |0 |
have_statement_timeout |YES |
innodb_flush_log_at_timeout |1 |
innodb_lock_wait_timeout |50 |
innodb_old_blocks_time |1000 |
innodb_rollback_on_timeout |OFF |
interactive_timeout |28800 |
lc_time_names |en_US |
lock_wait_timeout |31536000 |
log_timestamps |UTC |
long_query_time |10.000000|
max_execution_time |0 |
net_read_timeout |30 |
net_write_timeout |60 |
regexp_time_limit |32 |
replica_net_timeout |30 |
rpl_stop_replica_timeout |31536000 |
rpl_stop_slave_timeout |31536000 |
slave_net_timeout |30 |
slow_launch_time |2 |
system_time_zone |UTC |
time_zone |SYSTEM |
wait_timeout |28800 |
SHOW GLOBAL STATUS LIKE 'aborted%';
Variable_name |Value|
----------------+-----+
Aborted_clients |4 |
Aborted_connects|49 |
SHOW GLOBAL STATUS LIKE '%flush%';
Variable_name |Value |
--------------------------------+------+
Com_flush |196 |
Flush_commands |3 |
Innodb_buffer_pool_pages_flushed|313855|
Key_blocks_not_flushed |0 |
Not_flushed_delayed_rows |0 |
Uptime_since_flush_status |176244|
SHOW GLOBAL STATUS LIKE 'uptime%';
Variable_name |Value |
-------------------------+------+
Uptime |176264|
Uptime_since_flush_status|176264|
SHOW GLOBAL VARIABLES LIKE '%flush%';
Variable_name |Value |
----------------------------------+--------+
binlog_max_flush_queue_time |0 |
flush |OFF |
flush_time |0 |
innodb_adaptive_flushing |ON |
innodb_adaptive_flushing_lwm |10 |
innodb_flush_log_at_timeout |1 |
innodb_flush_log_at_trx_commit |1 |
innodb_flush_method |O_DIRECT|
innodb_flush_neighbors |2 |
innodb_flush_sync |ON |
innodb_flushing_avg_loops |5 |
innodb_idle_flush_pct |100 |
innodb_log_wait_for_flush_spin_hwm|400 |
SHOW GLOBAL STATUS LIKE '%rollback%';
Variable_name |Value|
--------------------------+-----+
Com_rollback |2 |
Com_rollback_to_savepoint |0 |
Com_xa_rollback |0 |
Handler_rollback |4 |
Handler_savepoint_rollback|0 |
SHOW GLOBAL STATUS LIKE '%savepoint%';
Variable_name |Value|
--------------------------+-----+
Com_release_savepoint |0 |
Com_rollback_to_savepoint |0 |
Com_savepoint |0 |
Handler_savepoint |0 |
Handler_savepoint_rollback|0 |