Redis benchmark vs reality: unexpected behavior

I have a standalone Redis instance in production. Earlier, 8 instances of my application, each with 64 Redis connections (8*64 = 512 in total) and running at 2000 QPS per instance, gave me a latency of < 10 ms, which I am fine with. Due to an increase in traffic, I had to increase the number of application instances to 16, and I also decreased the connection count per instance from 64 to 16 (16*16 = 256 in total). I made this change after benchmarking with memtier_benchmark, as shown below:

12        Threads
64        Connections per thread
2000      Requests per thread

ALL STATS
========================================================================
Type        Ops/sec     Hits/sec   Misses/sec      Latency       KB/sec
------------------------------------------------------------------------
Sets           0.00          ---          ---      0.00000         0.00
Gets       79424.54       516.26     78908.28      9.90400      2725.45
Waits          0.00          ---          ---      0.00000          ---
Totals     79424.54       516.26     78908.28      9.90400      2725.45

16        Threads
16        Connections per thread
2000      Requests per thread


ALL STATS
========================================================================
Type        Ops/sec     Hits/sec   Misses/sec      Latency       KB/sec
------------------------------------------------------------------------
Sets           0.00          ---          ---      0.00000         0.00
Gets       66631.87       433.11     66198.76      3.32800      2286.47
Waits          0.00          ---          ---      0.00000          ---
Totals     66631.87       433.11     66198.76      3.32800      2286.47
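
As a rough consistency check on the two runs above, Little's law can be applied to the printed figures. This is a minimal sketch, assuming memtier_benchmark ran without pipelining, so that each connection has at most one request in flight; under that assumption the mean latency should be close to connections / throughput. The function name is illustrative and not part of any tool.

```python
def littles_law_latency_ms(threads, conns_per_thread, ops_per_sec):
    """Mean latency implied by Little's law for a closed-loop client.

    Assumes one outstanding request per connection (no pipelining).
    """
    in_flight = threads * conns_per_thread
    return in_flight / ops_per_sec * 1000.0


# (label, threads, connections per thread, Ops/sec, reported latency in ms),
# all taken from the two benchmark outputs above.
runs = [
    ("12 x 64", 12, 64, 79424.54, 9.904),
    ("16 x 16", 16, 16, 66631.87, 3.328),
]

for label, threads, conns, ops, reported_ms in runs:
    predicted_ms = littles_law_latency_ms(threads, conns, ops)
    print(f"{label}: predicted {predicted_ms:.2f} ms, reported {reported_ms:.2f} ms")
```

Both predictions come out in the same range as the reported latencies (roughly 9.7 ms and 3.8 ms). That would suggest the benchmark latency is driven largely by how many requests the benchmark keeps in flight, so it may not carry over to a production load with a different arrival pattern.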

redis-benchmark gave similar results.

However, when I rolled this change (16*16) out to production, the latency shot up to 60-70 ms. I suspected that too few connections were provisioned (which seemed unlikely) and went back to 64 connections per instance (16*64), which, as expected, increased the latency further. For now, half of my application instances hit the Redis master and the other half are connected to the slave, each with 64 connections (8*64 to the master, 8*64 to the slave), and this works for me (8-10 ms latency).
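
One way to sanity-check the pool sizes is to estimate how many requests each instance must keep in flight to sustain its load. The sketch below assumes each instance still issues about 2000 QPS after scaling out and that a connection is held for roughly the 10 ms target latency per request; both are assumptions, and the helper name is hypothetical.

```python
def required_in_flight(qps, latency_ms):
    """Average number of concurrent requests needed to sustain `qps`
    when each request occupies a connection for `latency_ms`."""
    return qps * latency_ms / 1000.0


PER_INSTANCE_QPS = 2000   # assumption: unchanged after scaling to 16 instances
TARGET_LATENCY_MS = 10    # assumption: connection held for the target latency

needed = required_in_flight(PER_INSTANCE_QPS, TARGET_LATENCY_MS)
for pool_size in (16, 64):
    status = "may saturate" if needed > pool_size else "has headroom"
    print(f"pool of {pool_size}: needs ~{needed:.0f} in flight, {status}")
```

Under these assumptions a pool of 16 falls short of the roughly 20 concurrent requests needed, so callers would queue for a connection. This does not account for the higher latency seen with 16*64 connections, so it is at most a partial explanation.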

Here is the question: what could have gone wrong such that the latency increased with 256 (16*16) connections but dropped with 512 (8*64) connections, even though the benchmark suggests otherwise? I agree that the benchmark should not be fully trusted, but even as a guideline, these results are polar opposites.

Notes:
1. The application and Redis are colocated, so there is no network latency. Redis memory usage is about 40% and the fragmentation ratio is about 1.4. The application uses Jedis for connection pooling.
2. The latency does not include the overhead of a Redis miss; only the Redis round trip is considered.
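
When only the Redis round trip is measured, it can help to confirm that the timer does not also cover time spent blocked while borrowing a connection from the pool. The sketch below separates the two intervals. It uses a `queue.Queue` as a stand-in pool and a sleep as a stand-in command, so every name in it is hypothetical and it is not Jedis code; in Jedis the equivalent split would be around `JedisPool.getResource()`.

```python
import queue
import time


def timed_call(pool, command):
    """Run `command(conn)` on a pooled connection.

    Returns (result, pool_wait_seconds, round_trip_seconds) so that time
    blocked on the pool is reported separately from the command itself.
    """
    t0 = time.perf_counter()
    conn = pool.get()                 # blocks while the pool is exhausted
    t1 = time.perf_counter()
    try:
        result = command(conn)
        t2 = time.perf_counter()
    finally:
        pool.put(conn)                # always return the connection
    return result, t1 - t0, t2 - t1


def fake_get(conn):
    """Stand-in for a Redis GET: sleeps 5 ms and returns a fixed reply."""
    time.sleep(0.005)
    return "PONG"


# Stand-in pool holding a single fake connection.
pool = queue.Queue()
pool.put("conn-0")

result, wait_s, round_trip_s = timed_call(pool, fake_get)
print(f"pool wait {wait_s * 1000:.3f} ms, round trip {round_trip_s * 1000:.3f} ms")
```

If the pool wait dominates under production load, the latency increase would point at pool sizing or contention rather than at Redis itself.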

skott

Posted 2019-09-24T14:37:08.613

No answers