
I've read through the RabbitMQ Production Checklist, and we've made some changes to keep the RabbitMQ pods in our Kubernetes cluster from crashing: we set the memory limit in the deployment to 1.3G and the RabbitMQ vm_memory_high_watermark.absolute to 1024MB. My problem is understanding exactly what happens when we reach the alarm point, and beyond. The docs state:

Before the broker hits the high watermark and blocks publishers, it will attempt to free up memory by instructing queues to page their contents out to disc. Both persistent and transient messages will be paged out

But at the very beginning of the memory document it also states:

It is strongly recommended that OS swap or page files are enabled.

We run our Kubernetes cluster on Google (GKE). Not only do we have no control over whether the machines get swap configured (they don't), my understanding is that Kubernetes is not set up to use swap anyway. My concern is whether the lack of swap will negatively affect how RabbitMQ pods page messages out to disk. Does anyone have any insight into this?
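For reference, here is a rough sketch of the headroom arithmetic behind the two settings mentioned above. It assumes decimal units (Kubernetes `1.3G` as 1.3 × 10^9 bytes and RabbitMQ `1024MB` as 1024 × 10^6 bytes); the function names are made up for illustration and are not part of any RabbitMQ or Kubernetes API.

```python
# Assumption: both values use decimal units.
#   Kubernetes "1.3G"  -> 1.3 * 10**9 bytes
#   RabbitMQ  "1024MB" -> 1024 * 10**6 bytes
POD_MEMORY_LIMIT_BYTES = 1_300_000_000
WATERMARK_BYTES = 1024 * 10**6


def headroom_bytes(limit: int, watermark: int) -> int:
    """Memory left between the RabbitMQ watermark and the container limit."""
    return limit - watermark


def watermark_fraction(limit: int, watermark: int) -> float:
    """Watermark expressed as a fraction of the container limit."""
    return watermark / limit


if __name__ == "__main__":
    headroom = headroom_bytes(POD_MEMORY_LIMIT_BYTES, WATERMARK_BYTES)
    fraction = watermark_fraction(POD_MEMORY_LIMIT_BYTES, WATERMARK_BYTES)
    print(f"headroom: {headroom / 10**6:.1f} MB, "
          f"watermark is {fraction:.1%} of the limit")
    # prints: headroom: 276.0 MB, watermark is 78.8% of the limit
```

So, under those unit assumptions, the alarm fires with roughly 276 MB to spare before the container limit is reached.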

GalloCedrone
  • 3
    Most cloud instances don't have swap, though you can create your own. Can you log in to the pods and create a disk? https://cloud.google.com/compute/docs/disks/mount-ram-disks – Sum1sAdmin Apr 17 '18 at 15:31
  • My concern with this is that if the pod dies for any reason, it can rebuild the queue (as it's set up in HA all mode), but the disk setup, if that process even works in the pod, would need to be done again. The point of letting Kubernetes restart the pod on failure is that I don't need to be involved at all. – Alex Liffick Apr 18 '18 at 13:19
  • 2
    The lore from the RabbitMQ mailing list is to not set the `vm_memory_high_watermark` to more than 50% as garbage collection can double the usage. And swap is considered evil as it can induce notable delays. – Martin Schröder Apr 22 '18 at 11:01
  • Do you happen to have a link to a copy of that mailing list thread, so that I could read more and understand why? Thanks! – Alex Liffick Apr 23 '18 at 14:07
  • Did you solve this issue? It's quite an old question; did you try using a newer version of RabbitMQ and GKE? – PjoterS Feb 15 '21 at 07:41

0 Answers