I deploy the Bitnami Kafka Helm chart on AWS, provisioned with Terraform, and I find the documentation on storage and persistence allocation very confusing. From what I understood from the documentation, logs are chunks of messages in a topic; when configurable quotas of bytes, messages, or time are exceeded, the logs get flushed to files in storage.
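If my reading is right, the broker settings below are the relevant ones (names from the Apache Kafka broker configuration reference; the values shown are the documented defaults, not something I have tuned):

```properties
# Segment rolling: a new on-disk segment file is started when either limit is hit
log.segment.bytes=1073741824   # 1 GiB per segment (default)
log.roll.hours=168             # or after 7 days (default)

# Flushing: Kafka normally leaves fsync to the OS page cache; these settings
# would force periodic flushes, but are unset by default
#log.flush.interval.messages=10000
#log.flush.interval.ms=1000
```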
The Helm chart's StatefulSet has these volume mounts:

```yaml
volumeMounts:
  - name: data
    mountPath: {{ .Values.persistence.mountPath }}
  - name: logs
    mountPath: {{ .Values.logPersistence.mountPath }}
```
I have enabled logPersistence in the Helm chart to retain logs in an attached volume provisioned by the chart, unless I provide an alternative (which I don't).
My questions:

1. What happens if the logPersistence volume is exhausted?
2. Can I configure a fail-safe, i.e. configure Kafka to retain everything but delete old log files once a quota is exceeded?
3. How do I retrieve logs that have been persisted?
4. Can a consumer ask for topic messages from an early offset and cause latency or failure? If so, what are some strategies to recover?
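For context, the fail-safe I have in mind would look roughly like this in my values.yaml (a sketch only; `extraConfig` exists in recent versions of the Bitnami kafka chart, but check your chart version's values schema, and the sizes here are placeholders):

```yaml
persistence:
  enabled: true
  size: 8Gi            # placeholder; size against expected retention
extraConfig: |
  # delete old segments rather than compacting them
  log.cleanup.policy=delete
  # per-partition size cap, NOT a total-disk cap; size the PVC accordingly
  log.retention.bytes=1073741824
  log.retention.hours=168
```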
Edit
I have learned by inspecting the directories in the Kafka pods that:

- By default, the Bitnami Helm chart persists messages to the `persistence` volume, mounted at `/bitnami/kafka/data/`.
- By default, Kafka server logs are sent to stdout; they can be configured to be stored in the `logPersistence` volume, mounted at `/opt/bitnami/kafka/logs`.

Naturally, these directories fill up over time and can cause a crash.
I do not yet know whether there is a configurable Kafka setting that avoids the crash, i.e. clears old messages from storage as it nears exhaustion.
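The closest candidates I have found so far are the per-partition retention settings below (untested on this chart). Note that `log.retention.bytes` applies per partition, and a partition can exceed it by up to one active segment, so it only bounds total disk use indirectly:

```properties
log.cleanup.policy=delete
log.retention.bytes=1073741824          # per partition, not per broker
log.retention.check.interval.ms=300000  # how often the cleaner runs (default 5 min)
# worst case per partition ~= log.retention.bytes + log.segment.bytes,
# so keep (partitions hosted * that sum) below the data volume size
```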