We would like to propose the following, based on the problems we have had with our Kafka disks.
We have many HDP clusters (managed by Ambari; all machines run Red Hat 7.2).
Each cluster includes 3 Kafka machines, and each Kafka machine has a disk of ~15 TB.
We have had many incidents where the Kafka disk filled up to 100% of its capacity (for some reason Kafka retention does not work as it should).
So we are thinking about a cron job that would run on the Kafka machines every minute.
When Kafka disk usage reaches a threshold, for example ~90%,
the cron job would stop all Kafka brokers (the Kafka service).
This way we would prevent the Kafka disk from reaching 100% (as everyone knows, when the disk is 100% full the purging process no longer works).
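To make the idea concrete, here is a minimal sketch of the check the cron job would run. It is only an illustration under assumptions: the data directory `/kafka-logs`, the script name `kafka_disk_guard.py`, and the stop command are placeholders we chose for the example, not values taken from our clusters. On an Ambari-managed cluster the stop step would more likely go through Ambari, so `STOP_COMMAND` has to be replaced with whatever actually stops the broker on the machine.

```python
#!/usr/bin/env python
"""Sketch of a disk-usage guard for a Kafka broker (illustrative only)."""
import os
import subprocess
import sys

# Hypothetical placeholder: replace with the real command that stops the broker.
STOP_COMMAND = ["systemctl", "stop", "kafka"]


def used_percent(path):
    """Return the used capacity of the filesystem holding `path`, in percent.

    Computed like `df`: used / (used + available to non-root users).
    """
    st = os.statvfs(path)
    used = st.f_blocks - st.f_bfree
    usable = used + st.f_bavail
    return 100.0 * used / usable if usable else 0.0


def should_stop(percent, threshold):
    """True when usage has reached or passed the threshold."""
    return percent >= threshold


def check_and_stop(path, threshold, stop_command=STOP_COMMAND):
    """Stop the broker if the disk holding `path` is at or above `threshold` percent.

    Returns True if the stop command was run, False otherwise.
    """
    percent = used_percent(path)
    if should_stop(percent, threshold):
        subprocess.call(stop_command)
        return True
    return False


if __name__ == "__main__":
    # Expected call: kafka_disk_guard.py <kafka data dir> <threshold percent>
    if len(sys.argv) == 3:
        check_and_stop(sys.argv[1], float(sys.argv[2]))
```

With a crontab entry such as `* * * * * /usr/bin/python /path/to/kafka_disk_guard.py /kafka-logs 90` (path and directory again hypothetical), the check would run every minute and stop the broker once usage reaches 90%.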
Please share your opinion