1. Kafka gets much of its performance from the OS page cache, which lives in your RAM
2. Understanding RAM in Kafka means understanding two parts:
- The Java heap of the Kafka process
- The rest of the RAM used by the OS page cache
3. Let's understand how both of those should be sized
4. Overall, your Kafka production machines should have at least 8GB of RAM (the more the better; it's common to have 16GB or 32GB per broker)
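To make the two-part split concrete, here is a rough sketch of the arithmetic: whatever RAM is not taken by the Kafka heap (and a small allowance for the OS itself) is what the page cache can use. The helper name `page_cache_gb` and the 1GB OS allowance are assumptions for illustration, not Kafka settings; the 4GB heap is the starting value suggested in the Java Heap notes.

```shell
# Hypothetical helper: RAM (in GB) left for the OS page cache
# after subtracting the Kafka heap and an allowance for the OS.
page_cache_gb() {
  total_gb=$1
  heap_gb=$2
  os_gb=${3:-1}   # assumed allowance for the OS and other processes
  echo $(( total_gb - heap_gb - os_gb ))
}

page_cache_gb 16 4   # 16GB broker, 4GB heap: prints 11
page_cache_gb 32 4   # 32GB broker, 4GB heap: prints 27
```

The point of the numbers: adding RAM to a broker mostly grows the page cache, not the heap.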
* Java Heap
5. When you launch Kafka, you specify the Kafka heap options (KAFKA_HEAP_OPTS environment variable)
6. I recommend assigning a max heap size (-Xmx) of 4GB to the Kafka heap to get started:
7. export KAFKA_HEAP_OPTS="-Xmx4g"
8. Don't set -Xms (starting heap size):
- Let the heap grow over time
- Monitor the heap over time to see if you need to increase -Xmx
9. Kafka's heap usage should stay low over time, and it should grow only as the broker hosts more partitions
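As a rough illustration of "monitor the heap to see if you need to increase -Xmx", the sketch below turns a used/max pair (in MB) into a percentage and flags high usage. The helper names and the 80% threshold are assumptions made up for this example; in practice the used and max figures would come from whatever JVM monitoring you already have (for example JMX metrics or the JDK's jstat tool).

```shell
# Hypothetical helper: heap usage as a whole-number percentage of -Xmx.
heap_usage_pct() {
  used_mb=$1
  max_mb=$2
  echo $(( used_mb * 100 / max_mb ))
}

# Hypothetical check: succeeds when usage is at or above a threshold
# (third argument, defaulting to an assumed 80%).
heap_needs_more() {
  [ "$(heap_usage_pct "$1" "$2")" -ge "${3:-80}" ]
}

# Example: 3500MB used out of a 4096MB (-Xmx4g) heap
if heap_needs_more 3500 4096; then
  echo "heap is at $(heap_usage_pct 3500 4096)% of -Xmx: consider raising it"
fi
```

A single high reading is not a reason to resize; the note above is about the trend over time, typically as partitions are added.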
* OS Page Cache
10. The remaining RAM will be used automatically for the Linux OS Page Cache.
11. The page cache buffers data on its way to and from the disk, and this is what gives Kafka its performance
12. You don't have to specify anything!
13. Any unused memory will automatically be used by the Linux operating system for the page cache
14. Note: Make sure Kafka's memory does not get swapped out
- Set vm.swappiness=0 or vm.swappiness=1 (default is 60 on Linux); this tells the kernel to avoid swapping as much as possible
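A minimal sketch of checking and applying the swappiness setting on a Linux broker. The helper name `swappiness_ok` and the file name `99-kafka.conf` are assumptions for illustration; the `sysctl` commands in the comments are the standard way to change the value and need root.

```shell
# Hypothetical helper: succeeds when a swappiness value is 0 or 1.
swappiness_ok() {
  [ "$1" -le 1 ]
}

# Read the current value (Linux only) and report on it.
if [ -r /proc/sys/vm/swappiness ]; then
  current=$(cat /proc/sys/vm/swappiness)
  if swappiness_ok "$current"; then
    echo "vm.swappiness=$current: OK for Kafka"
  else
    echo "vm.swappiness=$current: consider lowering it to 1"
  fi
fi

# To change it (requires root):
#   sudo sysctl -w vm.swappiness=1                                  # lasts until the next reboot
#   echo 'vm.swappiness=1' | sudo tee /etc/sysctl.d/99-kafka.conf   # persistent (file name is an assumption)
#   sudo sysctl --system                                            # reload settings from the config files
```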