1. Producers usually send text-based data, for example JSON.
2. Text compresses well, so it is important to enable compression on the producer.
3. Compression is enabled at the producer level and doesn't require any configuration change in the brokers or in the consumers.
4. "compression.type" can be 'none' (default), 'gzip', 'lz4', 'snappy', or 'zstd' (Kafka 2.1 and later)
5. Compression is more effective the bigger the batch of messages being sent to Kafka!
6. Benchmarks here: https://blog.cloudflare.com/squeezing-the-firehose/
7. A compressed batch has the following advantages:
- Much smaller producer request size (compression ratio up to 4x!)
- Faster to transfer data over the network => less latency
- Better throughput
- Better disk utilisation in Kafka (stored messages on disk are smaller)
8. Disadvantages (very minor):
- Producers must commit some CPU cycles to compression
- Consumers must commit some CPU cycles to decompression
9. Overall:
- Consider testing snappy or lz4 for optimal speed / compression ratio
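
To make points 4 and 5 concrete, here is a minimal Python sketch. It is not Kafka's actual batch format: it uses gzip from the standard library as a stand-in to show how the compression ratio tends to grow as more similar JSON messages are compressed together. The message fields, the broker address, and the `linger.ms` / `batch.size` values are illustrative assumptions; those two settings are standard producer options for building larger batches, but they are not discussed in the notes above.

```python
import gzip
import json


def make_messages(count):
    """Build `count` small JSON messages (hypothetical event payloads)."""
    return [
        json.dumps(
            {"user_id": i, "event": "page_view", "url": "/products/%d" % (i % 50)}
        ).encode("utf-8")
        for i in range(count)
    ]


def compression_ratio(messages):
    """Return raw size / gzip size when the messages are compressed as one batch."""
    raw = b"\n".join(messages)
    return len(raw) / len(gzip.compress(raw))


# Producer settings as a plain dict (assumed values, shown for illustration only).
producer_config = {
    "bootstrap.servers": "localhost:9092",  # assumption: a local broker
    "compression.type": "snappy",           # one of the values listed in point 4
    "linger.ms": 20,                        # assumption: wait briefly to fill batches
    "batch.size": 32 * 1024,                # assumption: allow batches up to 32 KB
}

messages = make_messages(1000)
single_ratio = compression_ratio(messages[:1])
batch_ratio = compression_ratio(messages)

# Print the ratio for growing batch sizes to see the trend.
for size in (1, 10, 100, 1000):
    print(size, round(compression_ratio(messages[:size]), 2))
```

The exact numbers depend on the payload and the codec, so treat the output as a trend rather than a benchmark. A dict like `producer_config` could be passed to a Kafka client library that accepts these property names, after checking that library's documentation.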