kafka-topics
Saturday, June 22, 2024
Kafka Command Line Interface 101
Start Kafka
- Download Kafka from https://kafka.apache.org/ and extract it (here: C:\kafka_2.12-3.7.0)
- Create the Zookeeper data directory:
  mkdir C:\kafka_2.12-3.7.0\data\zookeeper
- Add the bin\windows directory to the PATH
- Edit C:\kafka_2.12-3.7.0\config\zookeeper.properties:
  dataDir=C:/kafka_2.12-3.7.0/data/zookeeper
- Start Zookeeper:
  C:\kafka_2.12-3.7.0>bin\windows\zookeeper-server-start.bat config\zookeeper.properties
  [2024-06-23 08:53:35,580] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)
- Edit C:\kafka_2.12-3.7.0\config\server.properties:
  log.dirs=C:/kafka_2.12-3.7.0/data/kafka
- Start the Kafka broker:
  C:\kafka_2.12-3.7.0>bin\windows\kafka-server-start.bat config\server.properties
  [2024-06-23 09:00:46,925] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)
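With Zookeeper and the broker running, the basic kafka-topics commands look like the sketch below. This assumes the broker is listening on the default localhost:9092; the topic name first_topic is just an example.

```shell
:: Create a topic with 3 partitions (replication factor 1, since this is a single-broker setup)
kafka-topics.bat --bootstrap-server localhost:9092 --topic first_topic --create --partitions 3 --replication-factor 1

:: List all topics in the cluster
kafka-topics.bat --bootstrap-server localhost:9092 --list

:: Describe a topic: partitions, leader, replicas, ISR
kafka-topics.bat --bootstrap-server localhost:9092 --topic first_topic --describe
```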
2024년 6월 21일 금요일
Kafka Guarantees
- Messages are appended to a topic-partition in the order they are sent
- Consumers read messages in the order stored in a topic-partition
- With a replication factor of N, producers and consumers can tolerate up to N-1 brokers being down
- This is why a replication factor of 3 is a good idea:
- Allows for one broker to be taken down for maintenance
- Allows for another broker to be taken down unexpectedly
- As long as the number of partitions remains constant for a topic (no new partitions), the same key will always go to the same partition
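The key-to-partition guarantee can be tried out with the console producer, which can parse keys from the input. This is a sketch assuming localhost:9092 and an example topic first_topic:

```shell
:: parse.key/key.separator make the console producer treat "key:value" input as keyed messages
kafka-console-producer.bat --bootstrap-server localhost:9092 --topic first_topic --property parse.key=true --property key.separator=:

:: Then type, for example:
::   user_1:hello
::   user_1:again
:: Both messages carry key user_1, so they land in the same partition (while the partition count is unchanged)
```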
Zookeeper
- Zookeeper manages brokers (keeps a list of them)
- Zookeeper helps in performing leader election for partitions
- Zookeeper sends notifications to Kafka in case of changes (e.g. new topic, broker dies, broker comes up, delete topics, etc...)
- Kafka can't work without Zookeeper
- Zookeeper by design operates with an odd number of servers (3, 5, 7)
- Zookeeper has a leader (handles writes); the rest of the servers are followers (handle reads)
- (Zookeeper does NOT store consumer offsets with Kafka > v0.10)
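The broker list that Zookeeper keeps can be inspected with the bundled zookeeper-shell tool. A sketch, assuming Zookeeper on the default localhost:2181:

```shell
:: List the broker ids registered in Zookeeper (a single-broker cluster shows [0])
zookeeper-shell.bat localhost:2181 ls /brokers/ids
```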
Kafka Broker Discovery
- Every Kafka broker is also called a "bootstrap server"
- That means that you only need to connect to one broker, and you will be connected to the entire cluster.
- Each broker knows about all brokers, topics and partitions (metadata)
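This is why every CLI tool only needs a --bootstrap-server argument: any one broker is enough, and the client discovers the rest of the cluster from its metadata. A sketch, assuming the single broker at localhost:9092 and an example topic first_topic:

```shell
:: Connecting to one broker is enough; the consumer fetches full cluster metadata from it
kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic first_topic --from-beginning
```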
Delivery semantics for consumers
- Consumers choose when to commit offsets.
- There are 3 delivery semantics:
- At most once:
- offsets are committed as soon as the message is received.
- If the processing goes wrong, the message will be lost (it won't be read again).
- At least once(usually preferred):
- offsets are committed after the message is processed.
- If the processing goes wrong, the message will be read again.
- This can result in duplicate processing of messages. Make sure your processing is idempotent (i.e. processing the messages again won't impact your systems)
- Exactly once:
- Can be achieved for Kafka => Kafka workflows using Kafka Streams API
- For Kafka => External System workflows, use an idempotent consumer.
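The offsets that consumers commit can be inspected from the CLI with kafka-consumer-groups. A sketch, assuming localhost:9092; the group name my-app is illustrative:

```shell
:: Show each partition's current committed offset, log-end offset, and lag for the group
kafka-consumer-groups.bat --bootstrap-server localhost:9092 --group my-app --describe

:: List all consumer groups known to the cluster
kafka-consumer-groups.bat --bootstrap-server localhost:9092 --list
```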