Lesson 1: AWS Building Blocks
Lesson 2: AWS Global Infrastructure
Module 1: AWS Overview
Module 2: AWS Identity and Access Management
Module 3: AWS Network Services
Module 4: AWS Compute Services
Module 5: AWS Storage Services
Module 6: AWS Database Services
Module 7: AWS High Availability Services
Module 8: AWS Analytics Services
Module 9: AWS Management Tools
Module 10: AWS Monitoring and Automation Services
Module 11: AWS Security Services
Module 12: AWS Developer Services
Module 13: AWS Billing and Cost Management
Module 14: Course Wrap-Up And Next Steps
1. Brokers have defaults for all the topic configuration parameters
2. These parameters impact performance and topic behavior
3. Some topics may need different values than the defaults
- Replication Factor
- Number of partitions
- Message size
- Compression level
- Log Cleanup Policy
- Min Insync Replicas
- Other configurations
4. A list of configurations can be found at:
https://kafka.apache.org/documentation/#brokerconfigs
.\kafka-topics.bat --bootstrap-server 127.0.0.1:9092 --list
PS C:\kafka_2.12-3.7.0> .\bin\windows\kafka-topics.bat --bootstrap-server 127.0.0.1:9092 --create --topic configured-topic --partitions 3 --replication-factor 1
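The relationship between broker defaults and per-topic values can be modelled in a few lines. This is a minimal Python sketch, not Kafka code: it does not connect to a cluster, and `BROKER_DEFAULTS`, `effective_topic_config` and the values shown are hypothetical and purely illustrative. Only the keys (`min.insync.replicas`, `cleanup.policy`, `compression.type`, `max.message.bytes`) are real topic-level configuration names.

```python
# Illustrative model of how a topic-level override takes precedence over
# the broker-wide default. This does not talk to a Kafka cluster.

# Hypothetical broker-wide defaults, keyed by topic-level config name.
# The values are illustrative; check your own broker for the real ones.
BROKER_DEFAULTS = {
    "min.insync.replicas": "1",
    "cleanup.policy": "delete",
    "compression.type": "producer",
    "max.message.bytes": "1048588",
}


def effective_topic_config(overrides):
    """Return the config a topic ends up with: defaults plus overrides."""
    unknown = set(overrides) - set(BROKER_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown topic config(s): {sorted(unknown)}")
    merged = dict(BROKER_DEFAULTS)
    merged.update(overrides)
    return merged


# A topic that needs stronger durability and log compaction.
configured_topic = effective_topic_config(
    {"min.insync.replicas": "2", "cleanup.policy": "compact"}
)
```

Settings that are not overridden, such as `compression.type` here, keep the broker default.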
1. Kafka can only operate well in a single region
2. Therefore, it is very common for enterprises to have Kafka clusters across the world, with some level of replication between them
3. A replication application at its core is just a consumer + a producer
4. There are different tools to perform it:
- Mirror Maker - open source tool that ships with Kafka
- Netflix uses Flink - they wrote their own application
- Uber uses uReplicator - addresses performance and operations issues with MM
- Comcast has their own open source Kafka Connect Source
- Confluent has their own Kafka Connect Source(paid)
5. Overall, try these and see if they work for your use case before writing your own
6. There are two designs for cluster replication:
7. Active => Active:
- You have a global application
- You have a global dataset
8. Active => Passive:
- You want to have an aggregation cluster (for example for analytics)
- You want to create some form of disaster recovery strategy (it's hard)
- Cloud Migration (from on-premise cluster to Cloud cluster)
9. Replicating doesn't preserve offsets, just data!
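The points that a replicator is just a consumer plus a producer, and that offsets are not preserved, can be shown with an in-memory sketch. This is an assumption-laden illustration in Python, not how Mirror Maker or any other tool listed above is implemented: `Log` and `replicate` are hypothetical names, a single partition is modelled, and no Kafka client library is used.

```python
# In-memory sketch of cross-cluster replication: consume from a source
# log, produce to a destination log. No Kafka client library is used.


class Log:
    """A single-partition, append-only log that assigns offsets."""

    def __init__(self, next_offset=0):
        self.records = []  # list of (offset, value) pairs
        self.next_offset = next_offset

    def append(self, value):
        """Producer side: store the value and return its new offset."""
        offset = self.next_offset
        self.records.append((offset, value))
        self.next_offset += 1
        return offset

    def read_from(self, offset):
        """Consumer side: return all records at or after the offset."""
        return [(o, v) for o, v in self.records if o >= offset]


def replicate(source, destination, from_offset=0):
    """Copy records and return a source-offset -> destination-offset map."""
    offset_map = {}
    for src_offset, value in source.read_from(from_offset):
        offset_map[src_offset] = destination.append(value)
    return offset_map


# Assume the source partition has been running for a while, so its next
# record lands at offset 100; the destination starts empty.
source = Log(next_offset=100)
for value in ["a", "b", "c"]:
    source.append(value)

destination = Log()
offset_map = replicate(source, destination)
```

The values arrive in the same order, but each record sits at a different offset on the destination, so consumer offsets committed on the source cannot be reused as-is on the other cluster.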
1. Kafka Security is fairly new (0.10.0)
2. Kafka Security improves over time and becomes more flexible and easier to set up.
3. Currently, it is hard to set up Kafka Security.
4. The best support for Kafka Security in applications is with Java
1. You can mix
- Encryption
- Authentication
- Authorisation
2. This allows your Kafka clients to:
- Communicate securely with Kafka
- Authenticate against Kafka
- Be authorised by Kafka to read / write to topics
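On the client side, the mix of encryption and authentication typically shows up as the `security.protocol` setting. The sketch below, in Python, maps the two choices to the four standard protocol values. It is a simplification under stated assumptions: `security_protocol` is a hypothetical helper, and authentication here means SASL only.

```python
# Sketch: which `security.protocol` value matches a chosen mix of
# encryption and SASL authentication. The function name is hypothetical.


def security_protocol(encrypt, sasl_auth):
    """Map the two client-side choices to a security.protocol value."""
    if encrypt and sasl_auth:
        return "SASL_SSL"  # encrypted channel, SASL authentication
    if encrypt:
        return "SSL"  # encrypted channel, no SASL
    if sasl_auth:
        return "SASL_PLAINTEXT"  # SASL authentication, no encryption
    return "PLAINTEXT"  # neither encryption nor authentication


# Example: a client that wants both encryption and authentication.
client_config = {"security.protocol": security_protocol(True, True)}
```

Authorisation is enforced by the brokers, so it does not appear in this client setting.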
1. Authentication in Kafka ensures that only clients that can prove their identity can connect to our Kafka Cluster
2. This is a similar concept to a login (username / password)
3. Authentication in Kafka can take a few forms