
Sunday, August 18, 2024

Hands On: Quorum Setup

1. Create an AMI (image) from the existing machine

2. Create 2 other machines, and launch Zookeeper on them

3. Test that the Quorum is running and working



nano /home/ubuntu/kafka/config/zookeeper.properties
# the location to store the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database.
dataDir=/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
# the basic time unit in milliseconds used by ZooKeeper. It is used to do heartbeats and the minimum session timeout will be twice the tickTime.
tickTime=2000
# The number of ticks that the initial synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# zoo servers
# these hostnames such as `zookeeper1` come from the /etc/hosts file
server.1=zookeeper1:2888:3888
server.2=zookeeper2:2888:3888
server.3=zookeeper3:2888:3888
bin/zookeeper-server-start.sh config/zookeeper.properties
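
Each `server.N` line above is tied to a machine by the number stored in that machine's `myid` file under `dataDir`, so the two must agree. A minimal sketch that looks up the right number for a hostname, assuming the properties layout shown above. The helper name `myid_for_host` is hypothetical and not part of Kafka or Zookeeper, and the hostname is used as a plain pattern, so names containing regex characters would need escaping.

```shell
#!/bin/sh
# myid_for_host FILE HOST  (hypothetical helper, for illustration only)
# Print the N of the "server.N=HOST:..." line in FILE.
# Returns non-zero if HOST has no server.N entry.
myid_for_host() {
    props_file=$1
    host=$2
    # Keep only the digits between "server." and "=HOST:".
    id=$(sed -n "s/^server\.\([0-9][0-9]*\)=$host:.*/\1/p" "$props_file" | head -n 1)
    if [ -z "$id" ]; then
        echo "no server.N entry for $host in $props_file" >&2
        return 1
    fi
    echo "$id"
}
```

Possible usage on the second machine: `id=$(myid_for_host /home/ubuntu/kafka/config/zookeeper.properties zookeeper2) && echo "$id" > /data/zookeeper/myid`. Capturing the id first avoids truncating `myid` when the lookup fails.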





sudo mkdir -p /data/zookeeper
sudo chown -R ubuntu:ubuntu /data/
# declare the server's identity
echo "1" > /data/zookeeper/myid
# edit the zookeeper settings
rm /home/ubuntu/kafka/config/zookeeper.properties
nano /home/ubuntu/kafka/config/zookeeper.properties
# restart the zookeeper service
sudo service zookeeper stop
sudo service zookeeper start
# observe the logs - need to do this on every machine
cat /home/ubuntu/kafka/logs/zookeeper.out | head -100
nc -vz localhost 2181
nc -vz localhost 2888
nc -vz localhost 3888
echo "ruok" | nc localhost 2181 ; echo
echo "stat" | nc localhost 2181 ; echo
bin/zookeeper-shell.sh localhost:2181
# not happy: commands such as ls fail until a majority of the quorum is running
ls /
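
Once the quorum is up, the `stat` output used above contains a `Mode:` line, which should read `leader` on one server and `follower` on the others. A small sketch that pulls out that field. The function name `zk_mode` is hypothetical, and it reads from stdin so it can be fed by the `nc` command already shown.

```shell
#!/bin/sh
# zk_mode  (hypothetical helper, for illustration only)
# Read the output of the "stat" four-letter word on stdin and print the
# value of its "Mode:" line (leader, follower, or standalone).
# Returns non-zero when no Mode line is present, for example when the
# server is not serving requests yet.
zk_mode() {
    mode=$(sed -n 's/^Mode: *//p' | head -n 1)
    [ -n "$mode" ] || return 1
    echo "$mode"
}
```

Possible usage: `echo "stat" | nc localhost 2181 | zk_mode`, repeated on each machine (or with `zookeeper1`, `zookeeper2`, `zookeeper3` in place of `localhost`).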







Saturday, August 17, 2024

Hands-On: Using Zookeeper Command Line Interface

1. Create nodes, sub nodes, etc...

2. Get / Set data for a node

3. Watch a node

4. Delete a node


# start zookeeper
sudo service zookeeper start
# verify it's started
nc -vz localhost 2181

bin/zookeeper-shell.sh localhost:2181
# display help
help
# display root
ls /
create /my-node "foo"


ls /
get /my-node
set /my-node "new data"
create /my-node/deeper-node "bar"
ls /
ls /my-node
ls /my-node/deeper-node
get /my-node/deeper-node
rmr /my-node/deeper-node
rmr /my-node
ls /
# create a watcher
create /node-to-watch ""
get /node-to-watch true
set /node-to-watch "new data"
set /node-to-watch "whatever"


Hands On: Single Machine Setup

1. SSH into our machine

2. Install some necessary (java) and helpful packages on the machine

3. Disable RAM Swap

4. Add hosts mapping from hostname to private IPs to /etc/hosts

5. Download & Configure Zookeeper on the machine

6. Launch Zookeeper on the machine to test

7. Set up Zookeeper as a service on the machine


sudo apt-get update && \
sudo apt-get -y install wget ca-certificates zip net-tools vim nano tar netcat
sudo apt-get -y install openjdk-8-jdk
java -version
sudo sysctl vm.swappiness=1
echo 'vm.swappiness=1' | sudo tee --append /etc/sysctl.conf
cat /etc/hosts
echo "172.31.9.1 kafka1
172.31.9.1 zookeeper1
172.31.19.230 kafka2
172.31.19.230 zookeeper2
172.31.35.20 kafka3
172.31.35.20 zookeeper3" | sudo tee --append /etc/hosts
ping kafka1
ping kafka2
wget https://archive.apache.org/dist/kafka/0.10.2.1/kafka_2.12-0.10.2.1.tgz
tar -xvzf kafka_2.12-0.10.2.1.tgz
rm kafka_2.12-0.10.2.1.tgz
mv kafka_2.12-0.10.2.1 kafka
cd kafka/
cat config/zookeeper.properties
bin/zookeeper-server-start.sh config/zookeeper.properties
# Testing Zookeeper install
# Start Zookeeper in the background
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
bin/zookeeper-shell.sh localhost:2181
ls /
# demonstrate the use of a 4 letter word
echo "ruok" | nc localhost 2181 ; echo
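
The `/etc/hosts` step above appends with `tee --append`, so running it a second time (for example on a machine launched from the AMI) duplicates every entry. A sketch of an idempotent variant, assuming one `IP NAME` pair per line. The helper name `add_host_entry` is hypothetical.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME  (hypothetical helper, for illustration only)
# Append "IP NAME" to FILE unless that exact line is already present.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # -x: match the whole line, -F: fixed string (no regex), -q: quiet
    if grep -qxF "$ip $name" "$hosts_file" 2>/dev/null; then
        return 0
    fi
    echo "$ip $name" >> "$hosts_file"
}
```

Possible usage: `add_host_entry /etc/hosts 172.31.9.1 zookeeper1`. The function writes the file directly, so it has to run from a root shell when the target is `/etc/hosts`.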


sudo nano /etc/init.d/zookeeper
#!/bin/sh
#
# zookeeper Start/Stop zookeeper
#
# chkconfig: - 99 10
# description: Standard script to start and stop zookeeper

DAEMON_PATH=/home/ubuntu/kafka/bin
DAEMON_NAME=zookeeper

PATH=$PATH:$DAEMON_PATH

# See how we were called.
case "$1" in
start)
# Start daemon.
pid=`ps ax | grep -i 'org.apache.zookeeper' | grep -v grep | awk '{print $1}'`
if [ -n "$pid" ]
then
echo "Zookeeper is already running";
else
echo "Starting $DAEMON_NAME";
$DAEMON_PATH/zookeeper-server-start.sh -daemon /home/ubuntu/kafka/config/zookeeper.properties
fi
;;
stop)
echo "Shutting down $DAEMON_NAME";
$DAEMON_PATH/zookeeper-server-stop.sh
;;
restart)
$0 stop
sleep 2
$0 start
;;
status)
pid=`ps ax | grep -i 'org.apache.zookeeper' | grep -v grep | awk '{print $1}'`
if [ -n "$pid" ]
then
echo "Zookeeper is Running as PID: $pid"
else
echo "Zookeeper is not Running"
fi
;;
*)
echo "Usage: $0 {start|stop|restart|status}"
exit 1
esac

exit 0
sudo chmod +x /etc/init.d/zookeeper
sudo chown root:root /etc/init.d/zookeeper
sudo update-rc.d zookeeper defaults
sudo service zookeeper stop
nc -vz localhost 2181
sudo service zookeeper start
sudo service zookeeper status
nc -vz localhost 2181
echo "ruok" | nc localhost 2181 ; echo
cat logs/zookeeper.out

Friday, August 16, 2024

How to SSH into our machine

1. Windows: Install PuTTY

https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html

2. Mac/Linux: You have OpenSSH






Hands On: AWS Setup

1. Create an AWS Account
2. Set up network security to allow Zookeeper ports (2181, 2888, 3888)
3. Set up network security to allow my IP only
4. Create 1 EC2 machine: Ubuntu image, t2.medium (4 GB RAM)
5. Reserve 3 private IPs for our machines
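
Steps 2 and 3 can also be done from the command line instead of the console. A rough sketch, assuming the AWS CLI is installed and a security group already exists. The helper name `print_zk_ingress_rules` is hypothetical, and it only prints the commands so they can be reviewed before running.

```shell
#!/bin/sh
# print_zk_ingress_rules GROUP_ID CIDR  (hypothetical helper, for illustration only)
# Print, without running, one AWS CLI command per Zookeeper port that
# would allow inbound TCP traffic from CIDR into the security group.
print_zk_ingress_rules() {
    group_id=$1
    cidr=$2
    for port in 2181 2888 3888; do
        echo "aws ec2 authorize-security-group-ingress --group-id $group_id --protocol tcp --port $port --cidr $cidr"
    done
}
```

Possible usage: `print_zk_ingress_rules sg-0123456789abcdef0 203.0.113.4/32`, where both the group ID and the address are placeholders to be replaced with the real security group and your own IP.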










Zookeeper configuration

1. Zookeeper configuration can be very tricky to optimize and really depends on how your Kafka cluster is formed, as well as your network environment


2. We are going to set the most common settings for Zookeeper and discuss some more advanced settings

# the location to store the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database.
dataDir=/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
# the basic time unit in milliseconds used by ZooKeeper. It is used to do heartbeats and the minimum session timeout will be twice the tickTime.
tickTime=2000
# The number of ticks that the initial synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# zoo servers
# these hostnames such as `zookeeper1` come from the /etc/hosts file
server.1=zookeeper1:2888:3888
server.2=zookeeper2:2888:3888
server.3=zookeeper3:2888:3888
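
`initLimit` and `syncLimit` are counted in ticks, not milliseconds, so the real timeouts depend on `tickTime`. With the values above that gives 10 x 2000 = 20000 ms for initial sync and 5 x 2000 = 10000 ms for request acknowledgement. A sketch that does the conversion for any properties file, assuming all three keys are present. The helper names `prop` and `zk_timeouts` are hypothetical.

```shell
#!/bin/sh
# prop FILE KEY  (hypothetical helper, for illustration only)
# Print the value of the first "KEY=value" line in FILE.
prop() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# zk_timeouts FILE  (hypothetical helper, for illustration only)
# Convert initLimit and syncLimit from ticks to milliseconds and print
# the minimum session timeout (twice the tickTime).
zk_timeouts() {
    tick_ms=$(prop "$1" tickTime)
    init_ticks=$(prop "$1" initLimit)
    sync_ticks=$(prop "$1" syncLimit)
    echo "init_ms=$(( $tick_ms * $init_ticks ))"
    echo "sync_ms=$(( $tick_ms * $sync_ticks ))"
    echo "min_session_ms=$(( $tick_ms * 2 ))"
}
```

Possible usage: `zk_timeouts /home/ubuntu/kafka/config/zookeeper.properties`.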

Zookeeper Architecture: Quorum sizing

1. Zookeeper needs a strict majority of its servers to be up so that votes can reach a majority

2. Therefore Zookeeper quorums have an odd number of servers: 1, 3, 5, 7, 9, ..., (2N+1)

3. This allows 0, 1, 2, 3, 4, ..., N servers, respectively, to go down
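
The arithmetic behind these numbers is short enough to check by hand or in a shell. A minimal sketch follows; the function names `quorum_majority` and `quorum_tolerated_failures` are hypothetical. It also shows why even sizes do not help: 4 servers tolerate only 1 failure, the same as 3.

```shell
#!/bin/sh
# quorum_majority N  (hypothetical helper, for illustration only)
# Smallest number of servers that forms a strict majority of N.
quorum_majority() {
    echo $(( $1 / 2 + 1 ))
}

# quorum_tolerated_failures N  (hypothetical helper, for illustration only)
# Number of servers that can be down while a strict majority is still up.
quorum_tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}
```

Possible usage: `for n in 1 3 4 5; do echo "$n servers: majority $(quorum_majority $n), tolerates $(quorum_tolerated_failures $n) down"; done`.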