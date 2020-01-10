Apache Kafka is a distributed streaming platform. With its rich set of APIs (Application Programming Interfaces), we can connect almost anything to Kafka as a source of data, and on the other end, we can set up a large number of consumers that will receive the stream of records for processing. Kafka is highly scalable, and stores the streams of data in a reliable and fault-tolerant way. From the connectivity perspective, Kafka can serve as a bridge between many heterogeneous systems, which in turn can rely on its capabilities to transfer and persist the data provided.
In this tutorial we will install Apache Kafka on Red Hat Enterprise Linux 8, create the systemd unit files for ease of management, and test the functionality with the shipped command line tools.
- How to install Apache Kafka
- How to create systemd services for Kafka and Zookeeper
- How to test Kafka with command line clients
Software Requirements and Conventions Used
| Category | Requirements, Conventions or Software Version Used |
|---|---|
| System | Red Hat Enterprise Linux 8 |
| Software | Apache Kafka 2.1.0 (the kafka_2.11 build, for Scala 2.11) |
| Other | Privileged access to your Linux system as root or via the sudo command. |
| Conventions | # - requires given linux commands to be executed with root privileges, either directly as the root user or by use of the sudo command; $ - requires given linux commands to be executed as a regular non-privileged user |
How to install Kafka on Red Hat 8: step-by-step instructions
Apache Kafka is written in Java, so all we need is OpenJDK 8 installed to proceed with the installation. Kafka relies on Apache Zookeeper, a distributed coordination service that is also written in Java and is shipped with the package we will download. While installing HA (High Availability) services on a single node defeats their purpose, we'll install and run Zookeeper for Kafka's sake.
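Before going further, it may be worth confirming which Java version is on the machine. The following is a minimal sketch, not part of the original procedure: the helper names java_major_version and installed_java_version are made up for illustration, and the package name in the message (java-1.8.0-openjdk) is an assumption about how OpenJDK 8 is packaged on the system.

```shell
#!/bin/sh
# Hypothetical helper: reduce a Java version string to its major version.
# Legacy scheme: "1.8.0_191" -> 8; modern scheme: "11.0.2" -> 11.
java_major_version() {
    version="$1"
    case "$version" in
        1.*) version="${version#1.}" ;;   # drop the legacy "1." prefix
    esac
    # Keep everything before the first '.', '_' or '-'.
    echo "${version%%[._-]*}"
}

# Hypothetical helper: extract the quoted version from `java -version`,
# which prints its report on stderr.
installed_java_version() {
    java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

if command -v java >/dev/null 2>&1; then
    echo "Java major version: $(java_major_version "$(installed_java_version)")"
else
    # Package name is an assumption; adjust to your repositories.
    echo "java not found; install OpenJDK 8 first (for example java-1.8.0-openjdk)"
fi
```

If the reported major version is lower than 8, Kafka 2.1.0 is unlikely to start.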
- To download Kafka from the closest mirror, we need to consult the official download site. We can copy the URL of the .tgz file from there. We'll use wget with the pasted URL to download the package to the target machine:
# wget https://www-eu.apache.org/dist/kafka/2.1.0/kafka_2.11-2.1.0.tgz -O /opt/kafka_2.11-2.1.0.tgz
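Since the archive comes from a mirror, we may want to compare its checksum with the one published on the official download site before extracting it. The sketch below assumes the expected SHA-512 digest has been copied from there by hand; verify_sha512 is a hypothetical helper name, and sha512sum is assumed to be available (it is part of GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if FILE's SHA-512 digest equals EXPECTED.
# EXPECTED may contain spaces, newlines or upper-case hex digits;
# these are normalized before the comparison.
verify_sha512() {
    file="$1"
    expected=$(printf '%s' "$2" | tr -d ' \n\t' | tr 'A-F' 'a-f')
    actual=$(sha512sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Usage (the digest placeholder must be replaced with the published value):
#   verify_sha512 /opt/kafka_2.11-2.1.0.tgz "<digest from the download site>" \
#       && echo "checksum OK" || echo "checksum MISMATCH"
```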
- We enter the /opt directory, and extract the archive:
# cd /opt
# tar -xvf kafka_2.11-2.1.0.tgz
And create a symlink called /opt/kafka that points to the newly created /opt/kafka_2.11-2.1.0 directory to make our lives easier.
# ln -s /opt/kafka_2.11-2.1.0 /opt/kafka
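One benefit of the symlink is that a later upgrade only needs the link to be re-pointed. As a hedged sketch (switch_kafka_version is a hypothetical helper, and the -n option assumes GNU ln), this could look like:

```shell
#!/bin/sh
# Hypothetical helper: point LINK at TARGET_DIR, replacing an existing symlink.
# -f overwrites the old link; -n stops ln from following the old link
# and creating the new one inside the directory it points to.
switch_kafka_version() {
    target_dir="$1"
    link="$2"
    if [ ! -d "$target_dir" ]; then
        echo "no such directory: $target_dir" >&2
        return 1
    fi
    ln -sfn "$target_dir" "$link"
}

# Usage, as root, with the paths from this tutorial:
#   switch_kafka_version /opt/kafka_2.11-2.1.0 /opt/kafka
```

Anything that refers to /opt/kafka keeps working after such a switch, as long as the services are restarted.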
- We create a non-privileged user that will run both the zookeeper and kafka services.
# useradd kafka
- And set the new user as owner of the whole directory we extracted, recursively:
# chown -R kafka:kafka /opt/kafka*
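To double-check the result of the recursive chown, we could list anything that still belongs to someone else. This is only a sketch; not_owned_by is a hypothetical helper that wraps the standard find -user test.

```shell
#!/bin/sh
# Hypothetical helper: print every path under DIR that is NOT owned by OWNER.
# OWNER may be a user name or a numeric user ID. Empty output means
# the ownership change reached every file.
not_owned_by() {
    dir="$1"
    owner="$2"
    find "$dir" ! -user "$owner"
}

# Usage with the names from this tutorial (trailing slash makes find
# descend through the symlink):
#   not_owned_by /opt/kafka/ kafka
```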
- We create the unit file /etc/systemd/system/zookeeper.service with the following content:
[Unit]
Description=zookeeper
After=syslog.target network.target
[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
[Install]
WantedBy=multi-user.target
Note that we don't need to write the version number three times because of the symlink we created. The same applies to the next unit file for Kafka, /etc/systemd/system/kafka.service, that contains the following lines of configuration:
[Unit]
Description=Apache Kafka
Requires=zookeeper.service
After=zookeeper.service
[Service]
Type=simple
User=kafka
Group=kafka
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
[Install]
WantedBy=multi-user.target
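The two unit files differ only in their description, script names, properties file and dependency, so they could also be generated rather than typed. The following sketch assumes a hypothetical render_unit helper and an optional KAFKA_HOME variable (defaulting to /opt/kafka); neither is part of Kafka or systemd.

```shell
#!/bin/sh
# Hypothetical generator: print a unit file for a service shipped with Kafka.
# Arguments: description, script prefix, properties file, optional unit
# that must be started first.
render_unit() {
    description="$1"
    script="$2"          # e.g. zookeeper-server or kafka-server
    properties="$3"      # e.g. zookeeper.properties or server.properties
    depends_on="${4:-}"
    kafka_home="${KAFKA_HOME:-/opt/kafka}"

    echo "[Unit]"
    echo "Description=$description"
    if [ -n "$depends_on" ]; then
        echo "Requires=$depends_on"
        echo "After=$depends_on"
    else
        echo "After=syslog.target network.target"
    fi
    echo
    echo "[Service]"
    echo "Type=simple"
    echo "User=kafka"
    echo "Group=kafka"
    echo "ExecStart=$kafka_home/bin/$script-start.sh $kafka_home/config/$properties"
    echo "ExecStop=$kafka_home/bin/$script-stop.sh"
    echo
    echo "[Install]"
    echo "WantedBy=multi-user.target"
}

# Usage, as root:
#   render_unit zookeeper zookeeper-server zookeeper.properties \
#       > /etc/systemd/system/zookeeper.service
#   render_unit "Apache Kafka" kafka-server server.properties zookeeper.service \
#       > /etc/systemd/system/kafka.service
```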
- We need to reload systemd so that it reads the new unit files:
# systemctl daemon-reload
- Now we can start our new services (in this order):
# systemctl start zookeeper
# systemctl start kafka
If all goes well, systemd should report a running state for both services, similar to the outputs below:
# systemctl status zookeeper.service
zookeeper.service - zookeeper
   Loaded: loaded (/etc/systemd/system/zookeeper.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-01-10 20:44:37 CET; 6s ago
 Main PID: 11628 (java)
    Tasks: 23 (limit: 12544)
   Memory: 57.0M
   CGroup: /system.slice/zookeeper.service
           11628 java -Xmx512M -Xms512M -server [...]
# systemctl status kafka.service
kafka.service - Apache Kafka
   Loaded: loaded (/etc/systemd/system/kafka.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2019-01-10 20:45:11 CET; 11s ago
 Main PID: 11949 (java)
    Tasks: 64 (limit: 12544)
   Memory: 322.2M
   CGroup: /system.slice/kafka.service
           11949 java -Xmx1G -Xms1G -server [...]
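In a script, reading the status output by eye is not an option, so we might poll until each service reports as active. The sketch below uses a hypothetical wait_until helper; the commented usage relies on systemctl is-active --quiet, which returns success once the unit is active.

```shell
#!/bin/sh
# Hypothetical helper: run a command once per second until it succeeds
# or TIMEOUT seconds have passed. Returns 0 on success, 1 on timeout.
wait_until() {
    timeout="$1"
    shift
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Usage, as root (30 seconds is an arbitrary limit):
#   systemctl start zookeeper
#   wait_until 30 systemctl is-active --quiet zookeeper.service
#   systemctl start kafka
#   wait_until 30 systemctl is-active --quiet kafka.service
```

Note that an active unit only means the Java process is running; the broker may need a few more seconds before it accepts connections.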
- Optionally we can enable automatic start on boot for both services:
# systemctl enable zookeeper.service
# systemctl enable kafka.service
- To test functionality, we'll connect to Kafka with one producer and one consumer client. The messages provided by the producer should appear on the console of the consumer. But first we need a medium through which the two can exchange messages. We create a new channel of data, called a topic in Kafka's terms, where the producer will publish and to which the consumer will subscribe. We'll call the topic FirstKafkaTopic. We'll use the kafka user to create the topic:
$ /opt/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic FirstKafkaTopic
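Kafka rejects topic names that contain anything other than ASCII letters, digits, dots, underscores and hyphens, that are longer than 249 characters, or that are exactly "." or "..". If topic names come from user input, a check like the following sketch can catch mistakes before calling kafka-topics.sh; valid_topic_name is a hypothetical helper, not a Kafka tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed if NAME looks like a legal Kafka topic name.
valid_topic_name() {
    name="$1"
    [ -n "$name" ] || return 1              # must not be empty
    [ "${#name}" -le 249 ] || return 1      # length limit
    case "$name" in
        .|..) return 1 ;;                   # reserved names
        *[!A-Za-z0-9._-]*) return 1 ;;      # any character outside the set
    esac
    return 0
}

# Usage:
#   valid_topic_name FirstKafkaTopic && echo "name is fine"
```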
- We start a consumer client from the command line that will subscribe to the (at this point empty) topic created in the previous step:
$ /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic FirstKafkaTopic --from-beginning
We leave the console and the client running in it open. This console is where we will receive the message we publish with the producer client.
- On another terminal, we start a producer client, and publish some messages to the topic we created. We can query Kafka for available topics:
$ /opt/kafka/bin/kafka-topics.sh --list --zookeeper localhost:2181
FirstKafkaTopic
And connect to the one the consumer is subscribed to, then send a message:
$ /opt/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic FirstKafkaTopic
> new message published by producer from console #2
At the consumer terminal, the message should appear shortly:
$ /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic FirstKafkaTopic --from-beginning
new message published by producer from console #2
If the message appears, our test is successful, and our Kafka installation is working as intended. Many clients could produce and consume records of one or more topics the same way, even with the single node setup we created in this tutorial.
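The manual producer and consumer check can also be scripted as a single smoke test. The sketch below is an assumption-laden illustration: kafka_smoke_test and the KAFKA_HOME variable are hypothetical, the broker is assumed to listen on localhost:9092 as in this tutorial, and the 10 second --timeout-ms value, which makes the console consumer exit once no more messages arrive, is arbitrary.

```shell
#!/bin/sh
# Hypothetical helper: publish one unique message to TOPIC and succeed
# only if the console consumer reads it back.
kafka_smoke_test() {
    topic="$1"
    bin="${KAFKA_HOME:-/opt/kafka}/bin"
    message="smoke-test-$$"    # unique per shell process

    # Publish: the console producer reads messages from standard input.
    echo "$message" | "$bin/kafka-console-producer.sh" \
        --broker-list localhost:9092 --topic "$topic" >/dev/null

    # Consume everything in the topic and look for the exact message line.
    "$bin/kafka-console-consumer.sh" --bootstrap-server localhost:9092 \
        --topic "$topic" --from-beginning --timeout-ms 10000 2>/dev/null \
        | grep -qxF "$message"
}

# Usage, as the kafka user:
#   kafka_smoke_test FirstKafkaTopic && echo "Kafka round trip OK"
```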