How to set up Kafka in a Docker container

At Calcey, we recently found ourselves linking a legacy system with a new information system on behalf of a client. To avoid complications, we explored the possibility of deploying Kafka within a Docker container.

What is Kafka?

Kafka is an open-source, fault-tolerant event streaming platform. Kafka can help bridge the information gap between legacy systems and newer systems. Imagine a situation where you have a newer, better system that needs data from an older, legacy system. Kafka can stream this data to the new system without the developer having to build a direct, point-to-point integration between the two.

Kafka, therefore, will behave as an intermediary layer between the two systems.

In order to speed things up, we recommend using a ‘Docker container’ to deploy Kafka. For the uninitiated, a ‘Docker container’ is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries, and settings.

To deploy Kafka, three pieces of the puzzle need to fall into place: a ZooKeeper server, a Kafka server, and a connector to the data source. In addition, we will be making use of SQL Server’s Change Data Capture (CDC) feature to feed data into Kafka. CDC records any insert, update, and delete activity applied to a SQL Server table, and makes the details of those changes available in an easily consumed relational format. But that’s a topic for another day.
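For reference, CDC is switched on per database and then per table using SQL Server’s sys.sp_cdc_enable_db and sys.sp_cdc_enable_table stored procedures. The sketch below runs them through sqlcmd; the database (inventory), table (dbo.customers), and credentials are placeholders for your own environment.

    # Enable CDC at the database level (requires sysadmin rights)
    sqlcmd -S localhost -U sa -P 'YourPassword!' -d inventory \
      -Q "EXEC sys.sp_cdc_enable_db"

    # Enable CDC on each table whose changes should flow into Kafka
    sqlcmd -S localhost -U sa -P 'YourPassword!' -d inventory \
      -Q "EXEC sys.sp_cdc_enable_table @source_schema = 'dbo', @source_name = 'customers', @role_name = NULL"

SQL Server then starts writing change rows into CDC system tables, which is what the connector will read from.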

The easiest way to set all this up is to use Debezium. We recommend the Debezium images, which you can set up by following the tutorial at https://debezium.io/docs/tutorial/. These images allow you to bring up ZooKeeper, Kafka, and the connector service in one go.
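As a rough sketch of what the tutorial walks you through, the three services can be started from the Debezium images as shown below; the 1.9 tag, container names, and topic names are examples, so use whatever the tutorial currently references.

    # 1. Start ZooKeeper
    docker run -d --name zookeeper -p 2181:2181 -p 2888:2888 -p 3888:3888 debezium/zookeeper:1.9

    # 2. Start Kafka, linked to ZooKeeper
    docker run -d --name kafka -p 9092:9092 --link zookeeper:zookeeper debezium/kafka:1.9

    # 3. Start Kafka Connect, the service that hosts the Debezium connectors
    docker run -d --name connect -p 8083:8083 \
      -e GROUP_ID=1 \
      -e CONFIG_STORAGE_TOPIC=connect_configs \
      -e OFFSET_STORAGE_TOPIC=connect_offsets \
      -e STATUS_STORAGE_TOPIC=connect_statuses \
      --link kafka:kafka debezium/connect:1.9

The three storage topics hold the Connect service’s own configuration, offsets, and status; their names are arbitrary as long as they are unique per Connect cluster.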

With both ZooKeeper and Kafka now set up, all you have to do is tell Kafka where your data is located. To do so, you connect Kafka to a data source by means of a ‘connector’. While there is a wide range of connectors to choose from, we opted for the SQL Server connector image created by Debezium. Once a connection is established with the data source, pointing the connector back at the Kafka server ensures that every change is captured and persisted to a Kafka topic.
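Concretely, ‘pointing the connector back at Kafka’ means POSTing a JSON configuration to the Connect service’s REST API on port 8083. The sketch below uses the Debezium SQL Server connector’s documented properties; the hostname, credentials, database, and table names are placeholders, and a few property names differ slightly between Debezium versions.

    curl -X POST -H "Content-Type: application/json" localhost:8083/connectors -d '{
      "name": "inventory-connector",
      "config": {
        "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
        "database.hostname": "sqlserver",
        "database.port": "1433",
        "database.user": "sa",
        "database.password": "YourPassword!",
        "database.dbname": "inventory",
        "database.server.name": "legacy",
        "table.include.list": "dbo.customers",
        "database.history.kafka.bootstrap.servers": "kafka:9092",
        "database.history.kafka.topic": "schema-changes.inventory"
      }
    }'

Each captured change then lands on a topic named after the server and table (legacy.dbo.customers in this example). You can tail that topic with the watch-topic script bundled in the debezium/kafka image to confirm the pipeline is flowing:

    docker run -it --rm --name watcher --link zookeeper:zookeeper --link kafka:kafka \
      debezium/kafka:1.9 watch-topic -a -k legacy.dbo.customers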

And that’s all there is to deploying Kafka in a Docker container!