Best Practices for Kafka Deployment

What is this article about?

In this blog, I’ll walk through the key takeaways for deploying your own Kafka cluster, whether on-premise or in the cloud. It covers how to keep logs configurable and manageable, so that a growing pile of logs doesn’t turn into a management problem at scale. It looks at hardware requirements, which those of us new to Kafka tend to overestimate and worry about more than Kafka actually demands (more on this later). It asks how to get the most out of ZooKeeper and how many ZooKeeper nodes are actually needed in development and in production. It also covers how to use parallel processing, and how to configure and isolate Kafka with security in mind. Last but not least, it covers how to keep network latency low and how to set up monitoring and alerts. These are the questions that naturally come up when you are starting out.

The following are the key takeaways. 

• One of Kafka's greatest advantages is its design: the architecture lets us, as developers, pick inexpensive commodity hardware and still get good results.

• ZooKeeper is sensitive to latency, so it needs good network bandwidth to the brokers and fast storage for writing and reading its logs. Isolating the ZooKeeper process and disabling swap can help reduce latency.

• Most production environments need a replication factor of at least 2; increasing it to 3 helps achieve better fault tolerance.

• A larger number of partitions gives better parallelization and throughput, but at the cost of higher replication latency, longer rebalances, and more open file handles on the brokers.

• Monitor system metrics such as network throughput, open file handles, memory, load, disk usage, and heap usage.
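
To make the replication takeaway a bit more concrete, here is a minimal Python sketch of the arithmetic behind it. It is an illustration under stated assumptions, not a sizing tool: the function name `tolerated_broker_failures` is hypothetical, and it brings in two Kafka settings not discussed above, the topic-level `min.insync.replicas` and producers writing with `acks=all`. It also assumes each replica of a partition lives on a different broker.

```python
def tolerated_broker_failures(replication_factor: int, min_insync_replicas: int = 1) -> dict:
    """Estimate how many replica failures a single partition can absorb.

    Hypothetical helper for illustration only. Assumes every replica sits on
    a different broker and that producers use acks=all.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("min_insync_replicas must be between 1 and replication_factor")

    return {
        # At least one copy of the data survives this many failures.
        "failures_before_all_copies_lost": replication_factor - 1,
        # acks=all writes keep succeeding while enough replicas stay in sync.
        "failures_while_writes_still_accepted": replication_factor - min_insync_replicas,
    }


if __name__ == "__main__":
    # Compare the two replication factors mentioned in the takeaways.
    for rf, min_isr in [(2, 1), (2, 2), (3, 2)]:
        result = tolerated_broker_failures(rf, min_isr)
        print(f"replication factor {rf}, min.insync.replicas {min_isr}: {result}")
```

Under these assumptions, a replication factor of 2 with `min.insync.replicas` set to 2 cannot lose any broker without blocking `acks=all` writes, while a replication factor of 3 with the same setting can lose one. That is one way to see why moving from 2 to 3 buys extra fault tolerance.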

