I am new to Kafka and trying to do a project. I want to set it up the way it would be done in real life, but I am a bit confused. While searching the internet I found that if I want 3 brokers and 3 ZooKeepers, to provide replication factor = 2 and quorum, I need 6 EC2 instances. I have been looking through YouTube for examples, but as far as I can see all of them show multiple brokers on one cluster. From my understanding it's better to keep the ZKs and all the brokers separate, each on its own VM, so that if one goes down I still have all the rest. Can you confirm that?
Also, I am wondering how to set partitioning. Is it important to get it right when creating a topic, or can I change it later when I need to scale?
Thanks in advance
A slightly tangential point to note here is that going forward KRaft is the preferred mode and ZK is deprecated. So you can actually ignore those 3 nodes for ZK and just run broker nodes.
(comment by brahmana, 2025-01-28 10:37)
1 Answer
"better to keep ZKs and all brokers separately"
Yes, that is correct.
You're seeing a tension between resources used for a teaching
example and for a production setup.
You need an odd number of ZKs, greater than 1, for ZooKeeper's
consensus protocol (ZAB, which is Paxos-like) to offer
meaningful quorum guarantees.
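The "odd number" advice falls out of majority-quorum arithmetic. Here is a minimal sketch of that arithmetic, assuming a simple strict-majority rule; the function names are made up for illustration and are not part of any ZooKeeper or Kafka API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Compare ensemble sizes 1 through 6.
    for n in range(1, 7):
        print(f"nodes={n} quorum={quorum_size(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

Under this rule, 3 nodes tolerate one failure and so do 4, so the extra even-numbered node adds cost without adding fault tolerance. A 2-node ensemble tolerates no failures at all, which is why more than 1 in practice means at least 3.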
In a demo / teaching setup, we might choose to have all
instances run on a single host.
They can still communicate amongst themselves and reach
consensus, but clearly if that host reboots there will
be common mode failure that impacts cluster availability.
In a small cluster, a broker might run on the same node as a ZK.
In a larger production config, we might want to improve ZK stability by banishing brokers to specialized nodes.