{"id":55724,"date":"2022-10-31T16:48:26","date_gmt":"2022-10-31T11:18:26","guid":{"rendered":"https:\/\/www.tothenew.com\/blog\/?p=55724"},"modified":"2022-11-08T17:42:32","modified_gmt":"2022-11-08T12:12:32","slug":"mirror-maker-for-kafka-migration","status":"publish","type":"post","link":"https:\/\/www.tothenew.com\/blog\/mirror-maker-for-kafka-migration\/","title":{"rendered":"Mirror Maker for Kafka Migration"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">For one of our Global Advertising Management Platform clients, we carried out a zero-downtime migration of components such as Platform DB, Ceph, Aerospike, Kafka (ZooKeeper + data nodes), MapR (Hive, Oozie, Hue), Druid (ZooKeeper + data nodes), Flink (ZooKeeper + data nodes), monitoring (Icinga, collectd, CloudWatch), logging (Logstash &amp; OpenSearch), and other components (Nexus, SFTP, Jenkins).<\/span><\/p>\n<p>To start off this migration blog, the Kafka component and its migration using MirrorMaker are explained below in detail.<\/p>\n<p><span style=\"font-weight: 400;\">MirrorMaker is a tool in Apache Kafka that replicates (mirrors) data between Kafka clusters. It should not be confused with the replication of data among the Kafka nodes of a single cluster. One use case is to maintain a replica of a complete Kafka cluster in another data center, so that other workloads can be served without impacting the original cluster.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">MirrorMaker has a consumer connector and a producer connector. The consumer reads data from topics in the source Kafka cluster, and the producer connector writes those events to the target Kafka cluster. 
The source cluster and target cluster are independent of each other.<\/span><\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-55726 size-full\" src=\"\/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-31-12-14-55.png\" alt=\"\" width=\"948\" height=\"587\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-31-12-14-55.png 948w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-31-12-14-55-300x186.png 300w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-31-12-14-55-768x476.png 768w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-31-12-14-55-624x386.png 624w\" sizes=\"(max-width: 948px) 100vw, 948px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Kafka&#8217;s mirroring feature makes it possible to maintain a replica of an existing Kafka cluster. The diagram above shows how the MirrorMaker tool mirrors a source Kafka cluster into a target (mirror) Kafka cluster. The tool uses a Kafka consumer to consume messages from the source cluster and re-publishes those messages to the local (target) cluster using an embedded Kafka producer.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For one of our customers, while migrating their entire system from an on-premises environment to the cloud (AWS), we used MirrorMaker because they needed the migration to happen with zero downtime for their existing pipeline. 
As shown in the diagram below, we created a separate AWS pipeline similar to the data center (DC) one and used MirrorMaker to copy data from the DC Kafka cluster to the AWS Kafka cluster.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-55725 size-full\" src=\"\/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-28-16-51-15.png\" alt=\"\" width=\"1132\" height=\"754\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-28-16-51-15.png 1132w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-28-16-51-15-300x200.png 300w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-28-16-51-15-1024x682.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-28-16-51-15-768x512.png 768w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-28-16-51-15-624x416.png 624w\" sizes=\"(max-width: 1132px) 100vw, 1132px\" \/><\/span><\/p>\n<p><b>Steps for setting up a mirror between the source and target Kafka clusters:<\/b><\/p>\n<ol>\n<li><span style=\"font-weight: 400;\">Create a list of the topics in the source cluster that need to be copied to the target cluster\n<p><\/span><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-55727 size-full\" src=\"\/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-18-19-16-19.png\" alt=\"\" width=\"846\" height=\"156\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-18-19-16-19.png 846w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-18-19-16-19-300x55.png 300w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-18-19-16-19-768x142.png 768w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-18-19-16-19-624x115.png 624w\" sizes=\"(max-width: 846px) 100vw, 846px\" \/><\/li>\n<li><span style=\"font-weight: 400;\">Create those topics on the target cluster with the same replication factor, partition count, and other 
configs\n<p><\/span><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-55728 size-full\" src=\"\/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-31-12-31-16.png\" alt=\"\" width=\"936\" height=\"61\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-31-12-31-16.png 936w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-31-12-31-16-300x20.png 300w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-31-12-31-16-768x50.png 768w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-31-12-31-16-624x41.png 624w\" sizes=\"(max-width: 936px) 100vw, 936px\" \/><\/li>\n<li>Create a MirrorMaker cluster sized for the data load of the topics it will mirror\n<ul>\n<li><span style=\"font-weight: 400;\">Use an appropriate instance type for the MirrorMaker cluster, depending on the topics it handles (this plays a vital role in reducing the lag between the source and target Kafka data)<\/span><\/li>\n<li><span style=\"font-weight: 400;\">In our case, we used multiple clusters, each configured for different topics, because the data load varied from topic to topic. For topics with a heavy data load, we used a 15-node cluster of c5.18x instances, whereas for topics with a lighter data load, we used a 2-node cluster of c5.18x instances.<\/span><\/li>\n<li><span style=\"font-weight: 400;\">Setting up a mirror is easy: simply start the MirrorMaker processes after bringing up the target cluster. At a minimum, MirrorMaker takes one or more consumer configurations, a producer configuration, and either a whitelist or a blacklist. 
You need to point the consumer to the source cluster&#8217;s ZooKeeper, and the producer to the mirror cluster&#8217;s ZooKeeper (or use the broker.list parameter).\n<p><\/span><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-55730 size-full\" src=\"\/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-19-18-13-32.png\" alt=\"\" width=\"1394\" height=\"71\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-19-18-13-32.png 1394w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-19-18-13-32-300x15.png 300w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-19-18-13-32-1024x52.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-19-18-13-32-768x39.png 768w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-19-18-13-32-624x32.png 624w\" sizes=\"(max-width: 1394px) 100vw, 1394px\" \/><\/li>\n<\/ul>\n<\/li>\n<li>Experiment with different config parameters to optimize the mirroring process.<\/li>\n<\/ol>\n<p><strong>\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a01. 
consumer.properties<\/strong><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 fetch.message.max.bytes=40240000<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 client.id=prod-mirrormaker-group-300322_001<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0group.id=prod-mirrormaker-group-310322_01<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0exclude.internal.topics=true<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0num.consumer.fetchers=300<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0fetch.max.wait.ms=500<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0fetch.min.bytes=186384<\/span><\/p>\n<p><strong>\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a02.\u00a0 producer.properties<\/strong><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 producer.type=async<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 queue.buffering.max.messages=2000<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 queue.buffering.max.ms=500<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 batch.num.messages=3000<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 send.buffer.bytes=1000000<\/span><\/p>\n<p><span 
style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 client.id=prod-mirrormaker-group-240322_001<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 compression.codec=gzip<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 request.required.acks=1<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 batch.size=1000<\/span><\/p>\n<p><span style=\"font-weight: 400;\">\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 buffer.memory=2000000000<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">How to check whether a mirror is keeping up:<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">The consumer offset checker tool is useful for gauging how well your mirror is keeping up with the source cluster. Note that the --zkconnect argument should point to the source cluster&#8217;s ZooKeeper (the DC cluster in this scenario). Also, if no topic is specified, the tool prints information for all topics under the given consumer group. 
For example:<\/span><\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-medium wp-image-55731\" src=\"\/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-19-18-24-00-300x17.png\" alt=\"\" width=\"300\" height=\"17\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-19-18-24-00-300x17.png 300w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-19-18-24-00-1024x58.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-19-18-24-00-768x43.png 768w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-19-18-24-00-624x35.png 624w, \/blog\/wp-ttn-blog\/uploads\/2022\/10\/Screenshot-from-2022-10-19-18-24-00.png 1078w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/p>\n<table>\n<tbody>\n<tr>\n<td>Group<\/td>\n<td>Topic<\/td>\n<td>Pid<\/td>\n<td>Offset<\/td>\n<td>logSize<\/td>\n<td>Lag<\/td>\n<td>Owner<\/td>\n<\/tr>\n<tr>\n<td>KafkaMirror<\/td>\n<td>test-topic<\/td>\n<td>0<\/td>\n<td>5<\/td>\n<td>5<\/td>\n<td>0<\/td>\n<td>none<\/td>\n<\/tr>\n<tr>\n<td>KafkaMirror<\/td>\n<td>test-topic<\/td>\n<td>1<\/td>\n<td>3<\/td>\n<td>4<\/td>\n<td>1<\/td>\n<td>none<\/td>\n<\/tr>\n<tr>\n<td>KafkaMirror<\/td>\n<td>test-topic<\/td>\n<td>2<\/td>\n<td>6<\/td>\n<td>9<\/td>\n<td>3<\/td>\n<td>none<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<div class=\"ap-custom-wrapper\"><\/div><!--ap-custom-wrapper-->","protected":false},"excerpt":{"rendered":"<p>For one of our Global Advertising Management Platform clients, we carried out a zero-downtime migration of components such as Platform DB, Ceph, Aerospike, Kafka (ZooKeeper + data nodes), MapR (Hive, Oozie, Hue), Druid (ZooKeeper + data nodes), Flink (ZooKeeper + data nodes), monitoring (Icinga, collectd, CloudWatch), logging (Logstash &amp; OpenSearch), and other components (Nexus, SFTP, Jenkins). 
[&hellip;]<\/p>\n","protected":false},"author":1506,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":61},"categories":[1174,1395,4308,3479,2348,1],"tags":[5039,5040],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/55724"}],"collection":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/users\/1506"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/comments?post=55724"}],"version-history":[{"count":6,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/55724\/revisions"}],"predecessor-version":[{"id":55752,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/55724\/revisions\/55752"}],"wp:attachment":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/media?parent=55724"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/categories?post=55724"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/tags?post=55724"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}