{"id":29060,"date":"2015-10-28T00:15:49","date_gmt":"2015-10-27T18:45:49","guid":{"rendered":"http:\/\/www.tothenew.com\/blog\/?p=29060"},"modified":"2016-01-19T13:29:02","modified_gmt":"2016-01-19T07:59:02","slug":"zookeeper-leader-election-simplified","status":"publish","type":"post","link":"https:\/\/www.tothenew.com\/blog\/zookeeper-leader-election-simplified\/","title":{"rendered":"ZooKeeper Leader-Election simplified"},"content":{"rendered":"<h4>Background<\/h4>\n<p>Recently, in a project, we needed to designate one master node from a pool of similar nodes. If the master node fails, another node should take over the responsibility so that the service remains available.<br \/>\nIn other words, only a single node should act as the master at any time, and it coordinates with all the worker nodes to process the required tasks.<\/p>\n<p>Clearly, this is a leader-election recipe. It is supported by the Curator\/ZooKeeper API, but we found it a little complex: we had to keep a thread blocked for as long as a node wanted to hold the leadership, and otherwise it did not work well for us.<\/p>\n<h3>Solution<\/h3>\n<p>So we worked out a very simple way to determine the master node. Here are the steps:<\/p>\n<ul>\n<li>Every node registers itself under a specific ZK path, say \u201c\/my\/project\/coordinators\/${machine-ip-address}\u201d.<\/li>\n<li>Example: if Node1 (running on IP 127.0.0.1) joins the cluster, its ZK node path will be \u201c\/my\/project\/coordinators\/127.0.0.1\u201d. We picked the IP address, but you can follow any naming convention that is unique per node.<\/li>\n<li>Write an algorithm that determines the leader\/master node:\n<ul>\n<li>Every coordinator node reads the names of all the child nodes (i.e. 
list of IPs).<\/li>\n<li>Sort the node names and pick the very first one; declare it the master node.<\/li>\n<li>Below is the code snippet that determines whether a given node is the master node:\n<p>[code]<br \/>\nimport java.util.List;<br \/>\nimport java.util.TreeSet;<br \/>\nimport org.apache.curator.RetryPolicy;<br \/>\nimport org.apache.curator.framework.CuratorFramework;<br \/>\nimport org.apache.curator.framework.CuratorFrameworkFactory;<br \/>\nimport org.apache.curator.retry.ExponentialBackoffRetry;<\/p>\n<p>\/\/ Returns true if nodeId sorts first among the registered coordinator nodes.<br \/>\npublic static boolean isMasterCoordinatorNode(String nodeId) throws Exception {<br \/>\n        \/\/ ZooKeeper paths must be absolute, i.e. start with &quot;\/&quot;<br \/>\n        List&lt;String&gt; coordinatorNodes = getCuratorClient().getChildren().forPath(&quot;\/my\/project\/coordinators&quot;);<br \/>\n        if (coordinatorNodes != null &amp;&amp; !coordinatorNodes.isEmpty())<br \/>\n        {<br \/>\n            \/\/ TreeSet sorts the names lexicographically; any order works as long as every node uses the same one<br \/>\n            TreeSet&lt;String&gt; set = new TreeSet&lt;String&gt;(coordinatorNodes);<br \/>\n            String firstNodeId = set.first();<br \/>\n            if(firstNodeId.equals(nodeId)){<br \/>\n                return true;<br \/>\n            }<br \/>\n        }<br \/>\n        return false;<br \/>\n    }<\/p>\n<p>private static CuratorFramework _client; \/\/ shared client, created lazily<\/p>\n<p>public synchronized static CuratorFramework getCuratorClient(){<br \/>\n        if(_client == null){<br \/>\n            String zookeeperStr = &quot;127.0.0.1:2181&quot;; \/\/ zookeeper address<br \/>\n            RetryPolicy retryPolicy = new ExponentialBackoffRetry(1000, 3); \/\/ 1000 ms base sleep, at most 3 retries<br \/>\n            _client = CuratorFrameworkFactory.newClient(zookeeperStr, retryPolicy);<br \/>\n            _client.start();<br \/>\n        }<br \/>\n        return _client;<br \/>\n    }<br \/>\n[\/code]<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<li>The other coordinator nodes (those not picked as master) won\u2019t do anything.<\/li>\n<li><strong>Ensure that the registered ZooKeeper nodes are all ephemeral nodes<\/strong>, so that when the master node goes down its entry is removed once its session ends, and the next available node in sorted order becomes the master node.<\/li>\n<\/ul>\n<p>In our case this worked very well. 
\ud83d\ude42<\/p>\n<p><em>Here&#8217;s my previous blog post, an introduction to the Curator framework:<br \/>\n<a href=\"http:\/\/www.tothenew.com\/blog\/curator-framework-for-apache-zookeeper\/\">Curator Framework for Apache ZooKeeper<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Background Recently (in a project) we were required to determine the master node from a pool of similar type of nodes. And if master node fails, any other node should take on the responsibility &#8211; so that the service remains available. So, the use-case was something like &#8211; Only single node should behave as a [&hellip;]<\/p>\n","protected":false},"author":606,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":23},"categories":[1395,446],"tags":[2664,2663,2669,2668],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/29060"}],"collection":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/users\/606"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/comments?post=29060"}],"version-history":[{"count":0,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/29060\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/media?parent=29060"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/categories?post=29060"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/tags?post=29060"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}