{"id":34592,"date":"2016-05-25T20:41:43","date_gmt":"2016-05-25T15:11:43","guid":{"rendered":"http:\/\/www.tothenew.com\/blog\/?p=34592"},"modified":"2016-12-19T15:39:16","modified_gmt":"2016-12-19T10:09:16","slug":"elasticsearch-shard-filterting","status":"publish","type":"post","link":"https:\/\/www.tothenew.com\/blog\/elasticsearch-shard-filterting\/","title":{"rendered":"Elasticsearch: Shard Filtering"},"content":{"rendered":"<p>Our <a title=\"Cloud DevOps Engineers\" href=\"http:\/\/www.tothenew.com\/devops-automation-consulting\">cloud DevOps engineers<\/a> have been using Elasticsearch in a production environment for an e-commerce website for quite a while. The website has one admin server to manage activities such as adding new products, managing discounts on various items, and fetching reports. We came across a requirement that downloading reports from the admin server should not put extra load on the Elasticsearch server, since we use a single <a href=\"http:\/\/www.tothenew.com\/blog\/elasticsearch-migration-found-to-aws-ec2\/\">Elasticsearch<\/a> cluster for both the customer-facing application and the admin server.<\/p>\n<p>Currently, we run two Elasticsearch nodes in a cluster. Downloading a report fetches data from an Elasticsearch index named \u201creport\u201d. To isolate the report-download load from these two nodes, we decided to add a third, low-configuration node that hosts only the \u201creport\u201d index.<\/p>\n<p><strong>Scenario:<\/strong> Implement shard filtering on an Elasticsearch node without downtime.<\/p>\n<p>We can tell Elasticsearch to host the shards of a specific index on a desired node. 
This is called shard filtering, or shard allocation filtering.<\/p>\n<p><span style=\"font-weight: 400;\">We followed the steps below to implement shard filtering:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400;\">Configure the third Elasticsearch node. It can be a low-configuration machine, since it will only be used to download reports.<\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\"><span style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Add the tag below to the &#8216;elasticsearch.yml&#8217; file of the node, and do not start the Elasticsearch process yet.<\/span><\/span><\/span>\n<pre>node.tag: admin\r\n<\/pre>\n<p><span style=\"font-weight: 400;\">The same tag must also be set on the indexes, so that an index allocates its shards only to a node that carries the matching tag.<\/span><\/p>\n<\/li>\n<li style=\"font-weight: 400;\">\n<p><span style=\"font-weight: 400;\">We want the report index to be hosted on the third node, so the requests below tell the report index to route only to the node tagged \u201cadmin\u201d.<\/span><span style=\"font-weight: 400;\"><br \/>\n<\/span><span style=\"font-weight: 400;\">We also need to tell all other indexes not to route to the node tagged \u201cadmin\u201d.<\/span><span style=\"font-weight: 400;\"><br \/>\n<\/span><span style=\"font-weight: 400;\">Before adding the exclude tag \u201cadmin\u201d to all indexes, delete the \u201creport\u201d index (taking a backup first, if required) so that the exclude setting is not applied to it:<br \/>\n<\/span><\/p>\n<pre>DELETE report<\/pre>\n<p>Add the exclude tag \u201cadmin\u201d to all the indexes:<\/p>\n<pre>PUT _settings\r\n{\r\n  \"index.routing.allocation.exclude.tag\": \"admin\"\r\n}<\/pre>\n<p>Now restore or create the report index and add the include tag \u201cadmin\u201d to it:<\/p>\n<pre>PUT report\/_settings\r\n{\r\n  \"index.routing.allocation.include.tag\": \"admin\"\r\n}<\/pre>\n<\/li>\n<li style=\"font-weight: 
400;\"><span style=\"font-weight: 400;\">Now start the third node and perform rolling restarts on the other nodes. Using the &#8220;head&#8221; plugin, we can see that the shards of the report index are allocated only to node 3. We are done.<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400;\">We then verified the setup by downloading a report from the admin server: CPU utilization on node 3 increased, while the other nodes experienced no impact.<\/span><\/p>\n<p>Shard filtering is a crucial Elasticsearch feature for segregating nodes based on application requirements. Multiple nodes can be configured to host a specific set of indexes, depending on the application architecture. In a microservices architecture, a single Elasticsearch cluster can thus serve different applications.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Our cloud DevOps engineers have been using Elasticsearch in a production environment for an e-commerce website for quite a while. The website has one admin server to manage activities such as adding new products, managing discounts on various items, and fetching reports. 
We came across a requirement where downloading reports from admin server should not put extra load on [&hellip;]<\/p>\n","protected":false},"author":154,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":5},"categories":[1174,2348,1],"tags":[1137,4843,1837,1524,3355,1252,3356,3352,3354,3353],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/34592"}],"collection":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/users\/154"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/comments?post=34592"}],"version-history":[{"count":0,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/34592\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/media?parent=34592"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/categories?post=34592"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/tags?post=34592"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}