{"id":33468,"date":"2016-04-24T19:38:54","date_gmt":"2016-04-24T14:08:54","guid":{"rendered":"http:\/\/www.tothenew.com\/blog\/?p=33468"},"modified":"2016-04-25T10:09:54","modified_gmt":"2016-04-25T04:39:54","slug":"mongo-point-in-time-restoration","status":"publish","type":"post","link":"https:\/\/www.tothenew.com\/blog\/mongo-point-in-time-restoration\/","title":{"rendered":"Mongo Point in Time Restoration"},"content":{"rendered":"<p style=\"text-align: left;\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone  wp-image-33469\" src=\"\/blog\/wp-ttn-blog\/uploads\/2016\/04\/images.png\" alt=\"images\" width=\"752\" height=\"219\" \/><\/p>\n<p style=\"text-align: left;\">When working with databases, you sometimes need the data as it stood at a specific date and time across all the secondaries or database peers, for example to test a particular piece of functionality. Also, after an outage, some folks prefer to restore to a specific weekend or month end so that all the dependent applications see consistent data.<\/p>\n<p><span style=\"font-weight: 400;\">Here I am going to discuss how this can be achieved in <a title=\"MongoDB Consulting\" href=\"http:\/\/www.tothenew.com\/mean-stack-web-development-consulting\">MongoDB<\/a>. I will be using MongoDB\u2019s native tools &#8211; mongodump &amp; mongorestore. Although these tools are mostly used to take database or collection dumps of Mongo data, they can also be used to back up the Oplog. Before proceeding, you need a basic understanding of the Oplog. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">The Oplog is like a normal collection (or table) in MongoDB, but it records every write operation that MongoDB applies: inserts, updates, deletes, and so on. That means a replayable copy of all the changes stored in the normal collections is also present in the Oplog. However, this cannot be treated as a backup. 
You must still keep regular backups, because the Oplog is a capped collection: once it reaches its capacity, it overwrites the oldest entries as new ones arrive. So, size the Oplog accordingly. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">To restore Mongo to a point in time, you must back it up accordingly. This assumes you are already taking Mongo data backups regularly. In addition to those, take a dump of the Oplog as well:<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Here is what you need to back up &#8211; the <\/span><b>oplog.rs<\/b><span style=\"font-weight: 400;\"> collection: <\/span><\/p>\n<p><strong><strong><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-33470\" src=\"\/blog\/wp-ttn-blog\/uploads\/2016\/04\/oplog_collection.png\" alt=\"oplog_collection\" width=\"447\" height=\"201\" \/><\/strong><\/strong><\/p>\n<p><span style=\"font-weight: 400;\">Take the backup using mongodump:<\/span><\/p>\n<p><strong><strong><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-33471\" src=\"\/blog\/wp-ttn-blog\/uploads\/2016\/04\/mongodump.png\" alt=\"mongodump\" width=\"1059\" height=\"199\" \/><\/strong><\/strong><\/p>\n<p><span style=\"font-weight: 400;\">In the above command, <\/span><b>local<\/b><span style=\"font-weight: 400;\"> is the database and <\/span><b>oplog.rs<\/b><span style=\"font-weight: 400;\"> is the collection name. oplogBack_5April2016 is a directory that does not exist yet, into which the backup will be written. <\/span><b>oplog.rs.bson<\/b><span style=\"font-weight: 400;\"> is the dump file produced by this command, and it will be used for the point-in-time restore. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">To restore to a point in time, you must have your \u2018time\u2019 handy; that is, you must know the exact date and time up to which you want to restore. 
That is expressed as an <\/span><a href=\"https:\/\/en.wikipedia.org\/wiki\/Epoch_(reference_date)\"><span style=\"font-weight: 400;\">epoch<\/span><\/a><span style=\"font-weight: 400;\"> value. The Oplog stores every operation with this epoch value, available as its <\/span><b>timestamp<\/b><span style=\"font-weight: 400;\">. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">Now suppose you want to restore up to a date, say 31st March 2016 23:59:00 (UTC). Its epoch value is <\/span><b><i>1459468740<\/i><\/b><span style=\"font-weight: 400;\">. That said, the Oplog will not necessarily contain a record created at exactly that time; most of the time, the nearest records fall slightly earlier or later. In that case, find the largest epoch value that is less than the required epoch time, and restore up to that timestamp. Your goal is to find the <\/span><b>&quot;ts&quot;:{&quot;timestamp&quot;:{&quot;t&quot;:x,&quot;i&quot;:y}}<\/b><span style=\"font-weight: 400;\"> entry in the Oplog and note its values. Here x is the epoch value we are interested in, and y is an incrementing ordinal for operations within a given second. If you find more than one value of y for the same value of x, use the largest y value.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In my case, I created and used this <\/span><a href=\"https:\/\/gist.github.com\/Amit-Naudiyal\/553dd057c5d97ae9c4bec0433ab0c0f4\"><span style=\"font-weight: 400;\">script<\/span><\/a><span style=\"font-weight: 400;\"> to get the epoch just less than my desired point-in-time epoch value from the Oplog. To use this script, your Oplog dump must be in a human-readable format. <\/span><a href=\"https:\/\/docs.mongodb.org\/manual\/reference\/program\/bsondump\/\"><span style=\"font-weight: 400;\">bsondump<\/span><\/a><span style=\"font-weight: 400;\"> is a utility that converts a BSON dump to JSON. 
<\/span><\/p>\n<p><strong><strong><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-33472\" src=\"\/blog\/wp-ttn-blog\/uploads\/2016\/04\/bsondump.png\" alt=\"bsondump\" width=\"635\" height=\"54\" \/><\/strong><\/strong><\/p>\n<p>For example, say I found <b>&quot;ts&quot; : timestamp(1459464310, 1)<\/b><span style=\"font-weight: 400;\"> &amp; <\/span><b>&quot;ts&quot; : timestamp(1459464310, 2)<\/b><span style=\"font-weight: 400;\"> as the entries with the epoch just less than my desired timestamp value. This corresponds to <\/span><i><span style=\"font-weight: 400;\">31 Mar 2016 22:45:10<\/span><\/i><span style=\"font-weight: 400;\">, which is the last record my Oplog had before 31st March 23:59:00. The ordinal value I choose is 2. <\/span><\/p>\n<p>Once you have the Oplog dump and the correct epoch and ordinal values, use the following procedure for restoration:<\/p>\n<p><b>Step 1<\/b><span style=\"font-weight: 400;\">. Restore the normal data from the latest dump taken before the point-in-time timestamp. Say you take weekly backups, so your most recent backup is from 27th March 2016 (<\/span><i><span style=\"font-weight: 400;\">Sunday<\/span><\/i><span style=\"font-weight: 400;\">). Restore that data on a new node, up to 27th March 2016, using your normal restoration procedure.<\/span><\/p>\n<p><b>Step 2<\/b><span style=\"font-weight: 400;\">. Rename your dump file to <\/span><b>oplog.bson<\/b><span style=\"font-weight: 400;\">, because mongorestore looks for oplog.bson in the specified directory, or in the root of the dump directory if no directory is specified.<\/span><\/p>\n<p><strong><strong><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-33473\" src=\"\/blog\/wp-ttn-blog\/uploads\/2016\/04\/rename.png\" alt=\"rename\" width=\"644\" height=\"37\" \/><\/strong><\/strong><\/p>\n<p><b>Step 3<\/b><span style=\"font-weight: 400;\">. 
Replay your Oplog up to <\/span><b>1459464310:2<\/b>.<\/p>\n<p><strong><strong><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-33474\" src=\"\/blog\/wp-ttn-blog\/uploads\/2016\/04\/mongorestore.png\" alt=\"mongorestore\" width=\"1027\" height=\"89\" \/><\/strong><\/strong><\/p>\n<p style=\"padding-left: 90px;\"><b>--oplogReplay: <\/b><span style=\"font-weight: 400;\">Replays the oplog after restoring the dump to ensure that <\/span>the current state of the database reflects the point-in-time backup.<\/p>\n<p style=\"padding-left: 90px;\"><b>--oplogLimit: <\/b><span style=\"font-weight: 400;\">Prevents mongorestore from applying oplog entries with a <\/span>timestamp newer than or equal to &lt;timestamp&gt;.<\/p>\n<p style=\"padding-left: 90px;\"><b>oplogRest<\/b><span style=\"font-weight: 400;\">: This is the directory where I have kept my oplog.bson.<\/span><\/p>\n<p>Note that replaying the Oplog is idempotent, so running it multiple times does no harm. At this point you can verify that you have restored correctly up to your desired timestamp. That should be all.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>While working with databases sometimes there is a\u00a0need to have data till\u00a0a specific time or date across all the secondaries or database peers which is useful for testing a particular functionality. 
Also, in the event of outage, few folks would like to restore it to a specific weekend or month end just to have uniformity [&hellip;]<\/p>\n","protected":false},"author":181,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":36},"categories":[1],"tags":[1900,3224,3223,3226,3225,3220],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/33468"}],"collection":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/users\/181"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/comments?post=33468"}],"version-history":[{"count":0,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/33468\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/media?parent=33468"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/categories?post=33468"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/tags?post=33468"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}