{"id":76331,"date":"2025-09-12T13:12:18","date_gmt":"2025-09-12T07:42:18","guid":{"rendered":"https:\/\/www.tothenew.com\/blog\/?p=76331"},"modified":"2025-10-13T14:44:28","modified_gmt":"2025-10-13T09:14:28","slug":"mongodb-recovery-practice-dr-drill-automation-with-terraform-python-jenkins","status":"publish","type":"post","link":"https:\/\/www.tothenew.com\/blog\/mongodb-recovery-practice-dr-drill-automation-with-terraform-python-jenkins\/","title":{"rendered":"MongoDB Recovery Practice \u2014 DR Drill Automation with Terraform, Python &amp; Jenkins"},"content":{"rendered":"<p>When disaster strikes, the only thing that matters is how fast and reliably you can get your database back. Backups on a shelf are worthless if you can\u2019t restore them under pressure. I built an automated MongoDB restore pipeline with Terraform, Python, and Jenkins so I could repeatedly prove \u2014 not just hope \u2014 that restores work.<\/p>\n<p>The goal<br \/>\nI set out to remove the tedious, error-prone steps we kept repeating. I wanted a restore flow that was identical across environments, a script to build replica sets so we wouldn\u2019t misconfigure anything by hand, and real validation that checks the data is intact. Centralizing the whole thing in Jenkins means every run is auditable and repeatable. In the end we built a single pipeline: it provisions infrastructure, applies the backup, runs integrity checks, and stores logs and a validation report for post-mortems and audits.<\/p>\n<p>The workflow<\/p>\n<p>1. Infrastructure with Terraform<\/p>\n<p>I use Terraform to bring up clean infra for each drill \u2014 EC2 instances (or VMs\/containers), networking, and persistent volumes. 
That guarantees the same starting point every time and removes \u201cworks on my machine\u201d surprises.<\/p>\n<p>2. Replica set creation (Python)<\/p>\n<p>Instead of typing rs.initiate() and rs.add() by hand, a Python script does it for me. It handles the ordering and retries so the replica set comes up consistently.<\/p>\n<div id=\"attachment_76643\" style=\"width: 310px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-76643\" decoding=\"async\" loading=\"lazy\" class=\"size-medium wp-image-76643\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_41_21-PM-300x300.png\" alt=\"code 1 \" width=\"300\" height=\"300\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_41_21-PM-300x300.png 300w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_41_21-PM-150x150.png 150w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_41_21-PM-768x768.png 768w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_41_21-PM-624x624.png 624w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_41_21-PM-120x120.png 120w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_41_21-PM-24x24.png 24w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_41_21-PM-48x48.png 48w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_41_21-PM-96x96.png 96w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_41_21-PM.png 1024w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><p id=\"caption-attachment-76643\" class=\"wp-caption-text\">code 1<\/p><\/div>\n<p>Automating this avoids timing issues and misconfigurations.<\/p>\n<p>3. Backup &amp; restore<\/p>\n<p>Backups are normalized into compressed archives. 
The restore routine unpacks a dump and applies it to the freshly provisioned MongoDB nodes once the automated replica set setup has finished.<\/p>\n<div id=\"attachment_76644\" style=\"width: 310px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-76644\" decoding=\"async\" loading=\"lazy\" class=\"size-medium wp-image-76644\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_45_51-PM-300x300.png\" alt=\"Image 2 : dump creation \" width=\"300\" height=\"300\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_45_51-PM-300x300.png 300w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_45_51-PM-150x150.png 150w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_45_51-PM-768x768.png 768w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_45_51-PM-624x624.png 624w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_45_51-PM-120x120.png 120w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_45_51-PM-24x24.png 24w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_45_51-PM-48x48.png 48w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_45_51-PM-96x96.png 96w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_45_51-PM.png 1024w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><p id=\"caption-attachment-76644\" class=\"wp-caption-text\">Image 2 :<br \/>dump creation<\/p><\/div>\n<p>Restoration executes through:<\/p>\n<div id=\"attachment_76645\" style=\"width: 310px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-76645\" decoding=\"async\" loading=\"lazy\" class=\"size-medium wp-image-76645\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_48_04-PM-300x300.png\" alt=\"Image 3 restoration initiation\" width=\"300\" height=\"300\" 
srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_48_04-PM-300x300.png 300w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_48_04-PM-150x150.png 150w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_48_04-PM-768x768.png 768w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_48_04-PM-624x624.png 624w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_48_04-PM-120x120.png 120w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_48_04-PM-24x24.png 24w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_48_04-PM-48x48.png 48w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_48_04-PM-96x96.png 96w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_48_04-PM.png 1024w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><p id=\"caption-attachment-76645\" class=\"wp-caption-text\">Image 3<br \/>restoration initiation<\/p><\/div>\n<p>4. Validation &amp; comparison<\/p>\n<p>This is the real game-changer. Rather than hoping the restore worked, I run a validation script that:<\/p>\n<p>checks what collections exist (and whether any are missing),<br \/>\ncompares document counts collection-by-collection,<br \/>\ncompares indexes,<br \/>\noptionally samples _id values for obvious mismatches.<\/p>\n<p>If counts and indexes match, the script returns success (exit code 0); if not, it fails. That makes it perfect for CI\/CD \u2014 Jenkins can gate the pipeline on the validation result.<\/p>\n<p>5. Logging &amp; reporting<br \/>\nEvery step logs to Jenkins. The validation creates a structured JSON report and Jenkins archives logs and artifacts for audits. That audit trail builds trust: when auditors ask, you can show a drill\u2019s inputs, outputs, and validation report.<\/p>\n<p>6. 
Jenkins orchestration<br \/>\nA single Jenkins job runs these stages:<br \/>\nTerraform \u2192 Replica Set Setup \u2192 Restore \u2192 Validation &amp; Comparison \u2192 Archive Logs<\/p>\n<div id=\"attachment_76646\" style=\"width: 310px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-76646\" decoding=\"async\" loading=\"lazy\" class=\"size-medium wp-image-76646\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/09\/44ec5350-6c35-4077-890e-1d45db3cad04-300x33.png\" alt=\"Image 4 : Flow diagram\" width=\"300\" height=\"33\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/09\/44ec5350-6c35-4077-890e-1d45db3cad04-300x33.png 300w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/44ec5350-6c35-4077-890e-1d45db3cad04-768x85.png 768w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/44ec5350-6c35-4077-890e-1d45db3cad04-624x69.png 624w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/44ec5350-6c35-4077-890e-1d45db3cad04.png 963w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><p id=\"caption-attachment-76646\" class=\"wp-caption-text\">Image 4 :<br \/>Flow diagram<\/p><\/div>\n<div id=\"attachment_76648\" style=\"width: 210px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-76648\" decoding=\"async\" loading=\"lazy\" class=\"size-medium wp-image-76648\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_53_06-PM-200x300.png\" alt=\"Pipeline Sample \" width=\"200\" height=\"300\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_53_06-PM-200x300.png 200w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_53_06-PM-683x1024.png 683w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_53_06-PM-768x1152.png 768w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_53_06-PM-624x936.png 624w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-02_53_06-PM.png 1024w\" sizes=\"(max-width: 200px) 100vw, 
200px\" \/><p id=\"caption-attachment-76648\" class=\"wp-caption-text\">Pipeline Sample<\/p><\/div>\n<p>Lessons learned<br \/>\nAutomate infra and DB setup. Terraform gives you a clean slate for every run and removes manual variability.<\/p>\n<p>Validation is not optional. Counts and index checks catch a lot of issues you wouldn\u2019t notice otherwise.<\/p>\n<p>Logs equal trust. Storing artifacts in Jenkins makes your drills credible to others.<\/p>\n<p>Practice makes perfect. Each drill led to small improvements in scripts and timing.<\/p>\n<p>Minimal input reduces errors. I trimmed the required inputs to just host + DB name and let scripts infer the rest.<\/p>\n<p>Outcome<br \/>\nNow a single Jenkins job can provision infra, build a MongoDB replica set, restore from dumps, validate data and indexes, and store the whole run as an auditable artifact. The drills are predictable, repeatable, and quick \u2014 the kind of confidence you actually want during an incident.<\/p>\n<p>Appendix\u2014Validation script (summary)<\/p>\n<div id=\"attachment_76649\" style=\"width: 210px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-76649\" decoding=\"async\" loading=\"lazy\" class=\"size-medium wp-image-76649\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/09\/validate_restore_optimized-200x300.png\" alt=\"validate_restore_optimized sample code\" width=\"200\" height=\"300\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/09\/validate_restore_optimized-200x300.png 200w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/validate_restore_optimized-624x936.png 624w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/validate_restore_optimized.png 683w\" sizes=\"(max-width: 200px) 100vw, 200px\" \/><p id=\"caption-attachment-76649\" 
class=\"wp-caption-text\">validate_restore_optimized sample code<\/p><\/div>\n<p>Returns exit code 0 when counts and indexes match.<\/p>\n<p>Returns non-zero on mismatch so Jenkins can fail the build.<\/p>\n<p>Produces a JSON report with collection names, counts, index diffs, and sample _id checks.<\/p>\n<p>Usage<\/p>\n<div id=\"attachment_76650\" style=\"width: 310px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-76650\" decoding=\"async\" loading=\"lazy\" class=\"size-medium wp-image-76650\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-03_03_37-PM-300x200.png\" alt=\"Compare output example code \" width=\"300\" height=\"200\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-03_03_37-PM-300x200.png 300w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-03_03_37-PM-1024x683.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-03_03_37-PM-768x512.png 768w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-03_03_37-PM-624x416.png 624w, \/blog\/wp-ttn-blog\/uploads\/2025\/09\/ChatGPT-Image-Sep-30-2025-03_03_37-PM.png 1536w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><p id=\"caption-attachment-76650\" class=\"wp-caption-text\">Compare output example code<\/p><\/div>\n<p>The script returns 0 on success (counts and indexes equal) and a non-zero value on any mismatch, so a Jenkins pipeline can be gated directly on its exit code.<\/p>\n<p>Final takeaway<\/p>\n<p>Backups don\u2019t save you \u2014 restores do. Automating the infra, the replica set, the restore, and the validation turned a slow, error-prone task into a single-click procedure you can trust. 
If you run MongoDB in production, drill restores until you can do them under pressure \u2014 that\u2019s when you\u2019ll know your backups are actually useful.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>&nbsp; MongoDB Recovery Practice \u2014 DR Drill Automation with Terraform, Python &amp; Jenkins When disaster strikes, the only thing that matters is how fast and reliably you can get your database back. Backups on a shelf are worthless if you can\u2019t restore them under pressure. I built an automated MongoDB restore pipeline with Terraform, Python, [&hellip;]<\/p>\n","protected":false},"author":2183,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":9},"categories":[5877],"tags":[8160,8164,1682,4846,8161,8162,1585,8163],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/76331"}],"collection":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/users\/2183"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/comments?post=76331"}],"version-history":[{"count":3,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/76331\/revisions"}],"predecessor-version":[{"id":76701,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/76331\/revisions\/76701"}],"wp:attachment":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/media?parent=76331"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/categories?post=76331"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/tags?post=76331"}],"curies"
:[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}