{"id":77595,"date":"2026-02-10T17:38:22","date_gmt":"2026-02-10T12:08:22","guid":{"rendered":"https:\/\/www.tothenew.com\/blog\/?p=77595"},"modified":"2026-02-24T18:10:18","modified_gmt":"2026-02-24T12:40:18","slug":"containers-lie-a-deep-dive-into-docker-shim-and-a-real-on-call-fix","status":"publish","type":"post","link":"https:\/\/www.tothenew.com\/blog\/containers-lie-a-deep-dive-into-docker-shim-and-a-real-on-call-fix\/","title":{"rendered":"Containers Lie | A Deep Dive into Docker-Shim and a Real On-Call Fix"},"content":{"rendered":"<p>In this blog post, I will share an incident that taught me how containers really work under the hood.<\/p>\n<h1>Production Down &#8211;<\/h1>\n<p>I once received a production website-down alert for one of my customers.<br \/>\nWhen I checked, the website was returning a 502 error.<\/p>\n<div id=\"attachment_77705\" style=\"width: 856px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-77705\" decoding=\"async\" loading=\"lazy\" class=\" wp-image-77705\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2026\/02\/Screenshot-5.png\" alt=\"website-down\" width=\"846\" height=\"348\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2026\/02\/Screenshot-5.png 1264w, \/blog\/wp-ttn-blog\/uploads\/2026\/02\/Screenshot-5-300x123.png 300w, \/blog\/wp-ttn-blog\/uploads\/2026\/02\/Screenshot-5-1024x421.png 1024w, \/blog\/wp-ttn-blog\/uploads\/2026\/02\/Screenshot-5-768x316.png 768w, \/blog\/wp-ttn-blog\/uploads\/2026\/02\/Screenshot-5-624x257.png 624w\" sizes=\"(max-width: 846px) 100vw, 846px\" \/><p id=\"caption-attachment-77705\" class=\"wp-caption-text\">website-down<\/p><\/div>\n<h1>Initial Checks &#8211;<\/h1>\n<p>I immediately logged in to the production host to investigate.<\/p>\n<p>The first thing I checked was the container running the production WordPress website.<\/p>\n<blockquote><p><em>$ docker ps<\/em><br \/>\n<em>CONTAINER ID\u00a0 \u00a0 \u00a0 \u00a0 \u00a0IMAGE \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 
\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0COMMAND \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0CREATED \u00a0 \u00a0 \u00a0 \u00a0 \u00a0STATUS \u00a0 \u00a0 \u00a0 \u00a0 \u00a0PORTS\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 NAMES<\/em><br \/>\n<em>7f3a9c1b2d91 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0wordpress:6.4-apache \u00a0 \u00a0 \u00a0 \u00a0 \u00a0&#8220;docker-entrypoint.s\u2026&#8221; 3 days ago Up 3 days 0.0.0.0:80-&gt;80\/tcp prod-wordpress<\/em><\/p><\/blockquote>\n<p>At first glance, everything looked healthy: the container was up, there were no restarts, and the ports were mapped correctly.<\/p>\n<p>I reloaded the website again, only to find that it was still down.<\/p>\n<h1>Next Move &#8211;<\/h1>\n<p>Next, I checked the container\u2019s resource utilisation:<\/p>\n<blockquote><p>$ docker stats prod-wordpress<br \/>\nCONTAINER ID \u00a0 \u00a0 \u00a0 \u00a0 \u00a0NAME \u00a0 \u00a0 \u00a0 \u00a0 \u00a0CPU % \u00a0 \u00a0 \u00a0 \u00a0 \u00a0MEM USAGE \/ LIMIT \u00a0 \u00a0 \u00a0 \u00a0 \u00a0MEM % \u00a0 \u00a0 \u00a0 \u00a0 \u00a0NET I\/O\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 BLOCK I\/O \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0PIDS<br \/>\n7f3a9c1b2d91\u00a0 \u00a0 prod-wordpress<strong> \u00a0 \u00a0 0.00%<\/strong> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0110MiB \/ 2GiB \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a05.37%<strong> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a00B \/ 0B \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a00B \/ 0B<\/strong> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a01<\/p><\/blockquote>\n<p>Red flags:<\/p>\n<ul>\n<li>0% CPU<\/li>\n<li>0 network traffic<\/li>\n<li>Only 1 PID<\/li>\n<\/ul>\n<p>The website was down, but the container showed \u201cUp\u201d.<br \/>\n<strong>This clearly meant the container was running but not serving any traffic.<\/strong><\/p>\n<p>&nbsp;<\/p>\n<h1>Quick Fix &#8211;<\/h1>\n<p>My attempt at a quick fix (the 
first thought of every DevOps engineer) &#8211;<\/p>\n<p>I tried restarting the container.<\/p>\n<blockquote><p>$ docker restart prod-wordpress<\/p><\/blockquote>\n<p>Guess what? It hung.<\/p>\n<p>Next, I tried to stop the container:<\/p>\n<blockquote><p>$ docker stop prod-wordpress<\/p><\/blockquote>\n<p>That hung too, and now I was starting to panic.<\/p>\n<p>Even a force kill didn\u2019t work:<\/p>\n<blockquote><p>$ docker kill prod-wordpress<\/p><\/blockquote>\n<p>Restarting the Docker daemon (systemctl restart docker) might have been the easiest fix, but this was production: all containers on the host would be impacted, and there would be unplanned downtime.<\/p>\n<p>At this point I was pretty clueless, so I went looking for help online.<\/p>\n<h1>Docker-Shim &#8211;<\/h1>\n<p>While researching similar incidents, I came across docker-shim, which is present in older Docker versions.<\/p>\n<blockquote><p>docker-shim<\/p><\/blockquote>\n<p>Older Docker versions used a helper process called docker-shim, which acted as an intermediary between Docker and the container runtime. Depending on the Docker release, this process may appear under a name such as docker-containerd-shim or containerd-shim.<\/p>\n<p>Each container had its own docker-shim process.<\/p>\n<p>So I looked for the shim belonging to my container, filtering by container ID because the shim\u2019s command line carries the ID rather than the container name:<\/p>\n<blockquote><p>$ ps aux | grep docker-shim | grep 7f3a9c1b2d91<br \/>\nroot \u00a0 \u00a0 \u00a0 \u00a0 \u00a024791 \u00a0 \u00a0 \u00a0 \u00a0 \u00a00.0 0.1 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0123456 \u00a0 \u00a0 \u00a0 \u00a0 \u00a03456 ? 
Sl\u00a0 \u00a0 Feb08 0:01 \u00a0docker-shim -namespace moby -id 7f3a9c1b2d91 -address \/run\/docker\/libcontainerd\/docker-containerd.sock<\/p><\/blockquote>\n<p>Findings &#8211;<\/p>\n<p>docker-shim PID: 24791<br \/>\nIt was the parent of the container\u2019s main process (PID 24873).<\/p>\n<p>Since docker-shim is just a helper process, I decided to kill it directly.<\/p>\n<blockquote><p>$ kill -9 24791<\/p><\/blockquote>\n<p>I immediately checked Docker again:<\/p>\n<blockquote><p>$ docker ps -a<br \/>\nCONTAINER ID\u00a0 \u00a0 \u00a0IMAGE\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 STATUS\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0NAMES<br \/>\n7f3a9c1b2d91\u00a0 \u00a0 \u00a0 \u00a0 wordpress:6.4-apache\u00a0 \u00a0<strong> Exited (137) 2 seconds ago<\/strong>\u00a0 \u00a0prod-wordpress<\/p><\/blockquote>\n<p>Bingo! The hung shim process was gone, and the container was now in the Exited state.<\/p>\n<p>I started the container, and it worked instantly:<\/p>\n<blockquote><p>$ docker start prod-wordpress<br \/>\nprod-wordpress<\/p><\/blockquote>\n<p>I verified traffic:<\/p>\n<blockquote><p>$ docker stats prod-wordpress<br \/>\nCONTAINER ID \u00a0 \u00a0 \u00a0 \u00a0 \u00a0NAME \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0CPU % \u00a0 \u00a0 \u00a0 \u00a0 \u00a0MEM USAGE \/ LIMIT \u00a0 \u00a0 \u00a0 NET I\/O<br \/>\n7f3a9c1b2d91\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 prod-wordpress \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<strong>1.23%<\/strong> \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 145MiB \/ 2GiB \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<strong>8.4MB \/ 6.9MB<\/strong><\/p><\/blockquote>\n<p>The website was back online.<\/p>\n<div id=\"attachment_77706\" style=\"width: 831px\" class=\"wp-caption alignnone\"><img aria-describedby=\"caption-attachment-77706\" decoding=\"async\" loading=\"lazy\" 
class=\" wp-image-77706\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2026\/02\/unnamed-3.webp\" alt=\"website-up\" width=\"821\" height=\"223\" srcset=\"\/blog\/wp-ttn-blog\/uploads\/2026\/02\/unnamed-3.webp 1392w, \/blog\/wp-ttn-blog\/uploads\/2026\/02\/unnamed-3-300x81.webp 300w, \/blog\/wp-ttn-blog\/uploads\/2026\/02\/unnamed-3-1024x278.webp 1024w, \/blog\/wp-ttn-blog\/uploads\/2026\/02\/unnamed-3-768x209.webp 768w, \/blog\/wp-ttn-blog\/uploads\/2026\/02\/unnamed-3-624x169.webp 624w\" sizes=\"(max-width: 821px) 100vw, 821px\" \/><p id=\"caption-attachment-77706\" class=\"wp-caption-text\">website-up<\/p><\/div>\n<p>&nbsp;<\/p>\n<h1>Lessons Learned from This Incident<\/h1>\n<p><strong>Containers Can Lie<\/strong><br \/>\nA container showing \u201cUp\u201d doesn\u2019t mean it\u2019s healthy or serving traffic.<\/p>\n<p><strong>docker-shim Was a Critical Link<\/strong><br \/>\ndocker-shim acted as the parent process of the container.<br \/>\nIf it hung, the container lifecycle was broken.<\/p>\n<h1>So the next time you face a similar issue, don\u2019t forget to check the docker-shim process.<\/h1>\n","protected":false},"excerpt":{"rendered":"<p>In this blog post, I will share an incident that taught me how containers really work under the hood. Production Down &#8211; I once received a production website-down alert for one of my customers. When I checked, the website was returning a 502 error. Initial Checks &#8211; I immediately logged in to the production host to investigate. 
The [&hellip;]<\/p>\n","protected":false},"author":2171,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":26},"categories":[5877],"tags":[1892,1883,3979,1785],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/77595"}],"collection":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/users\/2171"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/comments?post=77595"}],"version-history":[{"count":8,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/77595\/revisions"}],"predecessor-version":[{"id":77940,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/77595\/revisions\/77940"}],"wp:attachment":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/media?parent=77595"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/categories?post=77595"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/tags?post=77595"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}