{"id":79102,"date":"2026-03-26T13:54:06","date_gmt":"2026-03-26T08:24:06","guid":{"rendered":"https:\/\/www.tothenew.com\/blog\/?p=79102"},"modified":"2026-04-06T10:55:59","modified_gmt":"2026-04-06T05:25:59","slug":"the-night-netflix-refused-to-buffer","status":"publish","type":"post","link":"https:\/\/www.tothenew.com\/blog\/the-night-netflix-refused-to-buffer\/","title":{"rendered":"The Night Netflix Refused to Buffer"},"content":{"rendered":"<div class=\"WordSection1\">\n<p class=\"MsoNormal\" style=\"text-align: center;\" align=\"center\"><b><u><span style=\"font-size: 22.0pt; line-height: 115%; font-family: 'Cambria Math',serif;\">The<br \/>\nNight Netflix Refused to Buffer<\/span><\/u><\/b><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">It\u2019s<br \/>\n2:00 AM.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">You\u2019ve<br \/>\ntold yourself <i>\u201cjust one more episode\u201d<\/i> at least three times already. But<br \/>\nnow it\u2019s the finale. You click Play\u2026 and it starts instantly.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">No<br \/>\nspinning circle. No loading screen. Nothing.<br \/>\nJust video.<br \/>\nFeels normal, right?<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">But<br \/>\nhere\u2019s the truth: that \u201cinstant\u201d moment is one of the most complex illusions in<br \/>\nmodern engineering. What you\u2019re seeing is not speed\u2014it\u2019s preparation. A system<br \/>\nthat has been working <i>before you even decided to click<\/i>. 
<\/span><\/p>\n<p class=\"MsoNormal\"><b><span style=\"font-size: 18.0pt; line-height: 115%;\">It Starts<br \/>\nBefore You Even Open Netflix<\/span><\/b><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">When<br \/>\nyou open Netflix and see your \u201cContinue Watching\u201d row, you might assume it\u2019s<br \/>\ncoming from a database somewhere.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">It\u2019s<br \/>\nnot.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Behind<br \/>\nthe scenes, Netflix is serving that data from something called EVCache\u2014a<br \/>\nmassive, distributed memory layer designed to answer requests in milliseconds.<\/span><\/p>\n<figure style=\"text-align: center; margin-top: 30px; margin-bottom: 30px;\"><img decoding=\"async\" loading=\"lazy\" class=\"size-medium wp-image-79104\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2026\/03\/evCacheImg.webp\" alt=\"Request flow with EVCache servers\" width=\"500\" height=\"244\" \/><figcaption>Request flow with EVCache servers<\/figcaption><\/figure>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Because<br \/>\nhitting a database? That\u2019s already too slow. 
If users ever feel a delay, you\u2019ve<br \/>\nalready lost them.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">So<br \/>\ninstead of making databases faster, they asked a better question:<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><i><span style=\"font-size: 14.0pt;\">\u201cWhat<br \/>\nif we never had to hit the database at all?\u201d<\/span><\/i><\/p>\n<p class=\"MsoNormal\"><b><span style=\"font-size: 18.0pt; line-height: 115%;\">The<br \/>\nProblem With Memory: It Forgets<\/span><\/b><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Caching<br \/>\nsounds simple: store data in memory, serve it quickly.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">But<br \/>\nat Netflix scale, nothing is simple.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Servers<br \/>\nfail. New ones spin up. 
Traffic spikes unpredictably.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Now<br \/>\nimagine this:<br \/>\nA new cache server starts completely empty.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Suddenly,<br \/>\nmillions of requests miss the cache and hit the database at once.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Boom.<br \/>\nSystem collapse.<br \/>\nNetflix had to solve this.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">So<br \/>\nthey built a quiet, almost invisible mechanism:<\/span><\/p>\n<ul style=\"margin-top: 0cm;\" type=\"disc\">\n<li class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Somewhere,<br \/>\na <i>warm<\/i> server continuously replicates its state across EVCache\u2019s distributed replication layer (EVCache is built on top of Memcached)<\/span><\/li>\n<li class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">A<br \/>\nnew server wakes up, syncs that data through this replication layer, and fills itself before taking<br \/>\ntraffic<\/span><\/li>\n<\/ul>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">By<br \/>\nthe time it goes live, it already \u201cremembers\u201d everything.<br \/>\nNo chaos. No spike. 
No user ever notices.<\/span><\/p>\n<p class=\"MsoNormal\"><b><span style=\"font-size: 18.0pt; line-height: 115%;\">But<br \/>\nMemory Isn\u2019t Enough for Movies<\/span><\/b><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Caching<br \/>\nuser data is one thing.<br \/>\nBut what about a 20 GB 4K movie?<br \/>\nYou can\u2019t just throw that into RAM.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">So<br \/>\nNetflix did something unusual.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Instead<br \/>\nof bringing users closer to servers,<br \/>\nthey brought servers closer to users.<\/span><\/p>\n<p class=\"MsoNormal\"><b><span style=\"font-size: 18.0pt; line-height: 115%;\">The Box<br \/>\nSitting Inside Your ISP<\/span><\/b><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Netflix<br \/>\nbuilt their own CDN: <\/span><b><span style=\"font-size: 14.0pt;\">Open Connect<\/span><\/b><span style=\"font-size: 14.0pt;\">.<\/span><\/p>\n<figure style=\"text-align: center; margin-top: 30px; margin-bottom: 30px;\"><img decoding=\"async\" loading=\"lazy\" class=\"size-medium wp-image-79105\" src=\"https:\/\/www.tothenew.com\/blog\/wp-ttn-blog\/uploads\/2026\/03\/openConnect.png\" alt=\"Request flow with Open Connect CDN servers\" width=\"500\" height=\"194\" \/><figcaption>Request flow with Open Connect CDN servers<\/figcaption><\/figure>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">But<br \/>\nhere\u2019s the twist:<br \/>\nThey physically ship servers to ISPs.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Yes,<br \/>\nactual hardware.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">These<br \/>\nboxes sit 
inside your internet provider\u2019s network, just a few miles from your<br \/>\nhome.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">So<br \/>\nwhen you hit play, your movie isn\u2019t traveling across continents.<br \/>\nIt\u2019s practically next door.<\/span><\/p>\n<p class=\"MsoNormal\"><b><span style=\"font-size: 18.0pt; line-height: 115%;\">The Final Frontier: Optimizing the OS Kernel<\/span><\/b><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Most<br \/>\ncompanies optimize applications.<br \/>\nNetflix optimized the operating system.<\/span><\/p>\n<p class=\"MsoNormal\" style=\"line-height: normal;\"><span style=\"font-size: 14.0pt;\">Normally,<br \/>\nwhen video data moves:<\/span><\/p>\n<ul style=\"margin-top: 0cm;\" type=\"disc\">\n<li class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">It gets copied<\/span><\/li>\n<li class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">It gets encrypted<\/span><\/li>\n<li class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">It gets passed between layers<\/span><\/li>\n<li class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">It gets handled by the CPU repeatedly<\/span><\/li>\n<\/ul>\n<p class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">That\u2019s slow.<\/span><\/p>\n<p class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">So Netflix moved encryption directly into the kernel using kTLS.<br \/>\nVideo data now flows straight from disk \u2192 network card.<\/span><\/p>\n<p class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">The result?<br \/>\nA single server can stream to thousands of users at once.<\/span><\/p>\n<p class=\"MsoNormal\"><b><span style=\"font-size: 18.0pt;\">While You Sleep, Netflix Is Preparing for You<\/span><\/b><\/p>\n<p class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">Every night, something interesting happens.<br \/>\nNetflix studies what people in your region are watching.<br \/>\nThen, quietly\u2026<br 
\/>\nIt pushes that content to local servers near you.<\/span><\/p>\n<p class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">So when you wake up and press play, the video is already there.<br \/>\nNot fetched. Not requested.<br \/>\nWaiting.<\/span><\/p>\n<p class=\"MsoNormal\"><b><span style=\"font-size: 18.0pt;\">The Creepiest Part: Netflix Knows What You\u2019ll Click<\/span><\/b><\/p>\n<p class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">Now comes the part that feels almost unfair.<br \/>\nNetflix doesn\u2019t just react to your clicks.<br \/>\nIt predicts them.<br \/>\nAs you scroll\u2026 pause\u2026 hover\u2026 hesitate\u2026<\/span><\/p>\n<p class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">Their machine learning pipeline is calculating probabilities:<\/span><\/p>\n<p class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">\u201cThere\u2019s a 90% chance they click this.\u201d<br \/>\nAnd when that happens?<br \/>\nNetflix doesn\u2019t wait.<br \/>\nIt starts loading the video <i>before<\/i> you click.<\/span><\/p>\n<p class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">The result:<\/span><\/p>\n<ul>\n<li class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">Videos start instantly<\/span><\/li>\n<li class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">Episodes transition seamlessly<\/span><\/li>\n<li class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">You never see loading screens<\/span><\/li>\n<\/ul>\n<p class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">Because you\u2019re always slightly behind what Netflix has already prepared.<\/span><\/p>\n<p class=\"MsoNormal\"><b><span style=\"font-size: 18.0pt;\">The Real Lesson (This Is Where Most Engineers Get It Wrong)<\/span><\/b><\/p>\n<p class=\"MsoNormal\"><span style=\"font-size: 14.0pt;\">Most developers think like this: <i>\u201cHow do I make my backend faster?\u201d<\/i><br \/>\nNetflix thinks differently: <i>\u201cHow do I make sure the backend is never needed?\u201d<\/i><\/span><\/p>\n<p class=\"MsoNormal\"><b><span style=\"font-size: 14.0pt;\">That\u2019s the 
shift.<br \/>\nThat\u2019s the mindset.<br \/>\nThat\u2019s why when you hit play at 2:00 AM\u2026<br \/>\nThere\u2019s no spinning circle.<br \/>\nBecause Netflix already knew you would.<\/span><\/b><\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>The Night Netflix Refused to Buffer It\u2019s 2:00 AM. You\u2019ve told yourself \u201cjust one more episode\u201d at least three times already. But now it\u2019s the finale. You click Play\u2026 and it starts instantly. No spinning circle. No loading screen. Nothing. Just video. Feels normal, right? But here\u2019s the truth: that \u201cinstant\u201d moment is one of [&hellip;]<\/p>\n","protected":false},"author":2258,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"iawp_total_views":22},"categories":[3477],"tags":[6267,118,2645,6276,8525,5947,8268,2297,6307,8526,8524,6894,6016,8523,5221],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/79102"}],"collection":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/users\/2258"}],"replies":[{"embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/comments?post=79102"}],"version-history":[{"count":7,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/79102\/revisions"}],"predecessor-version":[{"id":79433,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/posts\/79102\/revisions\/79433"}],"wp:attachment":[{"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/media?parent=79102"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\/wp\/v2\/categories?post=79102"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tothenew.com\/blog\/wp-json\
/wp\/v2\/tags?post=79102"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}