Introduction Large Language Models (LLMs) are transforming the way that users interact with applications, and they introduce observability challenges that require new approaches. Unlike deterministic APIs that return predictable results, LLMs have variable performance, unpredictable outputs, and complex failure modes. Observing these...
Introduction In ad-tech, logs are not “nice to have.” They are the product’s heartbeat. Every impression, every click, every bid request generates logs. Multiply that by millions of requests per minute, and you’re suddenly dealing with millions of events and terabytes of logs per day. That’s exactly where one of our...
Introduction When we started with Amazon ECS on AWS Fargate, it felt simple. No EC2 to manage. No AMIs. No cluster scaling headaches. Then the number of services grew. Working with an ad-tech client for the last five years and running their workloads on ECS Fargate has taught us many things. Different traffic patterns. Different...
It is painfully inefficient to check metrics across a large collection of AWS accounts (development, staging, UAT, production, etc.). This is a major time sink, not just a small irritation. In addition to wasting valuable engineering time, you run a much higher risk of missing an alert that could result in a full-blown outage every time...
Introduction Cloud monitoring has evolved over the years: we have moved from manual monitoring of static thresholds to dynamic anomaly detection and AI/ML-based operational tasks. This is where AWS DevOps Guru comes into the picture as a managed machine learning service for cloud operations. AWS DevOps Guru is an AIOps solution that can...
Introduction In DevOps, upgrades are rarely exciting. They don’t ship new features (most of the time). They don’t impress clients. They don’t always get leadership applause. And yet, over the years at To The New, one thing has become very clear to us: DevOps teams that do upgrades regularly move faster, stay safer,...
Introduction When teams start on their DevOps journey, the excitement is real. CI/CD pipelines, faster deployments, cloud-native tools, automation everywhere - it feels like everything is finally going to be smooth. But in reality, the first year of DevOps is rarely smooth. It’s messy, experimental, and full of learning...
Introduction For years, we ran Kafka in the data centre; then we moved to AWS and started running Kafka on EC2. However, the headaches increased along with our usage. We began to feel as though we were spending more time managing Kafka than creating anything of value due to broker upgrades, ZooKeeper problems, imbalanced...