Introduction In early microservices architectures on AWS, communication between services relied on infrastructure components like load balancers. There was no native, built-in way for services to discover each other dynamically. A common architecture looked like this: Public DNS → Amazon Application Load Balancer → Frontend Service → Private ALB → Backend Service. While this approach […]
Introduction Anyone who has managed a production environment at scale knows the feeling. Five dashboards open, three alerts firing, and you’re not sure which one actually matters — while the thing about to cause a real problem isn’t making any noise yet. Modern DevOps infrastructure is complex. Microservices, Kubernetes clusters, CI/CD pipelines, external APIs — […]
Introduction Routing HTTP/HTTPS traffic to workloads in Azure Kubernetes Service (AKS) is a basic requirement for modern cloud applications. Although the Kubernetes Ingress resource addresses this, manually maintaining an ingress controller adds considerable overhead. To address this, Microsoft launched the Application Routing add-on with Managed NGINX Ingress. This […]
If you’ve ever had to jump between six different AWS accounts just to figure out why one Lambda function is behaving oddly – you already know the pain. Multi-account AWS environments are great for security and governance, but they can turn basic monitoring into a logistical nightmare. The good news? AWS gives you everything you […]
What is Microsoft Azure? Azure is Microsoft’s cloud computing platform. It does a lot: virtual machines, storage, networking, monitoring, identity management, databases, you name it. The idea is actually pretty simple. Instead of buying physical servers, setting them up in an office or data center, and then maintaining them yourself, organizations just use […]
Three months into building our DevOps AI agent, I gave a demo of it to the team. Checked pods, read logs, suggested fixes. Everyone was impressed. Then one engineer asked it: “Remember that ingress issue we sorted on Tuesday?” The agent had no idea what she was talking about. I had spent weeks on tool […]
Introduction In a multi-account AWS environment, log management for services such as CloudTrail, VPC Flow Logs, and WAF is complex and fragmented. Each account holds its own log data, which makes it hard for security and operations teams to manage logs centrally. This issue, however, can […]
Introduction Modern cloud-native systems generate millions of log entries every day. But here’s the real question — are we truly extracting meaningful insights from those logs, or just storing them? AI-Powered Log Monitoring in Microsoft Azure combines Azure Monitor, Azure Log Analytics, and Azure OpenAI Service to transform raw telemetry data into actionable intelligence. Instead […]
Introduction We started seeing repeated OutOfMemoryError exceptions in a Spring Boot service running on Amazon ECS in EC2 mode. The impact was serious: JVM threads crashed, including SQS listeners, HTTP threads, and AWS SDK threads; messages were retried and eventually sent to SQS dead-letter queues; and the service became unstable under load. […]