DevOps

ECS Service Discovery vs. Service Connect

Introduction In early microservices architectures on AWS, communication between services relied on infrastructure components like load balancers. There was no native, built-in way for services to discover each other dynamically. A common architecture looked like this: Public DNS → Amazon Application Load Balancer → Frontend Service → Private ALB → Backend Service While this approach […]

DevOps

Why Your Monitoring Is Always One Step Behind — And How AI Fixes That

Introduction Anyone who has managed a production environment at scale knows the feeling. Five dashboards open, three alerts firing, and you’re not sure which one actually matters — while the thing about to cause a real problem isn’t making any noise yet. Modern DevOps infrastructure is complex. Microservices, Kubernetes clusters, CI/CD pipelines, external APIs — […]

DevOps

Managed NGINX Ingress Controller with Application Routing Add-on in AKS

Introduction Routing HTTP/HTTPS traffic to workloads in Azure Kubernetes Service (AKS) is a basic requirement for modern cloud applications. Although the Kubernetes Ingress resource addresses this, manually maintaining an ingress controller adds considerable overhead. To address this, Microsoft launched the Application Routing add-on with Managed NGINX Ingress. This […]

Gaurav

DevOps

From Zero to Hundreds: Onboarding Your Entire AWS Fleet to Centralized CloudWatch in Under an Hour

If you’ve ever had to jump between six different AWS accounts just to figure out why one Lambda function is behaving oddly – you already know the pain. Multi-account AWS environments are great for security and governance, but they can turn basic monitoring into a logistical nightmare. The good news? AWS gives you everything you […]

DevOps

Understanding Azure Cloud Fundamentals for DevOps Engineers

What is Microsoft Azure? Azure is Microsoft’s cloud computing platform. And it does a lot: virtual machines, storage, networking, monitoring, identity management, databases, you name it. The idea is actually pretty simple. Instead of buying physical servers, setting them up in an office or data center, and then maintaining them yourself, organizations just use […]

DevOps

Memory in AI: Why Your Agent Forgets Everything — And How to Fix It

Three months into building our DevOps AI agent, I gave a demo of it to the team. Checked pods, read logs, suggested fixes. Everyone was impressed. Then one engineer asked it: “Remember that ingress issue we sorted on Tuesday?” The agent had no idea what she was talking about. I had spent weeks on tool […]

Iman Abbas

MSP

Cross-Account Centralised Logging in AWS Using S3, KMS, and SQS for SIEM Integration

Introduction In a multi-account AWS environment, managing logs for services such as CloudTrail, VPC Flow Logs, and WAF is complex and fragmented. Each account holds its own log data, which makes it hard for security and operations teams to manage centrally. This issue, however, can […]

Umang Dakh

DevOps

AI-Powered Log Monitoring in Azure

Introduction Modern cloud-native systems generate millions of log entries every day. But here’s the real question — are we truly extracting meaningful insights from those logs, or just storing them? AI-Powered Log Monitoring in Microsoft Azure combines Azure Monitor, Azure Log Analytics, and Azure OpenAI Service to transform raw telemetry data into actionable intelligence. Instead […]

Deepak

DevOps

Fixing JVM OutOfMemoryError on ECS (EC2 Based)

Introduction We started seeing repeated OutOfMemoryError exceptions in a Spring Boot service running on Amazon ECS with the EC2 launch type. The impact was serious: JVM threads crashed, including SQS listeners, HTTP threads, and AWS SDK threads. Messages were retried and eventually sent to SQS dead-letter queues. The service became unstable under load. […]

Ahmad Ali