DevOps

HA (high availability) Active/Passive Palo Alto on AWS

Introduction In the first part, we explored Palo Alto firewalls, their use cases, and different ways to achieve high availability in AWS. To learn more click here. In this second part, we’ll walk through a complete end-to-end setup of an Active/Passive Palo Alto HA deployment within the same Availability Zone. Architecture In this setup, traffic […]

DevOps

Real-World AWS Cost Optimization Strategies for High-Traffic Platforms

Introduction I’ll be honest when I say running a high-traffic production environment on AWS is fun… until you see the cloud bill. At first, you overprovision a bit of memory “just to be safe.” Containers stay up a little longer than needed. Logs? Oh, we log everything because, you know, one day you might need […]

DevOps

From Logstash to Fluent Bit: How We Streamlined Logging for an Ad Tech Client

Introduction In ad-tech, logs are not “nice to have.” They are the product’s heartbeat. Every impression, every click, every bid request: everything generates logs. Multiply that by millions of requests per minute, and you’re suddenly dealing with millions of events and terabytes of logs per day. That’s exactly where one of our platforms was. […]

DevOps

Why Right-Sizing Is Not a One-Time Activity

Introduction If you’ve worked in production long enough, you’ve probably heard this: “Let’s right-size the services and reduce the AWS bill.” So we do it. We check CPU and memory metrics for a week. We reduce task sizes. Costs drop. Everyone’s happy. And then… six months later, the bill increases again. Nothing “dramatic” changed. No […]

DevOps

How to Centralize AWS Monitoring: A Guide to CloudWatch Cross-Account Metrics

It is painfully inefficient to check metrics across a large collection of AWS accounts (development, staging, UAT, production, etc.). This is a major time waster, not just a small irritation. In addition to wasting valuable engineering time, you run a much higher risk of missing an alert that could result in a full-blown outage every […]

MSP

Chaos Engineering: Simulating Network Latency using AWS FIS

Introduction Modern applications are distributed systems consisting of multiple services, containers, and infrastructure components. While this improves scalability, security, and reliability, it also increases the chances of unexpected failures and downtime. Application testing methods mostly focus on application functionality, but they rarely test how systems behave under real-world failures such as instance crashes, network latency, […]

Rauf Khan

MSP

AWS DevOps Guru: Intelligent AIOps for Modern Cloud Observability

Introduction Cloud monitoring has evolved over the years: we have moved from manual monitoring of static thresholds to dynamic anomaly monitoring and AI- and ML-based operational tasks. This is where AWS DevOps Guru comes into the picture, as a managed machine learning service for cloud operations. AWS DevOps Guru is an AIOps solution that can detect operational anomalies […]

DevOps

I Left This AWS Task Half-Done for 2 Weeks – Here’s What It Taught Me

Introduction When you work with AWS infrastructure for some time, you realise that not all problems announce themselves with alerts or outages. Some problems stay quiet, blend into the background, and only reveal themselves later, usually when someone asks a question you can’t answer clearly. This is one such experience from my early days of working […]

Vivek Tiwary

DevOps

Powering Real-Time Multiplayer Games on AWS: From Chaos to a Managed Backbone

Real-time multiplayer games are unforgiving. Players don’t care about your flashy server hardware and state-of-the-art network infrastructure; they care about quick, responsive matches, fair competition, and smooth gameplay. When that breaks, they don’t blame latency graphs; they blame the game. At TO THE NEW, we’ve seen this pattern repeatedly: studios that treat […]

Saurabh Jain