Patching Azure VMs from AWS Systems Manager using Hybrid Activation
Each cloud platform provides its own native tools, which can lead to fragmented processes and increased administrative overhead. To address this challenge, AWS Systems Manager (SSM) offers a powerful solution through its Hybrid Activation feature. This capability allows...
Introduction In a multi-account AWS environment, log management for services such as CloudTrail, VPC Flow Logs, and WAF is a complex and fragmented process. Each account holds its own log data, which makes it hard for security and operations teams to manage centrally. This issue, however, can be addressed...
Introduction Many applications generate large amounts of event data such as alerts, application events, logs, and notifications. This data is usually unstructured and arrives continuously. The first step in building a data engineering pipeline is to store this event data in a reliable, long-term storage...
In AWS environments, visibility is critical. When applications run across multiple services, engineers need tools that help them monitor performance, track user activity, and maintain configuration compliance. Three AWS services commonly used for this purpose are Amazon CloudWatch, AWS CloudTrail, and AWS Config. Although these...
In AWS networking, it is common to configure all required components—subnets, gateways, and route tables—yet still encounter connectivity issues. In most cases, the problem is not with individual components, but with a lack of understanding of how these components interact with each other. This article explains how key VPC...
Introduction In the first part, we explored Palo Alto firewalls, their use cases, and different ways to achieve high availability in AWS. In this second part, we’ll walk through a complete end-to-end setup of an Active/Passive Palo Alto HA deployment within the same Availability Zone. Architecture ...
Introduction I’ll be honest: running a high-traffic production environment on AWS is fun… until you see the cloud bill. At first, you overprovision a bit of memory “just to be safe.” Containers stay up a little longer than needed. Logs? Oh, we log everything because, you know, one day you might need it. And cross-AZ...
Introduction In ad-tech, logs are not “nice to have.” They are the product’s heartbeat. Every impression, every click, every bid request: everything generates logs. Multiply that by millions of requests per minute, and you’re suddenly dealing with millions of events and terabytes of logs per day. That’s exactly where one of our...
Introduction If you’ve worked in production long enough, you’ve probably heard this: “Let’s right-size the services and reduce the AWS bill.” So we do it. We check CPU and memory metrics for a week. We reduce task sizes. Costs drop. Everyone’s happy. And then… six months later, the bill increases again....
It is painfully inefficient to check metrics across a large collection of AWS accounts (development, staging, UAT, production, etc.). This is a major time sink, not just a small irritation. In addition to wasting valuable engineering time, you run a much higher risk of missing an alert that could result in a full-blown outage every time...
Introduction Modern applications are distributed systems consisting of multiple services, containers, and infrastructure components. While this improves scalability, security, and reliability, it also increases the chances of unexpected failures and downtime. Application testing methods mainly focus on application functionality, but...
Introduction Cloud monitoring has evolved over the years, moving from manual, static threshold monitoring to dynamic anomaly monitoring and AI- and ML-based operational tasks. This is where AWS DevOps Guru comes in, as a managed machine learning service for cloud operations. AWS DevOps Guru is an AIOps solution that can...