Real-time systems are now central to modern businesses. Payments, order updates, customer activity, telemetry, notifications, and analytics all depend on events moving quickly and reliably between services. The challenge is not just speed. The challenge is preserving correctness and resilience when systems are distributed, traffic is variable, and failures are inevitable. Apache Kafka […]
In DevOps, upgrades are rarely exciting. They don’t ship new features (most of the time). They don’t impress clients. They don’t always get leadership applause. And yet, over the years at To The New, one thing has become very clear to us: DevOps teams that do upgrades regularly move faster, stay safer, and break […]
If you’ve ever worked with Kafka, you know the problem: data grows fast. Every click, impression, or event adds up, and before you know it, your Kafka brokers’ disks are full. Disk is not cheap on AWS, so storing everything on broker storage is costly, and scaling up to handle growth feels […]
Imagine you’re managing a busy e-commerce website. Every time a customer places an order, it triggers several events: an email confirmation, a shipping update, a payment confirmation, and much more. From updating the inventory to sending a confirmation email and processing the payment, everything needs to happen instantly and in sync. But how do you […]
Have you ever wondered about real-time data streaming? If not, let’s embark on a journey to explore the Kafka system and its strengths and weaknesses. Before diving in, we should first talk about Big Data, as we are surrounded by an enormous volume of data; […]
Kafka is a distributed streaming platform designed for real-time data pipelines, stream processing, and data integration. AWS Lambda, on the other hand, is a serverless compute service that executes your code in response to events and manages the underlying compute resources for you. In organizations where Kafka plays a central role in streaming and data integration, […]
As companies grow and their data streaming needs change, it’s important to optimize resources to stay efficient and control costs. AWS has rolled out a new feature for Amazon Managed Streaming for Apache Kafka (Amazon MSK) that lets you remove brokers from MSK clusters. This capability enables you to adjust the size of […]
In the world of data management, companies seek to streamline operations and enhance scalability. One key journey involves migrating self-managed Apache Kafka clusters from AWS EC2 to Amazon MSK. We executed such a migration for a client with zero downtime, and this blog shares the insights and strategies involved. Motivations Behind Migration Scalability Limitations: Scaling self-managed […]
Apache Kafka is a distributed system consisting of servers and clients that communicate via a high-performance TCP network protocol. Some components generate events (Producers), and others consume those events (Consumers). Consumers label themselves with a consumer group name so that each record published on a topic will be delivered to one and […]