Introduction
In early microservices architectures on AWS, services typically communicated through infrastructure components such as load balancers. There was no built-in way for services to discover each other dynamically.
A common architecture looked like this:
Public DNS → Public Application Load Balancer (ALB) → Frontend Service → Private ALB → Backend Service

Figure: Service-to-Service Communication
While this approach worked, it introduced several practical challenges:
- Multiple hops between services increased latency
- Higher operational overhead from managing multiple load balancers
- Increased cost due to additional ALBs
- Static configuration of endpoints (hardcoding or manual updates)
- Scaling complexity, since every new service required routing rules
To improve external access and routing flexibility, teams often added an API layer using Amazon API Gateway:
Client → API Gateway → Services (via host/path-based routing)

Figure: Service-to-Service Communication via API Gateway
This helped centralise routing for incoming traffic, but internal service-to-service communication remained complex. Services still needed a way to:
- Discover each other dynamically
- Avoid hardcoded endpoints
- Scale without manual intervention
These gaps led to native ECS features, first Service Discovery and later Service Connect, which simplify internal communication without requiring load balancers.
ECS Service-to-Service Communication Without ELBs
Before these features, ECS services typically relied on:
- Internal load balancers
- Static DNS entries
- Environment variables with hardcoded service URLs
These approaches worked but did not scale well in dynamic environments where tasks are frequently created and destroyed.
To address this, AWS introduced two key approaches:
- Service Discovery (Cloud Map-based DNS resolution)
- Service Connect (proxy-based service communication abstraction)
Service Discovery
Service Discovery is built on AWS Cloud Map and integrates with Amazon Route 53 to enable services to find each other using DNS names rather than static IP addresses.
Here is how it works:
- When a task in an ECS service starts, ECS registers it with Cloud Map
- Cloud Map maintains a registry of running tasks (IP + port)
- A DNS record is created or updated in Route 53
- Other services resolve the DNS name to get the IP addresses of the running tasks
- Requests are sent directly to the resolved endpoint
Example:
backend.service.local → 10.0.2.45
A frontend service can call:
http://backend.service.local:8080
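
The lookup described above can be sketched in Python with only the standard library. This is a minimal illustration, not ECS or Cloud Map code: the helper names `resolve_ipv4` and `backend_url` are made up for this sketch, and `backend.service.local` is the example name from above, which only resolves inside a VPC that has such a namespace.

```python
import socket


def resolve_ipv4(hostname: str, port: int) -> list[str]:
    """Return the distinct IPv4 addresses the resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses: list[str] = []
    # Each entry is (family, type, proto, canonname, (ip, port)).
    for *_, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def backend_url(hostname: str, port: int, path: str = "/") -> str:
    """Build the URL a caller uses; configuration holds the name, never an IP."""
    return f"http://{hostname}:{port}{path}"


if __name__ == "__main__":
    # "localhost" stands in for the Cloud Map name so the sketch runs anywhere.
    print(resolve_ipv4("localhost", 8080))
    print(backend_url("backend.service.local", 8080, "/health"))
```

Because the caller stores only the DNS name, tasks can be replaced without any change to the caller's configuration.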

Figure: How Service Discovery Works
Key Concepts:
- Namespace: A logical grouping of services under a domain (e.g., service.local)
- Service Registry: Keeps track of healthy tasks and their endpoints
- DNS Records: Automatically managed records that map service names to IP addresses
What is a Cloud Map Namespace?
- It is a private registry for microservices.
- When an ECS service is configured with a namespace, the service is registered in Cloud Map under that namespace.
- The DNS name takes the form <discovery name>.<namespace>
- There are two types of namespaces:
  - API only: no Route 53 record is created, so application code must look up endpoints through the Cloud Map API. Harder to manage.
  - API + DNS: a Route 53 record is created, and services use this DNS name to communicate internally or externally. Easier to manage.
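
The naming scheme and the API-only lookup can be sketched as follows. This assumes a client that behaves like `boto3.client("servicediscovery")` and uses its `discover_instances` call. `FakeCloudMapClient`, its sample data, and the `default_port` fallback are hypothetical, included only so the sketch runs offline.

```python
from typing import Any


def discovery_dns_name(discovery_name: str, namespace: str) -> str:
    """Compose the <discovery name>.<namespace> DNS name."""
    return f"{discovery_name}.{namespace}"


def discover_endpoints(
    client: Any, namespace: str, service: str, default_port: int = 8080
) -> list[tuple[str, int]]:
    """Return (ip, port) pairs for healthy instances via the Cloud Map API."""
    response = client.discover_instances(
        NamespaceName=namespace,
        ServiceName=service,
        HealthStatus="HEALTHY",
    )
    endpoints = []
    for instance in response.get("Instances", []):
        attrs = instance.get("Attributes", {})
        ip = attrs.get("AWS_INSTANCE_IPV4")
        if ip is None:
            continue  # skip instances registered without an IPv4 address
        # Assumption: fall back to default_port if no port attribute is present.
        port = int(attrs.get("AWS_INSTANCE_PORT", default_port))
        endpoints.append((ip, port))
    return endpoints


class FakeCloudMapClient:
    """Offline stand-in for the real client (hypothetical sample data)."""

    def discover_instances(self, **kwargs: Any) -> dict:
        return {
            "Instances": [
                {
                    "InstanceId": "task-1",
                    "Attributes": {
                        "AWS_INSTANCE_IPV4": "10.0.2.45",
                        "AWS_INSTANCE_PORT": "8080",
                    },
                }
            ]
        }


if __name__ == "__main__":
    print(discovery_dns_name("backend", "service.local"))
    print(discover_endpoints(FakeCloudMapClient(), "service.local", "backend"))
```

In a real deployment the fake client would be replaced by the boto3 client, and the task role would need permission to call `DiscoverInstances`.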

Figure: Types of Cloud Map Namespaces
Service Connect
Service Connect is a more advanced, managed approach. Like Service Discovery, it builds on Cloud Map, but it adds a service mesh-like experience without requiring complex setup.
It runs a sidecar proxy (based on Envoy) inside each ECS task to manage communication between services, so no Route 53 records are needed for name resolution. The mappings between Service Connect endpoint names and IP addresses are populated automatically in each container's /etc/hosts file.
Here is how it works:
- Each task runs a sidecar proxy container
- Applications communicate with services using logical names
- The proxy handles:
  - Service discovery
  - Load balancing
  - Retries and timeouts
  - Traffic routing
Example:
http://backend:8080
No need to manage DNS records or endpoints manually.
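
As a rough sketch of how such a logical name is declared, the snippet below builds the `serviceConnectConfiguration` block that an ECS service definition accepts. The namespace `internal` and port name `http` are assumptions for illustration. The port name has to match a named port mapping in the task definition.

```python
import json


def service_connect_config(
    namespace: str, port_name: str, alias: str, port: int
) -> dict:
    """Build a serviceConnectConfiguration block for an ECS service definition."""
    return {
        "enabled": True,
        "namespace": namespace,
        "services": [
            {
                # Must match the name of a port mapping in the task definition.
                "portName": port_name,
                "discoveryName": alias,
                # The name and port that client applications connect to.
                "clientAliases": [{"dnsName": alias, "port": port}],
            }
        ],
    }


if __name__ == "__main__":
    config = service_connect_config("internal", "http", "backend", 8080)
    print(json.dumps(config, indent=2))
```

With a configuration like this, clients in the same namespace would reach the service at http://backend:8080, matching the example above.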

Figure: Service Connect Flow
Key Concepts:
- Transparent service-to-service communication
- Built-in load balancing across tasks
- Automatic retries and failure handling
- Centralised traffic management via proxy
- Better observability (metrics, logs, tracing)
Working Summary:
- The application sends a request to the local proxy
- The proxy resolves the service name internally
- The proxy routes the request to a healthy task
- The proxy handles failures and retries automatically
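
The flow above can be modelled with a toy proxy. This is an illustration of the idea only, not how Envoy or Service Connect is implemented: `ToyProxy`, the round-robin policy, and the retry count are all assumptions made for the sketch.

```python
from itertools import cycle
from typing import Callable, Iterable, Optional


class ToyProxy:
    """Toy sidecar model: round-robin over endpoints, retry on failure."""

    def __init__(self, endpoints: Iterable[str], max_attempts: int = 3) -> None:
        endpoint_list = list(endpoints)
        if not endpoint_list:
            raise ValueError("at least one endpoint is required")
        self._endpoints = cycle(endpoint_list)
        self._max_attempts = max_attempts

    def request(self, send: Callable[[str], str]) -> str:
        """Call send(endpoint), moving to the next endpoint after a failure."""
        last_error: Optional[Exception] = None
        for _ in range(self._max_attempts):
            endpoint = next(self._endpoints)
            try:
                return send(endpoint)
            except ConnectionError as exc:
                last_error = exc  # retry against the next endpoint
        raise ConnectionError(
            f"all {self._max_attempts} attempts failed"
        ) from last_error


if __name__ == "__main__":
    def flaky_send(endpoint: str) -> str:
        # Pretend the first task has died.
        if endpoint == "10.0.2.45":
            raise ConnectionError("task is gone")
        return f"ok from {endpoint}"

    proxy = ToyProxy(["10.0.2.45", "10.0.2.46"])
    print(proxy.request(flaky_send))
```

The application never sees the failed attempt, which is the behaviour the sidecar provides without any retry logic in application code.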
Service Discovery Pros and Cons
Pros:
- Simple architecture [name → DNS → IP → direct TCP/HTTP connection].
- No sidecar container.
- Works well with any DNS‑capable client (Lambda, EC2, on‑prem, scripts).
- Supports blue/green and other patterns that depend on DNS/Cloud Map records.
Cons:
- Failover speed depends on DNS TTL and client caching. Some clients keep using dead IPs for a while.
- Client libraries must handle retries, timeouts, and load balancing themselves.
- Limited built‑in observability. Only Route 53 metrics and app‑level logs.
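
To make the second point concrete, here is a minimal sketch of the failover logic a client has to carry when it relies on DNS alone. The function name `connect_with_failover` and the two-second timeout are assumptions, and a production client would also need backoff and periodic re-resolution.

```python
import random
import socket


def connect_with_failover(
    hostname: str, port: int, timeout: float = 2.0
) -> socket.socket:
    """Resolve hostname, then try each IPv4 address until one accepts TCP."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # info[4] is the (ip, port) pair; a set removes duplicate records.
    addresses = list({info[4] for info in infos})
    random.shuffle(addresses)  # crude client-side load balancing
    last_error = None
    for address in addresses:
        try:
            return socket.create_connection(address, timeout=timeout)
        except OSError as exc:
            last_error = exc  # stale or dead IP: try the next record
    raise ConnectionError(
        f"no reachable address for {hostname}:{port}"
    ) from last_error
```

A caller would use it as `connect_with_failover("backend.service.local", 8080)` and close the returned socket when done. Even with this logic, the client still depends on how quickly the DNS records themselves are updated.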
Service Connect Pros and Cons
Pros:
- Faster failover when a task dies, as Envoy maintains live endpoint lists instead of relying on DNS TTL.
- Built‑in traffic features: retries, connection draining, and cross‑VPC service communication using logical names (provided the VPCs have network connectivity).
- Better observability: ECS console shows service graph and real‑time network metrics; Envoy emits detailed logs and latency stats.
- Simplified naming (short logical names with client aliases).
Cons:
- Extra resource cost: an Envoy sidecar per task increases CPU and memory usage.
- Service Connect names only work for ECS tasks that have Service Connect enabled. External or legacy clients still need DNS/Cloud Map or a load balancer in front.
Conclusion
ECS service-to-service communication has evolved significantly from traditional load balancer-based architectures to more dynamic and scalable approaches.
- Service Discovery provides a DNS-based mechanism that allows services to locate each other without hardcoding endpoints. It is simple, cost-effective, and suitable for less complex systems.
- Service Connect takes this further by introducing a proxy-based architecture that abstracts networking complexities and provides built-in load balancing, retries, and observability.
For modern microservices on ECS, Service Connect is often the preferred choice due to its operational simplicity and production-grade features. However, Service Discovery remains relevant for simpler use cases, for clients outside ECS that can only use DNS, or when fine-grained control over client behaviour is required.