As of Oct 20, 3:53 PM PDT, Amazon Web Services (AWS) has fully resolved a large-scale service disruption in its US-EAST-1 Region that began late Sunday night and continued into Monday afternoon, affecting dozens of its most widely used services, including EC2, Lambda, RDS, CloudWatch, and DynamoDB. The issue originated at 11:49 PM PDT on October 19, when DNS resolution failures disrupted access to the regional DynamoDB API endpoint, triggering cascading effects across multiple AWS subsystems.
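For operators who needed to confirm the failure mode at the time, one quick way to distinguish a DNS resolution problem from a service-level error is to test resolution of the regional endpoint directly. The following is a minimal diagnostic sketch in Python; the hostname is AWS's documented regional DynamoDB endpoint, while the retry count and sleep interval are illustrative choices, not anything AWS prescribes.

```python
import socket
import time

# Public DynamoDB endpoint for US-EAST-1. The retry parameters
# below are illustrative only.
ENDPOINT = "dynamodb.us-east-1.amazonaws.com"

def endpoint_resolves(host: str) -> bool:
    """Return True if DNS resolution for `host` succeeds."""
    try:
        socket.getaddrinfo(host, 443)
        return True
    except socket.gaierror:
        return False

# Poll resolution a few times to distinguish a transient resolver
# hiccup from a sustained DNS failure like the one in this incident.
for attempt in range(3):
    if endpoint_resolves(ENDPOINT):
        print(f"{ENDPOINT} resolves (attempt {attempt + 1})")
        break
    print(f"{ENDPOINT} failed to resolve (attempt {attempt + 1})")
    time.sleep(5)
```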
According to AWS, engineers identified and corrected the root cause by 2:24 AM PDT, but lingering impairments in dependent subsystems, notably EC2's internal launch subsystem and Network Load Balancer health checks, prolonged the outage. Those impairments caused network connectivity failures that rippled through several major services, including Lambda, CloudWatch, and DynamoDB. AWS restored Network Load Balancer functionality at 9:38 AM PDT and gradually relaxed throttling on EC2 instance launches and SQS queue processing through the afternoon. By 3:01 PM PDT, all AWS services had returned to normal operation, though some, including AWS Config, Redshift, and Connect, continue processing backlogged data and analytics workloads.
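Because AWS deliberately throttled EC2 launches during the recovery window, callers would have seen retryable throttling errors rather than hard failures. A common client-side mitigation is retry with backoff; below is a minimal sketch using boto3's built-in retry configuration, assuming boto3 is installed and AWS credentials are configured. The attempt count is an illustrative choice.

```python
import boto3
from botocore.config import Config

# "adaptive" retry mode adds client-side rate limiting on top of
# exponential backoff; max_attempts here is an illustrative value.
retry_config = Config(
    region_name="us-east-1",
    retries={"max_attempts": 10, "mode": "adaptive"},
)

ec2 = boto3.client("ec2", config=retry_config)

# With this config, throttled calls (e.g., RequestLimitExceeded during
# a recovery window) are retried automatically before surfacing an error.
response = ec2.describe_instances(MaxResults=5)
print(response["ResponseMetadata"]["HTTPStatusCode"])
```

Adaptive mode tends to behave better than naive retry loops when a provider is deliberately throttling, since it slows the client down instead of hammering an already constrained control plane.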
At the height of the incident, more than 75 AWS services were impacted, with customers reporting elevated API errors, delayed message queues, and EC2 launch failures across multiple Availability Zones in Northern Virginia. The company said it will publish a detailed post-event analysis in the coming days.
• Outage window: Oct 19, 11:49 PM PDT – Oct 20, 3:01 PM PDT
• Root cause: DNS resolution failure for DynamoDB service endpoints in US-EAST-1
• Secondary impacts: EC2 launch subsystem, Network Load Balancer health checks, Lambda event polling
• Region affected: US-EAST-1 (Northern Virginia)
• Services impacted: Over 75, including EC2, Lambda, RDS, DynamoDB, CloudTrail, SageMaker, and EventBridge
• Status as of 3:53 PM PDT: All AWS services fully recovered; some backlogs still processing
“AWS engineers resolved the DNS and internal subsystem impairments and have restored all AWS services to normal operation,” the company stated. “Some services continue to process residual backlogs, which are expected to clear within hours.”
AWS Health Dashboard: https://health.aws.amazon.com/health/status
🌐 Analysis: This incident highlights ongoing dependency challenges tied to AWS's US-EAST-1 Region, which has historically served as a concentration point for many global workloads. Recent large-scale events underscore the growing operational risk of hyperscale concentration and may accelerate customer diversification across regions or providers. Microsoft Azure and Google Cloud have both faced similar single-region disruptions in the past year, reinforcing the case for distributed, multi-region architectures in critical cloud deployments.
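As one concrete illustration of that multi-region posture, a client can fail over reads to a replica region when the primary region's endpoint is unreachable. The sketch below is hypothetical: it assumes a DynamoDB global table named `orders` replicated to us-west-2, and the table name and key shape are invented for the example.

```python
import boto3
from botocore.exceptions import ClientError, EndpointConnectionError

# Hypothetical global table replicated across both regions.
TABLE_NAME = "orders"
REGIONS = ["us-east-1", "us-west-2"]  # primary first, then fallback

def get_item_with_failover(key: dict) -> dict | None:
    """Try each region in order, falling back on connectivity or service errors."""
    last_error = None
    for region in REGIONS:
        client = boto3.client("dynamodb", region_name=region)
        try:
            resp = client.get_item(TableName=TABLE_NAME, Key=key)
            return resp.get("Item")
        except (EndpointConnectionError, ClientError) as err:
            last_error = err  # record and try the next region
    raise last_error

item = get_item_with_failover({"order_id": {"S": "12345"}})
print(item)
```

Reads served from a replica region may lag the primary, since global table replication is asynchronous; that eventual consistency is the usual tradeoff of this pattern.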