AWS Services

Amazon SQS

Understand Amazon SQS as managed message queues, including standard and FIFO queues, visibility timeout, dead-letter queues, retries, scaling, and exam traps.

Foundation | 6 min read | Updated 2026-06-02 | Cloud, Certification, Reliability, Operations
Message Queue, Standard Queue, FIFO Queue, Visibility Timeout, Dead-Letter Queue, Polling, At-Least-Once Delivery

After this, you will understand

How SQS lets AWS systems absorb spikes, decouple producers from consumers, and survive worker failure without dropping work.

Plain version

SQS is a managed queue where producers send messages and consumers poll, process, and delete them.

Decision pressure

Learners often assume a queue guarantees exactly-once processing, or forget about visibility timeout and dead-letter queues.

Exam-ready model

Put SQS between fast producers and slower consumers, tune visibility timeout, make workers idempotent, and alarm on queue depth and message age.

Think before reading

What happens if a consumer receives an SQS message but does not delete it before the visibility timeout expires?

The message becomes visible again and can be received by another consumer, which is why duplicate-safe processing matters.

Concepts Covered

  • Message queues
  • Producers and consumers
  • Standard queues
  • FIFO queues
  • Visibility timeout
  • Dead-letter queues
  • Long polling
  • At-least-once delivery
  • Lambda event source mappings
  • Queue depth alarms

1. Plain-English Mental Model

Amazon Simple Queue Service, or SQS, is a managed message queue.

A producer sends messages to a queue. A consumer polls the queue, receives messages, processes them, and deletes them after successful work.

The simple model is:

producer -> SQS queue -> consumer

The queue decouples producer speed from consumer speed. If producers spike, messages can wait. If consumers fail, messages can become visible again. If work keeps failing, messages can move to a dead-letter queue for inspection.

SQS is not pub/sub by itself. It is a queue. Each message is meant to be processed by a consumer, not broadcast to every subscriber. For fanout, use SNS with SQS subscriptions or another event service.
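
The receive, process, delete cycle described in this section can be sketched as follows. This is a minimal sketch, assuming `sqs` is a boto3 SQS client (for example `boto3.client("sqs")`); `queue_url` and `handler` are hypothetical names supplied by the caller. The client is passed in so the loop can also be exercised with a stub.

```python
def poll_once(sqs, queue_url, handler, wait_seconds=20, max_messages=10):
    """Receive one batch, process each message, delete only on success.

    `sqs` is assumed to be a boto3 SQS client; `handler` is a hypothetical
    callable that raises an exception when processing fails.
    Returns the number of messages that were processed and deleted.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS allows 1 to 10 per receive
        WaitTimeSeconds=wait_seconds,      # long polling, up to 20 seconds
    )
    done = 0
    for message in response.get("Messages", []):
        try:
            handler(message["Body"])
        except Exception:
            # Leave the message alone: it becomes visible again after the
            # visibility timeout and will be retried.
            continue
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
        done += 1
    return done
```

Deleting only after the handler returns is the important design choice: a crash between receive and delete costs a retry, not a lost message.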

2. Why This Service Exists

Direct service-to-service calls create tight coupling.

If checkout calls the email service synchronously and the email service is down, checkout may fail. If image upload calls thumbnail generation directly and thumbnail processing is slow, upload latency suffers. If a traffic spike produces more work than workers can handle, requests fail instead of waiting.

SQS exists to buffer work between systems.

It lets producers hand off tasks durably and lets consumers process at their own pace. This improves resilience, scaling, and failure isolation.

For SAA-C03, SQS appears in questions about decoupling, buffering spikes, asynchronous processing, worker fleets, Lambda consumers, retries, dead-letter queues, standard versus FIFO ordering, and queue-depth-based scaling.

3. The Naive Approach And Where It Breaks

The naive design calls workers directly:

web app -> worker service

If the worker is slow, the web app waits. If the worker is down, the request fails. If traffic spikes, the worker is overloaded.

Another naive design uses SQS but ignores visibility timeout. A worker receives a message and takes longer than the visibility timeout to process it. The message becomes visible again and another worker processes it too.

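Another mistake is expecting standard queues to process each message exactly once in perfect order. Standard queues provide high throughput, at-least-once delivery, and best-effort ordering. FIFO queues preserve order within a message group and suppress duplicate sends within a 5-minute deduplication interval, which is what "exactly-once processing" means there; a consumer that misses the visibility timeout can still receive a message again. FIFO queues also have lower throughput limits and require a message group ID on every message.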

Queues make systems more resilient only when consumers are designed correctly.

4. Core Primitives

A queue stores messages.

A producer sends messages. A consumer receives messages. After successful processing, the consumer deletes the message.

Visibility timeout hides a received message from other consumers for a period of time. If the consumer does not delete the message before the timeout expires, SQS makes it visible again and can deliver it to any consumer.
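
To make the timing concrete, here is a toy in-memory model of visibility timeout. It is an illustration only, not how SQS is implemented; `ToyQueue` and the logical `now` clock are hypothetical names introduced for this sketch.

```python
class ToyQueue:
    """In-memory model of visibility timeout; not the real SQS implementation."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.messages = {}  # message id -> [body, time at which it is visible again]

    def send(self, message_id, body):
        self.messages[message_id] = [body, 0]

    def receive(self, now):
        """Return one visible message and hide it, or None if nothing is visible."""
        for message_id, entry in self.messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return message_id, entry[0]
        return None

    def delete(self, message_id):
        self.messages.pop(message_id, None)


queue = ToyQueue(visibility_timeout=30)
queue.send("m1", "resize image 42")

first = queue.receive(now=0)    # worker A receives the message
hidden = queue.receive(now=10)  # worker B sees nothing: the message is in flight
again = queue.receive(now=31)   # worker A never deleted it, so it is delivered again
print(first, hidden, again)
```

Had worker A called `queue.delete("m1")` before time 30, the third receive would have returned `None` as well.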

Dead-letter queues receive messages that fail processing too many times, based on redrive policy.
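
A redrive policy is set as a JSON string in the source queue's `RedrivePolicy` attribute. The helper below is a minimal sketch; `dlq_attributes` is a hypothetical name, and the ARN and receive count are placeholders chosen by the caller.

```python
import json


def dlq_attributes(dlq_arn, max_receive_count=5):
    """Build the Attributes dict for a source queue that redrives to a DLQ.

    `dlq_arn` is a placeholder for the ARN of an existing dead-letter queue.
    """
    return {
        "RedrivePolicy": json.dumps(
            {
                "deadLetterTargetArn": dlq_arn,
                # After this many receives without a delete, SQS moves the
                # message to the dead-letter queue.
                "maxReceiveCount": max_receive_count,
            }
        )
    }
```

With a boto3 SQS client, the result could be passed as `Attributes=` to `create_queue` or `set_queue_attributes`.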

Standard queues support very high throughput and at-least-once delivery, with best-effort ordering.

FIFO queues preserve ordering within a message group and support deduplication. They require message group IDs and have different scaling patterns.
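
The FIFO-specific fields on a send can be sketched like this. `fifo_send_params` and `entity_id` are hypothetical names; the returned dict is meant to be passed as keyword arguments to a boto3 `send_message` call. Hashing the body for the deduplication ID is one possible choice, not the only one.

```python
import hashlib


def fifo_send_params(queue_url, entity_id, body):
    """Build keyword arguments for a FIFO send_message call.

    `entity_id` (for example an order ID) is a hypothetical field used as the
    message group, so updates for one entity stay in order.
    """
    if not queue_url.endswith(".fifo"):
        raise ValueError("FIFO queue names must end in .fifo")
    return {
        "QueueUrl": queue_url,
        "MessageBody": body,
        "MessageGroupId": entity_id,
        # Same body -> same ID, so a retried send inside the deduplication
        # interval is treated as a duplicate. Not needed when content-based
        # deduplication is enabled on the queue.
        "MessageDeduplicationId": hashlib.sha256(body.encode()).hexdigest(),
    }
```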

Long polling reduces empty responses and can lower cost by waiting for messages before returning.

5. Architecture Use Cases

Use SQS to decouple web requests from background jobs:

API -> SQS -> worker Lambda or EC2 fleet

Use SQS between microservices when the producer should not fail just because the consumer is temporarily down.

Use SQS with Auto Scaling. Workers can scale based on approximate queue depth or message age.

Use SQS as a buffer for image processing, email sending, video transcoding jobs, order workflows, batch tasks, and integration with slower downstream systems.

Use FIFO queues when order matters for a specific entity, such as all updates for one account or order.

Use SNS to fan out one event into multiple SQS queues when multiple independent consumers need their own copy.

7. Security Model

SQS access is controlled by IAM and queue policies.

Producers need permission to send messages. Consumers need permission to receive, delete, and change visibility as needed.

Queue policies can allow cross-account producers or consumers. They are also how an SNS topic, including one in another account, is granted permission to send messages to the queue in a fanout design.
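
For SNS fanout, the queue policy has to allow the SNS service to send messages on behalf of a specific topic. The sketch below builds such a policy; `sns_to_sqs_policy` is a hypothetical helper and both ARNs are placeholders.

```python
import json


def sns_to_sqs_policy(queue_arn, topic_arn):
    """Return a queue policy (JSON string) that lets one SNS topic send to the queue.

    Both ARNs are placeholders supplied by the caller.
    """
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Restrict delivery to this specific topic.
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }
    return json.dumps(policy)
```

The string would typically be applied through the queue's `Policy` attribute, for example with `set_queue_attributes`.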

Encrypt queues with SSE when messages may contain sensitive data. KMS key permissions matter if using customer-managed keys.

Do not put sensitive data in queue names or unprotected message attributes. Avoid storing secrets in message bodies.

Use VPC endpoints when private workloads need private access to SQS APIs.

8. Reliability And Resilience

SQS stores messages redundantly across multiple Availability Zones in a Region, for both standard and FIFO queues.

Visibility timeout and retries help recover from worker failure. Dead-letter queues isolate poison messages so they do not block healthy work forever.

Consumers should be idempotent because duplicates can happen.
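
One common way to make a consumer idempotent is to record an idempotency key for each completed unit of work and skip keys that were already seen. The sketch below keeps the keys in memory purely for illustration; `IdempotentHandler` and the key format are hypothetical, and a real worker fleet would need a shared, durable store.

```python
class IdempotentHandler:
    """Skip work that was already done for a given idempotency key.

    The in-memory set is for illustration only; a real worker fleet would need
    a shared, durable store with an atomic "insert if absent" operation.
    """

    def __init__(self, process):
        self.process = process
        self.seen = set()

    def handle(self, key, body):
        if key in self.seen:
            return False  # duplicate delivery: nothing to do
        self.process(body)
        self.seen.add(key)  # record only after the work succeeded
        return True


sent = []
handler = IdempotentHandler(process=sent.append)
handler.handle("order-17-confirmation", "email order 17")
handler.handle("order-17-confirmation", "email order 17")  # redelivered copy
print(sent)  # one email, not two
```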

Monitor ApproximateAgeOfOldestMessage, ApproximateNumberOfMessagesVisible (backlog), ApproximateNumberOfMessagesNotVisible (in flight), and DLQ depth. A growing backlog or a rising oldest-message age means consumers are falling behind.
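
An alarm on message age might be defined as follows. This is a sketch that only builds the keyword arguments for CloudWatch `put_metric_alarm`; `oldest_message_alarm` is a hypothetical helper, and the period, evaluation count, and threshold are assumptions to tune per workload.

```python
def oldest_message_alarm(queue_name, max_age_seconds, topic_arn):
    """Build keyword arguments for CloudWatch put_metric_alarm.

    `queue_name`, `max_age_seconds`, and `topic_arn` (an SNS topic to notify)
    are placeholders chosen by the caller.
    """
    return {
        "AlarmName": f"{queue_name}-oldest-message-age",
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateAgeOfOldestMessage",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "Statistic": "Maximum",
        "Period": 60,            # seconds per datapoint (assumed)
        "EvaluationPeriods": 5,  # alarm after 5 consecutive breaches (assumed)
        "Threshold": max_age_seconds,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],
    }
```

With a boto3 CloudWatch client this could be applied as `cloudwatch.put_metric_alarm(**oldest_message_alarm(...))`. The same shape works for DLQ depth by switching the metric to ApproximateNumberOfMessagesVisible on the dead-letter queue.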

For FIFO queues, a failed message can block later messages in the same message group until it succeeds or moves to the DLQ.

Set message retention long enough for recovery but not so long that failed workflows hide unnoticed. Retention defaults to 4 days and can be raised to a maximum of 14 days.

9. Performance And Scaling

SQS standard queues scale for high throughput. Consumers scale horizontally by polling in parallel.

FIFO throughput depends on message groups and queue settings. Multiple message groups allow more parallelism while preserving order per group.

Batching send, receive, and delete operations can improve efficiency.
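
Batch calls accept at most 10 entries, so the sender has to chunk its work. The helper below is a minimal sketch; `batch_entries` is a hypothetical name, and each returned list is shaped like the `Entries` argument of a boto3 `send_message_batch` call.

```python
def batch_entries(bodies, batch_size=10):
    """Split message bodies into send_message_batch Entries lists.

    SQS batch calls accept at most 10 entries, so 25 bodies become 3 requests
    instead of 25 single sends.
    """
    if not 1 <= batch_size <= 10:
        raise ValueError("batch_size must be between 1 and 10")
    batches = []
    for start in range(0, len(bodies), batch_size):
        chunk = bodies[start:start + batch_size]
        batches.append(
            # Id only has to be unique within one batch request.
            [{"Id": str(start + i), "MessageBody": body} for i, body in enumerate(chunk)]
        )
    return batches
```

A batch response reports per-entry success and failure, so callers should check the failed entries and resend them rather than assume the whole batch succeeded.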

Long polling reduces empty receives.

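For Lambda consumers, batch size, maximum concurrency, visibility timeout, and function timeout must be aligned. AWS recommends a queue visibility timeout of at least six times the function timeout, which leaves room for retries before the batch becomes visible again. If the function times out repeatedly, messages are retried and, once they exceed the redrive policy's maxReceiveCount, land in the DLQ.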
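
By default, one failed message causes the whole batch to be retried. A handler can instead report only the failed messages, as sketched below. This assumes ReportBatchItemFailures is enabled on the event source mapping; `process` is a hypothetical stand-in for the real business logic.

```python
def process(body):
    """Hypothetical business logic; raises an exception when the work fails."""
    if not body:
        raise ValueError("empty message body")


def handler(event, context):
    """Lambda handler for an SQS event source mapping.

    Returns the IDs of failed messages so only those are retried. This relies
    on ReportBatchItemFailures being enabled on the event source mapping.
    """
    failures = []
    for record in event["Records"]:
        try:
            process(record["body"])
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Messages that are not listed are deleted from the queue by Lambda; listed ones become visible again after the visibility timeout.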

Scale workers based on queue depth and processing time, not only CPU.
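
A backlog-based sizing rule can be sketched as simple arithmetic: decide how many queued messages one worker can absorb while still meeting a latency target, then divide the backlog by that number. The function name, the limits, and the example figures below are assumptions for illustration.

```python
import math


def desired_workers(visible_messages, seconds_per_message, target_latency_seconds,
                    min_workers=1, max_workers=50):
    """Estimate the worker count needed to drain the backlog within the target latency.

    All inputs are assumptions supplied by the operator: the backlog comes from
    ApproximateNumberOfMessagesVisible, the per-message time from measurement.
    """
    # How many queued messages one worker can absorb and still meet the target.
    backlog_per_worker = max(1.0, target_latency_seconds / seconds_per_message)
    needed = math.ceil(visible_messages / backlog_per_worker)
    return max(min_workers, min(max_workers, needed))


# 1200 queued messages, 2 s each, 60 s target: 30 messages per worker.
print(desired_workers(visible_messages=1200, seconds_per_message=2,
                      target_latency_seconds=60))  # -> 40
```

CPU can stay low while the backlog grows (for example when workers wait on a slow downstream system), which is why the queue metric is the better scaling signal.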

10. Cost Model

SQS cost is request-based: you pay per API action, and each 64 KB chunk of a payload is billed as one request. FIFO requests are priced higher than standard requests, and data transfer charges depend on the architecture.

Long polling can reduce empty receive requests.

Batching can reduce request count.

DLQs cost little compared with losing failed messages, but a DLQ is not an archive. SQS retains messages for at most 14 days, so inspect and redrive failed messages before they expire.

SQS can reduce compute cost by smoothing spikes and allowing worker fleets to scale with backlog.

12. SAA-C03 Exam Signals

"Decouple application components" points to SQS.

"Buffer sudden spikes" points to SQS.

"Worker fleet processes background jobs" points to SQS plus EC2, ECS, or Lambda consumers.

"Need strict ordering" points to FIFO queue.

"Failed messages should be isolated" points to dead-letter queue.

"Message becomes visible again after consumer failure" points to visibility timeout.

"Fan out one event to multiple queues" points to SNS plus SQS subscriptions.

13. Common Exam Traps

Do not expect standard queues to guarantee strict ordering.

Do not assume exactly-once processing for normal consumers. Design idempotent handlers.

Do not forget to delete messages after successful processing.

Do not set visibility timeout shorter than processing time.

Do not use one FIFO message group for all messages if you need parallel processing.

Do not forget DLQ retention and alarms.

Review AWS Lambda, Amazon SNS, and Amazon CloudWatch.
