AWS Services
Athena vs Redshift vs OpenSearch
Compare Amazon Athena, Amazon Redshift, and Amazon OpenSearch Service for serverless SQL over S3, data warehousing, full-text search, log analytics, dashboards, and SAA-C03 analytics decisions.
After this, you will understand
Analytics service choice becomes clearer when learners separate ad hoc SQL over S3, warehouse BI analytics, and indexed search or log exploration.
Use Athena for serverless SQL over S3, Redshift for data warehouse analytics, and OpenSearch for full-text search and log analytics.
Teams use Redshift for occasional S3 log queries, use Athena for low-latency application search, or use OpenSearch as a relational warehouse.
Classify the question: query lake files, model warehouse data, or search indexed documents and logs.
Think before readingWhich service should you choose for ad hoc SQL over CloudTrail logs stored in S3?
Reading in progress
This page is saved in your local study history so you can continue later.
Study path
Read these in order
Start with the mechanics, then move into the patterns that explain why the system is shaped this way.
Concepts Covered
- Serverless SQL over S3
- Data warehouses
- Full-text search
- Log analytics
- Glue Data Catalog
- Redshift Spectrum
- OpenSearch indexes
- BI dashboards
- Query cost
- SAA-C03 analytics traps
1. Plain-English Mental Model
Athena, Redshift, and OpenSearch are all analytics-adjacent, but they answer different questions.
Athena = query files in S3 with serverless SQL
Redshift = warehouse structured data for repeated BI analytics
OpenSearch = index documents and logs for search and exploration
If you want to ask SQL questions over S3 logs without running a cluster, think Athena.
If you want a modeled data warehouse for business reporting and analytical joins, think Redshift.
If you want full-text search, log exploration, or indexed document analytics, think OpenSearch.
2. Why This Service Exists
Analytics workloads are not all the same.
Some data sits in S3 and only needs occasional SQL queries. Keeping a warehouse running for that can be wasteful. Athena exists for serverless query over lake data.
Some organizations need repeated, modeled, high-concurrency BI over structured historical data. Redshift exists as a managed data warehouse for those workloads.
Some users need to search documents, logs, and events by text and fields in near real time. OpenSearch exists for indexed search and operational analytics.
For SAA-C03, the wording usually reveals the service: serverless SQL over S3, data warehouse, or full-text search/log analytics.
3. The Naive Approach And Where It Breaks
The naive approach is to call every analytics store a database.
That leads to wrong fits.
Redshift can query and analyze data well, but it is not the simplest answer for occasional CloudTrail log queries in S3.
Athena can query S3 with SQL, but it is not a low-latency product search engine or a transactional application database.
OpenSearch can aggregate and visualize logs, but it is not a traditional relational warehouse for complex SQL modeling and BI semantics.
The architecture should match query shape, latency expectation, data layout, and operational model.
4. Core Primitives
Athena primitives include workgroups, queries, result locations, Glue Data Catalog tables, partitions, file formats, compression, and data scanned.
Redshift primitives include provisioned clusters, Serverless workgroups, schemas, tables, columnar storage, distribution styles, sort keys, workload management, snapshots, and Spectrum external tables.
OpenSearch primitives include domains or serverless collections, indexes, documents, shards, replicas, dashboards, ingestion pipelines, and index lifecycle or storage tiers.
Athena reads files. Redshift warehouses modeled data. OpenSearch indexes documents.
5. Architecture Use Cases
Use Athena for querying S3 data lakes, logs, exports, CSV/JSON/Parquet files, and ad hoc datasets:
S3 logs -> Glue Data Catalog -> Athena SQL
Use Redshift for business intelligence and historical analytics:
operational systems -> ETL/ELT -> Redshift -> dashboards
Use OpenSearch for application search and operational log exploration:
logs or documents -> indexing pipeline -> OpenSearch -> search and dashboards
Use Glue to catalog and transform data for Athena or Redshift. Use QuickSight for dashboard visualization on top of suitable data sources.
7. Security Model
Athena security includes IAM, S3 data permissions, Glue Data Catalog permissions, workgroups, query-result bucket controls, Lake Formation where used, and KMS keys.
Redshift security includes IAM, database users and roles, network placement, security groups, encryption, S3 access roles, and audit logs.
OpenSearch security includes domain or collection policies, network placement, IAM, fine-grained access control, encryption, dashboard access, and index-level permissions.
All three can expose sensitive analytical data. Protect query results, logs, indexed documents, and warehouse tables with least privilege.
Do not index or warehouse secrets casually.
8. Reliability And Resilience
Athena reliability depends on S3 data availability, catalog correctness, permissions, and query result configuration. There is no query cluster to patch.
Redshift reliability depends on warehouse mode, snapshots, data pipelines, workload management, and recovery objectives.
OpenSearch reliability depends on shard design, replicas, snapshots, ingestion buffering, and domain or collection configuration.
For all three, data pipelines must be replayable. Analytics systems often fail through stale data, schema drift, or partial ingestion rather than total service outage.
Monitor freshness, failed jobs, query failures, and storage growth.
9. Performance And Scaling
Athena performance depends heavily on partitioning, file size, compression, and columnar formats such as Parquet or ORC.
Redshift performance depends on warehouse capacity, data modeling, distribution, sort keys, compression, workload management, and query design.
OpenSearch performance depends on index mapping, shard count, replica count, ingestion rate, query shape, refresh behavior, and storage tier.
Use the performance model that matches the query. Do not expect Athena to serve product search or OpenSearch to replace a warehouse join model.
10. Cost Model
Athena cost is driven largely by data scanned, plus S3, Glue, and result storage.
Redshift cost is driven by provisioned or serverless compute, storage, snapshots, data transfer, and workload usage.
OpenSearch cost is driven by domain or serverless capacity, storage, replicas, snapshots, and ingestion pipelines.
Athena can be cheaper for occasional queries. Redshift can be more efficient for repeated warehouse workloads. OpenSearch is worth it when indexed search and log exploration justify the cost.
12. SAA-C03 Exam Signals
"Serverless SQL over S3" points to Athena.
"Query CloudTrail, ALB, or VPC Flow Logs in S3" points to Athena.
"Data warehouse" points to Redshift.
"Business intelligence over structured historical data" points to Redshift.
"Full-text search" points to OpenSearch.
"Log analytics with indexed search and dashboards" points to OpenSearch.
"Crawler or ETL job" points to Glue, not Athena, Redshift, or OpenSearch alone.
13. Common Exam Traps
Do not choose Redshift for a small occasional S3 query when Athena fits.
Do not choose Athena for low-latency full-text search.
Do not choose OpenSearch as the warehouse for complex SQL BI.
Do not forget Glue Data Catalog permissions for Athena.
Do not ignore data scanned cost in Athena.
Do not index all logs forever in OpenSearch without retention planning.
15. Related Topics
Review Amazon Athena, Amazon Redshift, Amazon OpenSearch Service, AWS Glue, and Analytics Data Lake On S3.
Official AWS references:
What to study next
These links keep the session moving: read prerequisites first, then open the systems, concepts, and patterns that deepen this page.
Prerequisites
Read these first if the mechanics feel unfamiliar.
More Links
Additional references connected to this page.