Amazon OpenSearch Service
Understand OpenSearch Service for managed search, log analytics, observability, domains, indexes, dashboards, security, scaling, and SAA-C03 signals.
After this, you will understand
This page helps you separate full-text search and log analytics from relational databases and metric monitoring.
Amazon OpenSearch Service runs managed OpenSearch clusters or serverless collections for search, log analytics, and observability use cases.
A common mistake is to use RDS or DynamoDB for full-text search, or to expect CloudWatch metrics alone to provide searchable log analytics.
Use OpenSearch when applications or operators need indexed search, log exploration, dashboards, and near-real-time analytics over searchable documents.
Think before reading: Why is OpenSearch different from Redshift?
Concepts Covered
- Amazon OpenSearch Service
- Managed domains
- OpenSearch Serverless collections
- Indexes and documents
- Shards and replicas
- Full-text search
- Log analytics
- OpenSearch Dashboards
- Hot, UltraWarm, and cold storage concepts
- OpenSearch versus CloudWatch, Redshift, and DynamoDB
1. Plain-English Mental Model
Amazon OpenSearch Service is a managed search and log analytics service.
The simple model is:
documents or logs -> OpenSearch indexes -> search, filters, aggregations, dashboards
OpenSearch indexes documents so users and systems can search them quickly. This can mean full-text search in an application, operational log analytics, observability dashboards, or security investigation data.
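To make "indexing" concrete, here is a toy inverted index in Python. It is only a sketch of the general idea behind search engines, not how OpenSearch is implemented internally; the function names and sample documents are made up for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return sorted IDs of documents that contain every query token."""
    tokens = query.lower().split()
    if not tokens:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(matches)


# Hypothetical log lines used only for this example.
docs = {
    1: "payment failed for order 1001",
    2: "order 1002 shipped",
    3: "payment succeeded for order 1003",
}
index = build_inverted_index(docs)
print(search(index, "payment order"))  # [1, 3]
```

The lookup touches only the token entries it needs instead of scanning every document, which is the reason indexed search stays fast as data grows.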
It is not a relational database. It is not a warehouse. It is not only a metrics service.
OpenSearch is useful when the workload says "search these documents" or "explore logs quickly with indexed fields."
2. Why This Service Exists
Applications and operators often need search.
A product catalog needs keyword search and filters. Support teams need to search logs by request ID. Security teams need to explore events. Observability teams need dashboards over structured logs. A relational database can do some search, but it is usually not ideal for full-text search and log analytics at scale.
OpenSearch Service exists to run OpenSearch without teams managing clusters from scratch.
For SAA-C03, it appears in questions about full-text search, log analytics, operational dashboards, near-real-time search over ingested events, OpenSearch Dashboards, indexing data from Kinesis or Logstash-style pipelines, and choosing a search engine rather than a relational database.
3. The Naive Approach And Where It Breaks
The naive pattern is database search:
application -> SQL LIKE queries -> production database
This breaks when search becomes fuzzy, full-text, faceted, high-volume, or log-oriented. It can also hurt the production database.
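By contrast, a search engine expresses fuzzy matching and facets directly in the query. The sketch below builds an OpenSearch query DSL body as a plain Python dictionary. The field names (name, description, category) and the helper name are assumptions for illustration, and category is assumed to be mapped as a keyword field.

```python
import json


def build_catalog_query(text, category=None):
    """Build a search body: fuzzy full-text match plus a facet aggregation."""
    must = [{
        "multi_match": {
            "query": text,
            "fields": ["name^2", "description"],  # boost matches in name
            "fuzziness": "AUTO",                  # tolerate small typos
        }
    }]
    # Optional exact-match filter, for example when a user clicks a facet.
    filters = [{"term": {"category": category}}] if category else []
    return {
        "query": {"bool": {"must": must, "filter": filters}},
        # Facet counts per category, returned alongside the hits.
        "aggs": {"by_category": {"terms": {"field": "category"}}},
    }


body = build_catalog_query("wireles headphones", category="audio")
print(json.dumps(body, indent=2))
```

A body like this would be sent to the index's _search endpoint. Expressing the same typo tolerance and facet counts with SQL LIKE is awkward and usually slow.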
Another naive pattern is dumping logs to S3 and expecting fast interactive search. S3 is durable storage. Athena can query it, but indexed log exploration often points to OpenSearch.
Another mistake is using OpenSearch as the source of truth for application transactions. It is usually better as a searchable index built from authoritative data elsewhere.
OpenSearch is an index and analytics engine, not a replacement for every database.
4. Core Primitives
A domain is a managed OpenSearch cluster.
OpenSearch Serverless uses collections and capacity units instead of user-managed clusters for supported use cases.
An index stores related documents.
A document is a JSON-like record stored in an index.
Shards split index data for scale.
Replicas copy shards for availability and read scale.
OpenSearch Dashboards provides visualization and exploration.
Ingestion can come from applications, Amazon Data Firehose (formerly Kinesis Data Firehose), Logstash-style pipelines, Lambda, CloudWatch Logs subscriptions, or other pipelines.
Storage tiers and instance choices affect performance and cost.
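The primitives above map onto the request body used to create an index. The following sketch builds such a body in Python. The field names and shard counts are illustrative assumptions, not recommendations.

```python
def build_index_body(primary_shards, replicas):
    """Build an index-creation body with shard settings and field mappings."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,  # splits data for scale
                "number_of_replicas": replicas,      # copies of each primary
            }
        },
        "mappings": {
            "properties": {
                "timestamp": {"type": "date"},
                "level": {"type": "keyword"},       # exact-match filtering
                "request_id": {"type": "keyword"},
                "message": {"type": "text"},        # analyzed for full-text search
            }
        },
    }


def total_shard_copies(body):
    """Primaries plus all replica copies the cluster must host."""
    idx = body["settings"]["index"]
    return idx["number_of_shards"] * (1 + idx["number_of_replicas"])


body = build_index_body(primary_shards=3, replicas=1)
print(total_shard_copies(body))  # 6

# A document is a JSON-like record that fits the mapping (hypothetical values).
doc = {
    "timestamp": "2024-01-01T12:00:00Z",
    "level": "ERROR",
    "request_id": "req-123",
    "message": "payment failed for order 1001",
}
```

Three primaries with one replica each means six shard copies in total, which is the number that drives storage and node planning.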
5. Architecture Use Cases
Use OpenSearch for application search:
product updates -> indexing pipeline -> OpenSearch -> search API
Use it for log analytics:
application logs -> Kinesis or Firehose -> OpenSearch -> dashboards and alerts
Use it for operational investigation where users need to search recent logs by fields and text.
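When a custom pipeline (for example a Lambda function) writes logs itself, it typically batches them through the bulk API, which expects newline-delimited JSON: one action line, then one document line, per event. A minimal sketch of building that payload follows; the index name and events are hypothetical, and sending the request (with authentication) is left out.

```python
import json


def to_bulk_body(index_name, events):
    """Serialize events into a newline-delimited bulk payload."""
    lines = []
    for event in events:
        # Action line: index this document into the given index.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(event))
    if not lines:
        return ""
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


events = [
    {"level": "ERROR", "request_id": "req-1", "message": "timeout calling payments"},
    {"level": "INFO", "request_id": "req-2", "message": "order created"},
]
payload = to_bulk_body("app-logs", events)
print(payload)
```

Managed delivery through Firehose handles this batching for you, so a sketch like this matters mainly when you write your own ingestion code.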
Use CloudWatch for metrics and alarms, S3 for durable log archive, Athena for ad hoc SQL over archived logs, and OpenSearch for indexed exploration.
Use Redshift for warehouse analytics when SQL joins and structured BI modeling are the primary need.
7. Security Model
OpenSearch security includes network access, IAM, fine-grained access control, encryption, and dashboard access.
Domains can be public or VPC-based. Production domains should usually avoid broad public exposure.
Fine-grained access control can restrict indexes, documents, and dashboards.
Encryption at rest and node-to-node encryption should be enabled where required.
Ingest pipelines need permissions to write to indexes.
Logs can contain sensitive data. Do not index secrets or raw PII casually. Retention and access controls matter.
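One way to keep secrets out of the index is to scrub events in the pipeline before they are written. The sketch below is a deliberately simple example: the list of sensitive keys and the email pattern are assumptions, and a real pipeline would need rules that match its own data.

```python
import re

# Hypothetical field names that should never reach the index.
SENSITIVE_KEYS = {"password", "authorization", "api_key", "ssn"}
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(event):
    """Return a copy of the event with sensitive values masked."""
    clean = {}
    for key, value in event.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, str):
            # Mask email addresses embedded in free-text fields.
            clean[key] = EMAIL_RE.sub("[EMAIL]", value)
        else:
            clean[key] = value
    return clean


event = {"password": "hunter2", "message": "login by bob@example.com", "status": 200}
print(redact(event))
```

Redaction at ingest time complements, rather than replaces, fine-grained access control and retention limits.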
8. Reliability And Resilience
OpenSearch reliability depends on domain configuration, shard design, replicas, Availability Zone awareness, snapshots, and ingestion behavior.
Replicas improve read availability and recovery from node loss.
Snapshots protect indexes from data loss or accidental deletion.
Ingestion pipelines should handle backpressure. If OpenSearch is unavailable or overloaded, producers should not lose data silently.
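A common way for a producer to handle backpressure is to retry with exponential backoff and park failed batches somewhere durable instead of dropping them. The sketch below assumes a hypothetical send callable that raises ConnectionError when the domain is unavailable or rejecting writes; the in-memory dead_letter list stands in for a durable store such as S3 or a queue.

```python
import time


def send_with_retry(send, batch, dead_letter, max_attempts=4,
                    base_delay=0.5, sleep=time.sleep):
    """Try to send a batch; back off between attempts; never drop silently."""
    for attempt in range(max_attempts):
        try:
            send(batch)
            return True
        except ConnectionError:
            if attempt < max_attempts - 1:
                # Delays grow as 0.5s, 1s, 2s, ... to relieve pressure.
                sleep(base_delay * 2 ** attempt)
    # All attempts failed: keep the batch so it can be replayed later.
    dead_letter.append(batch)
    return False


# Demo with a sender that fails twice, then succeeds (no real sleeping).
calls = {"n": 0}


def flaky_send(batch):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("cluster busy")


dead_letter = []
delays = []
ok = send_with_retry(flaky_send, ["event-1"], dead_letter, sleep=delays.append)
print(ok, delays, dead_letter)  # True [0.5, 1.0] []
```

Services such as Firehose provide similar retry and failed-record backup behavior as configuration, which is often preferable to hand-written retry logic.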
OpenSearch is often a derived index. Rebuilding from S3, Kinesis retention, or source databases should be possible for critical datasets.
9. Performance And Scaling
OpenSearch performance depends on index design, shard count, instance type, storage, query patterns, ingestion rate, refresh interval, and retention.
Too many tiny shards waste resources. Too few, overly large shards limit parallelism and slow recovery.
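A rough way to pick a starting primary shard count is to divide expected index size by a target shard size. The sketch below follows a commonly cited rule of thumb; the 10% indexing overhead and the 30 GiB target are assumptions to adjust for your workload, not fixed limits.

```python
import math


def estimate_primary_shards(source_gib, growth_gib=0.0,
                            target_shard_gib=30.0, indexing_overhead=0.10):
    """Estimate primary shards from data size and a target shard size."""
    if target_shard_gib <= 0:
        raise ValueError("target_shard_gib must be positive")
    # Indexed data is usually somewhat larger than the raw source data.
    total_gib = (source_gib + growth_gib) * (1 + indexing_overhead)
    return max(1, math.ceil(total_gib / target_shard_gib))


# Hypothetical workload: 66 GiB today, 198 GiB of expected growth.
print(estimate_primary_shards(66, growth_gib=198))  # 10
# A small index still needs at least one shard.
print(estimate_primary_shards(5))  # 1
```

Treat the result as a starting point and validate it against real query and ingestion behavior.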
Search workloads and ingestion workloads compete for cluster resources.
Hot data may need fast storage and compute. Older log data may use warm or cold storage patterns depending on configuration.
Serverless can reduce cluster management for supported workloads, but query and index design still matter.
10. Cost Model
OpenSearch cost depends on cluster instance hours or serverless capacity, storage, snapshots, data transfer, and ingestion pipelines.
Keeping every log indexed forever is expensive. Use retention policies and archive older logs to S3.
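Retention is easier to enforce when logs are written to one index per day, because whole indexes can be dropped once they age out. The sketch below picks the indexes that fall outside a retention window. The logs-YYYY.MM.DD naming scheme and the function name are assumptions; in practice an index lifecycle policy can automate the same decision.

```python
from datetime import date, datetime, timedelta


def expired_indexes(index_names, today, retention_days, prefix="logs-"):
    """Return daily indexes whose date is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not a log index we manage
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # name does not follow the daily pattern
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


names = ["logs-2024.01.01", "logs-2024.01.20", "logs-2024.02.01", "metrics-2023.01.01"]
print(expired_indexes(names, today=date(2024, 2, 5), retention_days=30))
# ['logs-2024.01.01']
```

Indexes selected this way would be snapshotted or already archived to S3 before deletion, so the raw logs remain available at lower cost.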
OpenSearch is usually more expensive than storing logs in S3 alone, but it buys fast indexed search and dashboards.
Choose what deserves indexing based on investigation needs, retention, and query latency.
Cost questions often distinguish searchable recent operational data from cheap long-term archive.
12. SAA-C03 Exam Signals
"Full-text search" points to OpenSearch.
"Search application documents or product catalog" points to OpenSearch.
"Log analytics with indexed search and dashboards" points to OpenSearch.
"OpenSearch Dashboards" points to OpenSearch.
"Metrics and alarms" points to CloudWatch.
"Warehouse SQL analytics" points to Redshift.
"Serverless SQL over S3 logs" points to Athena.
13. Common Exam Traps
Do not use RDS as a full-text search engine when OpenSearch is the obvious managed search answer.
Do not confuse OpenSearch with CloudWatch metrics.
Do not store every log forever in OpenSearch without retention planning.
Do not expose OpenSearch domains publicly without strict controls.
Do not assume OpenSearch is the system of record for transactional data.
Do not ignore shard and replica design.
15. Related Topics
Review Amazon CloudWatch, Amazon Kinesis Data Streams, Amazon Athena, Amazon Redshift, and Amazon QuickSight.