Secure Partner File Ingest On S3

Design secure partner file ingestion with AWS Transfer Family, Amazon S3, KMS, IAM roles, S3 event notifications, SQS, Lambda, Macie, lifecycle policies, quarantine prefixes, and audit logging.

Intermediate | 5 min read | Updated 2026-06-03 | Cloud, Certification, Security, Operations
Tags: AWS Transfer Family, Amazon S3, SFTP, KMS Encryption, S3 Event Notifications, Quarantine Prefix, Lambda Processing, Macie Findings

After this, you will understand

File ingest is a humble scenario that forces learners to connect identity, storage, events, encryption, quarantine, scanning, and operational retry behavior.

Plain version

Let partners upload through Transfer Family into S3, isolate incoming files, encrypt them, emit events to processing, and move only validated files into trusted prefixes.

Decision pressure

The design fails when uploads land directly in trusted data sets, Lambda processes the same file twice because it is not idempotent, KMS policies block legitimate access, or partners receive broad bucket permissions.

Exam-ready model

Use managed transfer endpoints, per-partner scoped access, encrypted landing prefixes, event-driven processing, quarantine and processed zones, and explicit audit trails.

Think before reading

Why should uploaded files usually land in an incoming or quarantine prefix first?
Because the system should validate, scan, classify, and normalize files before treating them as trusted business data.

Concepts Covered

  • Partner SFTP intake
  • AWS Transfer Family
  • S3 landing buckets
  • Per-partner IAM scoping
  • SSE-KMS
  • S3 event notifications
  • SQS buffering
  • Lambda processing
  • Macie sensitive data findings
  • SAA-C03 integration traps

1. Situation

A company receives daily files from external partners. Partners already use SFTP and cannot easily change their systems. The company wants the files in S3 so downstream analytics and processing can use AWS-native services.

The files may contain sensitive business data. The system needs encryption, partner isolation, audit logs, retry behavior, and a clear boundary between untrusted uploads and trusted processed data.

The design question is:

How do we keep familiar partner transfer workflows while making the backend cloud-native and controlled?

AWS Transfer Family provides managed SFTP, FTPS, FTP, AS2, and browser-based transfer options backed by S3 or EFS. S3 becomes the durable landing layer.

2. Naive Design

The naive design gives every partner an IAM user with broad S3 access and asks them to upload files directly with AWS credentials.

That creates credential risk and asks partners to change tooling.

Another naive design lets partners upload directly into the final analytics prefix. Downstream jobs may process incomplete, malformed, duplicated, malicious, or wrongly named files.

A third mistake is using S3 events to invoke Lambda directly for every upload without thinking about retries, duplicates, file size, and idempotency.

File ingest is not just receiving bytes. It is building a trust transition.

3. What Breaks

Isolation breaks when one partner can list or overwrite another partner's files.

Processing breaks when S3 event notifications are delivered more than once and the processor is not idempotent.

Security breaks when encrypted objects require KMS permissions that the processing role does not have.

Compliance breaks when sensitive files are stored without classification, retention, or audit.

Operations break when failed processing has no dead-letter path and no one knows which files are stuck.

4. AWS Architecture

Use AWS Transfer Family to expose the protocol partners already use, commonly SFTP.

Back the transfer server with S3. Scope each partner to a specific prefix or logical directory. Use IAM roles and policies to limit access.
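One way to express per-partner scoping is a Transfer Family logical directory mapping, which presents the partner's prefix as the root of its SFTP session. The sketch below only builds the request parameters as a plain dictionary. The server ID, bucket name, and role ARN are hypothetical, and the field names are intended to mirror the Transfer Family CreateUser API, so check them against the current API reference before relying on them.

```python
def build_transfer_user(server_id: str, partner: str, bucket: str, role_arn: str) -> dict:
    """Build CreateUser-style parameters that confine a partner to its own prefix."""
    # Reject names that could escape the intended prefix (for example "../other").
    if not partner.replace("-", "").isalnum():
        raise ValueError(f"unexpected partner name: {partner!r}")
    return {
        "ServerId": server_id,
        "UserName": partner,
        "Role": role_arn,
        # LOGICAL mappings hide the real bucket layout from the partner.
        "HomeDirectoryType": "LOGICAL",
        "HomeDirectoryMappings": [
            {"Entry": "/", "Target": f"/{bucket}/incoming/{partner}"}
        ],
    }


# Hypothetical values, for illustration only.
params = build_transfer_user(
    "s-0123456789abcdef0",
    "partner-a",
    "partner-ingest",
    "arn:aws:iam::111122223333:role/partner-a-transfer",
)
```

With this mapping the partner sees "/" in its client while files are written under incoming/partner-a/ in the bucket.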

Land uploads in an incoming or quarantine prefix:

s3://partner-ingest/incoming/partner-a/date/file.csv

Encrypt objects with SSE-S3 by default or SSE-KMS when customer-managed key control, audit, or cross-account access requires it.

Use S3 event notifications to send object-created events to SQS, EventBridge, or Lambda. SQS is often a useful buffer between S3 and processing.
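As a rough illustration of the S3-to-SQS wiring, the following sketch builds a bucket notification configuration that only fires for object-created events under the incoming/ prefix. The queue ARN and rule ID are hypothetical. The dictionary shape is meant to match what the S3 PutBucketNotificationConfiguration API accepts; treat it as a starting point rather than a verified template.

```python
def build_notification_config(queue_arn: str, prefix: str = "incoming/") -> dict:
    """Send object-created events for one prefix to an SQS queue."""
    return {
        "QueueConfigurations": [
            {
                "Id": "incoming-object-created",  # hypothetical rule name
                "QueueArn": queue_arn,
                "Events": ["s3:ObjectCreated:*"],
                # The prefix filter keeps writes to processed/ or quarantine/
                # from generating new events, which avoids processing loops.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }


config = build_notification_config(
    "arn:aws:sqs:us-east-1:111122223333:partner-ingest-events"
)
```

Filtering on the landing prefix is a deliberate choice: the processor writes its output elsewhere, so its own writes never re-enter the queue.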

Use Lambda, Step Functions, Glue, or container jobs to validate schema, scan, transform, tag, and move files into trusted processed prefixes.

Use Macie when sensitive data discovery is part of the requirement.

5. Request Or Data Flow

A partner connects to the Transfer Family endpoint using its normal SFTP client.

Transfer Family authenticates the user and writes the file into the mapped S3 prefix.

S3 emits an object-created event. The event goes to SQS, EventBridge, or Lambda depending on the design.

The processor reads the object, validates file name, size, checksum, schema, and partner ownership. It may scan, classify, tag, or enrich metadata.
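The validation step can be sketched as a pure function. This is a minimal example under several assumptions that are not in the scenario itself: keys look like incoming/<partner>/<yyyy-mm-dd>/<file>.csv, the partner supplies a SHA-256 checksum out of band, and 500 MiB is the size limit. It also reads the whole body into memory, which would not suit very large files.

```python
import hashlib
import re
from dataclasses import dataclass

# Assumed key layout; the date format is a hypothetical convention.
KEY_PATTERN = re.compile(
    r"^incoming/(?P<partner>[a-z0-9-]+)/(?P<date>\d{4}-\d{2}-\d{2})/(?P<name>[\w.-]+\.csv)$"
)
MAX_BYTES = 500 * 1024 * 1024  # hypothetical size limit


@dataclass
class ValidationResult:
    ok: bool
    reason: str = ""


def validate_upload(key: str, body: bytes, expected_sha256: str, expected_partner: str) -> ValidationResult:
    """Check file name, partner ownership, size, and checksum in that order."""
    match = KEY_PATTERN.match(key)
    if match is None:
        return ValidationResult(False, "key does not match the expected layout")
    if match["partner"] != expected_partner:
        return ValidationResult(False, "key prefix belongs to a different partner")
    if not 0 < len(body) <= MAX_BYTES:
        return ValidationResult(False, "file is empty or exceeds the size limit")
    if hashlib.sha256(body).hexdigest() != expected_sha256:
        return ValidationResult(False, "checksum mismatch")
    return ValidationResult(True)
```

Returning a reason alongside the verdict gives the quarantine step something concrete to record.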

If valid, the processor writes the file to a trusted prefix:

processed/partner-a/yyyy/mm/dd/file.parquet

If invalid, it moves or tags the file for quarantine and records the reason.
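The routing decision above can be expressed as a small key-mapping function. The sketch assumes incoming keys of the form incoming/<partner>/<yyyy-mm-dd>/<file> and a quarantine/ prefix; both are illustrative conventions. It only computes the destination key and does not perform the copy or the Parquet conversion.

```python
from pathlib import PurePosixPath


def destination_key(incoming_key: str, valid: bool) -> str:
    """Map an incoming key to its processed or quarantine destination."""
    parts = PurePosixPath(incoming_key).parts
    if len(parts) != 4 or parts[0] != "incoming":
        raise ValueError(f"unexpected key layout: {incoming_key!r}")
    _, partner, date, filename = parts
    if not valid:
        # Keep the original name so the rejected file is easy to trace.
        return f"quarantine/{partner}/{date}/{filename}"
    yyyy, mm, dd = date.split("-")
    stem = PurePosixPath(filename).stem
    return f"processed/{partner}/{yyyy}/{mm}/{dd}/{stem}.parquet"


ok_key = destination_key("incoming/partner-a/2026-06-03/file.csv", valid=True)
bad_key = destination_key("incoming/partner-a/2026-06-03/file.csv", valid=False)
```

Keeping the partner and date in both destinations makes replay and per-partner lifecycle rules straightforward.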

Downstream analytics should read from processed prefixes, not raw incoming prefixes.

6. Security Controls

Use per-partner users or identity provider mappings. Restrict each partner to its own prefix.

Do not share broad bucket permissions. Use scoped IAM and bucket policies.
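A scoped policy for one partner might look like the sketch below. It assumes an upload-only partner and the incoming/<partner>/ layout used earlier; the bucket name is hypothetical. Some SFTP clients also need read or delete permissions, so the action list is an assumption to adjust per partner.

```python
import json


def partner_policy(bucket: str, partner: str) -> dict:
    """Build an IAM policy limited to one partner's landing prefix."""
    prefix = f"incoming/{partner}/"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListOwnPrefixOnly",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                # Listing is allowed only inside the partner's own prefix.
                "Condition": {"StringLike": {"s3:prefix": [prefix, f"{prefix}*"]}},
            },
            {
                "Sid": "WriteOwnPrefixOnly",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


print(json.dumps(partner_policy("partner-ingest", "partner-a"), indent=2))
```

Note that the list permission attaches to the bucket ARN with a prefix condition, while the write permission attaches to object ARNs under the prefix.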

Use KMS key policies that allow Transfer Family, S3, and processing roles to perform the required encryption and decryption operations.
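For illustration, a key policy statement granting a processing role the two operations it typically needs with SSE-KMS could be built like this. The role ARN is hypothetical, and a real key policy would also need statements for key administrators and for the Transfer Family role.

```python
def processing_key_statement(role_arn: str) -> dict:
    """Key policy statement letting a role read and write SSE-KMS objects."""
    return {
        "Sid": "AllowProcessingRoleUseOfKey",
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": [
            "kms:Decrypt",          # read objects from the landing prefix
            "kms:GenerateDataKey",  # write encrypted objects to processed/
        ],
        # Inside a key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
    }


statement = processing_key_statement(
    "arn:aws:iam::111122223333:role/partner-ingest-processor"
)
```

If the processor can reach the object in S3 but this statement (or an equivalent IAM grant) is missing, reads fail with an access-denied error even though the S3 permissions look correct.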

Use S3 Block Public Access. Partner access should go through the intended transfer endpoint or defined access path.

Use CloudTrail (including S3 data events for object-level activity) and S3 server access logs where audit requirements demand them.

Use Macie for sensitive data discovery in S3 when regulated data or data leakage is a concern.

7. Resilience Controls

Transfer Family is a managed service, so AWS operates the redundant endpoint fleet instead of you running SFTP servers. S3 provides durable storage for uploaded files.

Use SQS between S3 and processors when you want buffering, retries, and dead-letter queues.
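A Lambda consumer for that queue might be shaped like the sketch below. It assumes the queue receives S3 notifications directly (not wrapped by SNS) and that the event source mapping has partial batch responses enabled, so only failed messages are retried and eventually reach the dead-letter queue. The make_handler wrapper and the process callback are illustrative names.

```python
import json
from urllib.parse import unquote_plus


def extract_objects(sqs_body: str) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs from one S3 notification delivered via SQS."""
    payload = json.loads(sqs_body)
    objects = []
    # S3 test events carry no "Records" list, so default to empty.
    for record in payload.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def make_handler(process):
    """Wrap a process(bucket, key) callable in a Lambda-style SQS handler."""
    def handler(event, context):
        failures = []
        for message in event["Records"]:
            try:
                for bucket, key in extract_objects(message["body"]):
                    process(bucket, key)
            except Exception:
                # Report only this message as failed; SQS will redeliver it.
                failures.append({"itemIdentifier": message["messageId"]})
        return {"batchItemFailures": failures}
    return handler
```

Returning failures per message keeps one bad file from forcing a retry of the whole batch.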

Make processors idempotent. S3 event notifications are at-least-once, so the same file event can be seen more than once.
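Idempotency can be sketched with a ledger that records which object versions were already handled. The in-memory set below is only a stand-in: a real system would need a durable store with conditional writes, and the choice of (bucket, key, version) as the identity is an assumption, where version could be an S3 version ID or ETag.

```python
class ProcessedLedger:
    """In-memory stand-in for a durable store with conditional writes."""

    def __init__(self):
        self._seen = set()

    def claim(self, bucket: str, key: str, version: str) -> bool:
        """Return True only for the first caller to claim this object version."""
        token = (bucket, key, version)
        if token in self._seen:
            return False
        self._seen.add(token)
        return True

    def release(self, bucket: str, key: str, version: str) -> None:
        self._seen.discard((bucket, key, version))


def process_once(ledger, bucket, key, version, work) -> bool:
    """Run work(bucket, key) at most once per object version."""
    if not ledger.claim(bucket, key, version):
        return False  # duplicate delivery, nothing to do
    try:
        work(bucket, key)
    except Exception:
        # Release the claim so a retry can attempt the work again.
        ledger.release(bucket, key, version)
        raise
    return True
```

A duplicate event for the same object version then becomes a cheap no-op instead of a second write into the processed prefix.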

Separate incoming, quarantine, failed, and processed prefixes. This makes recovery and replay easier.

Keep original raw files until downstream validation is complete and retention rules allow deletion.

8. Performance Controls

Transfer performance depends on partner network conditions, file size, protocol behavior, and endpoint configuration.

For many small files, event overhead and Lambda concurrency can dominate. Batch processing or queue buffering may be better.

For large files, ensure the processor can handle object size and timeout limits. Lambda may not be the right processor for every file.
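One rough way to reason about that choice is to estimate processing time from object size and compare it with Lambda's 15-minute maximum timeout. The throughput estimate, the safety factor, and the "container-job" label below are hypothetical; a real decision would also consider memory and temporary storage.

```python
LAMBDA_MAX_TIMEOUT_SECONDS = 900  # Lambda's maximum timeout is 15 minutes


def choose_processor(size_bytes: int, est_bytes_per_second: int, safety_factor: float = 0.5) -> str:
    """Pick a processor type from a crude processing-time estimate."""
    if est_bytes_per_second <= 0:
        raise ValueError("throughput estimate must be positive")
    estimated_seconds = size_bytes / est_bytes_per_second
    # Leave headroom: only use Lambda if the estimate fits in a fraction
    # of the maximum timeout.
    budget = LAMBDA_MAX_TIMEOUT_SECONDS * safety_factor
    return "lambda" if estimated_seconds <= budget else "container-job"


small = choose_processor(10 * 1024**2, 5 * 1024**2)   # about 2 seconds of work
large = choose_processor(50 * 1024**3, 5 * 1024**2)   # about 10,240 seconds of work
```

The object size is available in the S3 event itself, so this check can run before the processor downloads anything.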

Use S3 prefix design for discoverability and lifecycle. Avoid downstream jobs that scan the entire bucket when they only need a partition.

9. Cost Controls

Transfer Family has endpoint and transfer-related costs. S3 costs include storage, requests, lifecycle, and retrieval.

KMS request costs can matter for high-volume upload workflows. S3 Bucket Keys can reduce the number of KMS requests made for SSE-KMS objects.

Lambda, SQS, EventBridge, Glue, Macie, and CloudWatch logs all add cost depending on volume.

Lifecycle rules should transition or expire raw, failed, and processed data according to retention requirements.
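As an illustration, per-prefix lifecycle rules could be described with a configuration like the one below. The retention periods, the failed/ prefix, and the GLACIER storage class are placeholder choices, not recommendations; the dictionary shape is meant to follow the S3 PutBucketLifecycleConfiguration API.

```python
def lifecycle_rules(raw_days: int = 30, failed_days: int = 90, archive_after_days: int = 180) -> dict:
    """Build lifecycle rules for raw, failed, and processed prefixes."""
    return {
        "Rules": [
            {
                "ID": "expire-raw-incoming",
                "Filter": {"Prefix": "incoming/"},
                "Status": "Enabled",
                "Expiration": {"Days": raw_days},
            },
            {
                "ID": "expire-failed",
                "Filter": {"Prefix": "failed/"},
                "Status": "Enabled",
                "Expiration": {"Days": failed_days},
            },
            {
                "ID": "archive-processed",
                "Filter": {"Prefix": "processed/"},
                "Status": "Enabled",
                # Move older processed data to a colder storage class.
                "Transitions": [{"Days": archive_after_days, "StorageClass": "GLACIER"}],
            },
        ]
    }


rules = lifecycle_rules()
```

Separate prefixes are what make this possible: each trust zone gets its own retention without tagging individual objects.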

Avoid processing loops where a Lambda writes to the same prefix that triggers itself.

10. Exam Variants

"Partners require SFTP but data should land in S3" points to AWS Transfer Family.

"Trigger processing when a file arrives" points to S3 event notifications with Lambda, SQS, SNS, or EventBridge.

"Need decoupled retryable processing" often points to SQS between S3 and compute.

"Detect sensitive data in S3 uploads" points to Macie.

"Encrypt with customer-managed keys" points to SSE-KMS and key policy permissions.

"Prevent one partner from seeing another partner's files" points to scoped IAM access and prefix isolation.

11. Common Traps

Do not give partners broad S3 permissions.

Do not process untrusted uploads as final data.

Do not forget S3 event duplicate delivery.

Do not trigger Lambda on a prefix where Lambda writes output unless the filter prevents loops.

Do not use KMS without granting the processing roles decrypt permissions.

Do not assume SFTP means you must run EC2 file servers.

Review AWS Transfer Family, Amazon S3, AWS Key Management Service, Amazon SQS, AWS Lambda, and Amazon Macie.
