Windows/NFS file operation
→ FPolicy instant detection (CREATE/WRITE/DELETE/RENAME)
→ FPolicy Server → SQS → Bridge Lambda → EventBridge custom bus
→ EventBridge Rule (file_path prefix = ai_knowledge)
→ KB Trigger Lambda (debounce) → StartIngestionJob
→ Bedrock KB → reflected in tens of seconds to minutes
The FPolicy → SQS → EventBridge front-end reuses the existing solutions/event-driven/fpolicy pattern infrastructure. UC29 adds only an EventBridge rule and the KB Trigger Lambda.
Lost-Update Window (Critical)
EventBridge rule filtering FPolicy events for ai_knowledge prefix
The EventBridge rule routes only FPolicy events matching the ai_knowledge volume path to the KB Trigger Lambda.
Bedrock Ingestion performs a full-source scan at job start time. Files added during a running job are not included in that execution. Scenario C alone does not guarantee zero missed files.
Mandatory: Always pair Scenario C with Scenario B (periodic reconcile sync) as a safety net. The KB Trigger Lambda skips when a job is already in progress (debounce + ConflictException handling + reserved concurrency = 2).
Namespace Pitfall
FPolicy reports ONTAP volume-path namespace (ai_knowledge/..., underscore). The KB S3 ingestion prefix (ai-knowledge/, hyphen) is a different namespace. Initial implementation confused the two, causing false-skip. The EventBridge rule and Lambda secondary filter now use a dedicated FPOLICY_PATH_FILTER parameter for the volume-path namespace.
Hybrid RAG: Internal KB + Web Search (opt-in)
GA at AWS Summit NYC 2026 (June 17, 2026). Powered by AgentCore Web Search Tool.
Enterprise knowledge from FSx for ONTAP is treated as the primary internal source, while public web context is supplemental and untrusted. For questions that benefit from current external context — regulatory updates, market trends, public market information — the Query Lambda can optionally augment answers with real-time web search results.
How It Works
-
Internal KB retrieval (always): Bedrock KB searches S3 Vectors for relevant chunks from FSx for ONTAP documents
-
Web search (opt-in): AgentCore Gateway invokes Amazon's purpose-built web index via MCP protocol
-
Unified answer: Bedrock Converse merges both contexts, with internal documents as primary source
Key Design Decisions
| Decision |
Rationale |
Opt-in (EnableWebSearch=false default) |
Most enterprise QA needs internal data only |
| Graceful degradation |
Web Search failure → internal-only answer (no error surfaced to user) |
|
Cross-region (us-east-1 Gateway) |
Web Search Tool is us-east-1 only; adds ~100-200ms latency |
| Query safety |
Only user's question text is sent to Web Search — never internal document content |
| Citation separation |
[Internal: filename] vs [Web: title](URL) — users see exactly which source informed each claim |
| Prompt injection defense |
Web results wrapped in <web_search_results> with explicit "untrusted data" instruction |
| Acceptable Use compliance |
Source URLs and titles are always displayed (Web Search Tool TOS requirement) |
Deployment
sam deploy --parameter-overrides \
EnableWebSearch=true \
AgentCoreGatewayId=<gateway-id> \
AgentCoreGatewayRegion=us-east-1
Example Response
{"status":"completed","query":"What are the latest FISC guidelines for cloud data protection?","answer":"Based on internal documentation, our current FISC compliance posture covers... Additionally, [Web: FISC 2026 Guidelines Update](https://example.com/fisc-2026) published last month introduces...","citations":[{"source":"s3://.../legal/compliance/fisc-overview.pdf"}],"web_citations":[{"source":"https://example.com/fisc-2026","title":"FISC 2026 Guidelines Update","type":"web"}],"web_search_enabled":true}
Verification Highlights
Windows-Identity S3 Access Point with Dedicated AD
Windows Explorer Quick Access showing mapped FSx for ONTAP SMB share
The Windows EC2 domain-joined to AWS Managed Microsoft AD, with the FSx SMB share mapped — proving the literal drag & drop experience works end to end.
To demonstrate the literal Windows drag & drop experience, we built a dedicated AWS Managed Microsoft AD + domain-joined Windows EC2 + AD-joined SVM:
-
AD-joined SVM OU: AWS Managed AD's
OU=Computers lacks delegation rights → use the domain-name OU (OU=<domain>,DC=...)
-
CIFS share creation: Executes against the filesystem management LIF, not the SVM LIF
-
Windows-identity S3 AP: Works correctly with a running dedicated AD; files dropped in Explorer are readable via S3 AP
Deletion Lifecycle
Bedrock KB data source showing Sync button and ingestion status
The Bedrock KB data source connected to the FSx for ONTAP S3 AP alias. Click "Sync" for manual ingestion, or let Scenario B/C automate it.
Step Functions execution graph — all states succeeded
Scenario B's Step Functions workflow: detect changes → start ingestion → poll status → notify on completion.
"User deletes a file → AI forgets it" verified end-to-end: file deletion → next sync → numberOfDocumentsDeleted=1 → re-query returns "no information found". Powered by dataDeletionPolicy=DELETE. For urgent revocation between syncs, call the Ingestion API directly.
Performance Considerations
-
Shared bandwidth: S3 AP reads share the FSx throughput capacity (128/256/512 MBps) with NFS/SMB workloads. Scenario B's 15-minute interval and Scenario C's reserved concurrency (2) throttle ingestion flow
-
Bulk re-index: For full re-ingestion (e.g., embedding model change), use a FlexClone volume as the Ingestion target — zero impact on production I/O, consistent point-in-time read
-
Tiering: Frequently accessed AI knowledge should remain on the SSD tier. Capacity Pool retrieval latency affects GetObject time during ingestion
-
Web Search latency: Cross-region call to us-east-1 adds ~100-200ms. Total hybrid query latency depends on KB size, model, and network conditions (KB retrieve + Web Search + Converse generation)
Access Control — Three Layers
S3 AP boundaries are volume/prefix-level. For per-user visibility:
-
Search narrowing = Bedrock KB metadata filters (this UC; not AWS authorization)
-
Document-level ACL = Amazon Quick S3 Knowledge Base (UC30; user/group-level)
-
Chunk-level permission filter = Custom Permission-Aware RAG (FC3; AD SID/NTFS ACL for regulated industries)
Web Search results are public information — no ACL filtering needed. However, the unified answer that combines internal + web sources is subject to the same access control as internal-only answers (the internal citations remain permission-scoped).
Vector Store: Why S3 Vectors
This pattern uses Amazon S3 Vectors as the Bedrock KB vector store. OpenSearch Serverless remains a valid option when its operational and latency profile fits the workload better.
| Criterion |
OpenSearch Serverless |
S3 Vectors |
| Minimum monthly cost |
~175ドル (2 OCU) |
Pay-per-use only |
| Cost at scale |
OCU-based |
Cost savings for large vector datasets (see AWS documentation) |
| Metadata filtering |
Supported |
Supported (department, owner, role) |
| Permission-Aware RAG compatibility |
Supported |
Compatible with metadata-filtered retrieval designs; authorization enforced by application layer |
| Infrastructure management |
Managed but OCU scaling required |
Managed vector operations |
| Scale |
Millions of vectors |
2 billion vectors per index |
| Query latency |
Sub-100ms |
Sub-100ms |
For this project — 28 industry patterns + PoC-to-production lifecycle — S3 Vectors' pay-per-use model is the right fit. We evaluated Bedrock Managed Knowledge Base (GA June 2026, AWS Summit NYC) but chose Custom KB + S3 Vectors for cost control, ACL metadata flexibility, and FSx for ONTAP lifecycle integration (see ADR: docs/investigations/managed-kb-vs-custom-kb-s3vectors.md).
Data Classification
| Output |
Classification |
Rationale |
| KB vectors + metadata |
INTERNAL |
Inherits source file classification |
| Ingestion job status / SNS |
INTERNAL |
Operational metadata only |
| CloudWatch Metrics / Logs |
INTERNAL |
Aggregate metrics, no file content |
| Web Search results |
PUBLIC |
External public information |
| Hybrid answer (internal + web) |
INTERNAL |
Contains internal document citations |
For regulated workloads (CUI / FISC / HIPAA), extend shared/data_classification.py labels. If retention-period requirements apply, use dataDeletionPolicy=RETAIN and design a separate purge procedure.
Cost
| Component |
Monthly estimate |
Notes |
| Lambda (sync + query) |
< 5ドル |
Serverless pay-per-use |
| S3 API (ListObjects, GetObject) |
< 1ドル |
S3 AP reads |
| EventBridge Scheduler |
< 1ドル |
15-min interval |
| Bedrock KB Ingestion |
Usage-based |
Per-document embedding |
| S3 Vectors |
Usage-based |
Compare with OpenSearch Serverless for your query volume, latency, and operations requirements |
| Bedrock LLM (query) |
Usage-based |
Nova Pro: 0ドル.0008/1K input tokens |
| FPolicy Server (Scenario C) |
~35ドル |
ECS Fargate (set desiredCount=0 when idle) |
| AgentCore Web Search (opt-in) |
Per-query pricing (see AgentCore pricing) |
Gateway invocation pricing |
| Cross-region transfer (opt-in) |
< 0ドル.02 |
us-east-1 ↔ ap-northeast-1 |
Getting Started
git clone https://github.com/Yoshiki0705/FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns.git
cd FSx-for-ONTAP-S3AccessPoints-Serverless-Patterns/solutions/genai/kb-selfservice-curation
# Install dependencies (shared modules used by Lambda handlers)
pip install -r requirements.txt # or: uv pip install -r requirements.txt
# Review parameters
cat samconfig.toml.example
# Build and deploy (requires configured AWS credentials + FSx for ONTAP S3 AP)
sam build && sam deploy --guided
# DemoMode=true runs without FSx for ONTAP (regular S3 bucket)
# Optional: Enable Web Search hybrid RAG
sam deploy --parameter-overrides \
EnableWebSearch=true \
AgentCoreGatewayId=<gateway-id> \
AgentCoreGatewayRegion=us-east-1
Governance Note
This article is technical architecture guidance, not legal, compliance, or regulatory advice. Pricing, regional availability, and benchmark numbers are time-sensitive; verify them against current AWS documentation before production use. S3 AP data source boundaries are at volume/prefix granularity — for per-user visibility control, consider Custom Permission-Aware RAG. If retention-period requirements (NARA / FISC) apply, use dataDeletionPolicy=RETAIN and design purge procedures separately. Web Search Tool usage requires compliance with the Acceptable Use Policy (source citations must be displayed).
Yoshiki Fujiwara