InsightWorker Logo
  • contact@verticalserve.com
Docs / LLM providers / AWS Bedrock

AWS Bedrock

InsightWorker's recommended provider for enterprise customers — your prompts stay in your AWS account, you use existing IAM, and there's no separate vendor relationship to procure.

For auth setup, see one of:

This page covers model selection, region availability, and Bedrock-specific behavior.

Configuration

~/.insightworker/.env:

LLM_PROVIDER=bedrock
AWS_REGION=us-east-1
BEDROCK_MODEL=us.anthropic.claude-sonnet-4-5-20250929-v1:0

# Hybrid model routing (used by some skills for cost optimization):
BEDROCK_MODEL_FAST=us.anthropic.claude-haiku-4-5-20251001-v1:0
BEDROCK_MODEL_STRONG=us.anthropic.claude-opus-4-1-20250805-v1:0

Recommended models

These IDs are Bedrock cross-region inference profiles (the us.anthropic.* form). They route automatically across multiple regions for higher availability.

Use caseModelNotes
Default agent driverus.anthropic.claude-sonnet-4-5-20250929-v1:0Recommended for most apps
Heavy reasoning, complex translationus.anthropic.claude-opus-4-1-20250805-v1:0Slower, more capable
Cheap classification, bulkus.anthropic.claude-haiku-4-5-20251001-v1:0Fast, cheap
Latestcheck aws bedrock list-foundation-models for currents

Verifying available models

aws bedrock list-foundation-models \
  --region us-east-1 \
  --by-provider anthropic \
  --query 'modelSummaries[?modelLifecycle.status==`ACTIVE`].modelId'

This returns the foundation model IDs in your region. The us.anthropic.* inference profiles you actually use in BEDROCK_MODEL aren't listed there — they're a separate API:

aws bedrock list-inference-profiles \
  --region us-east-1

If a model shows in list-foundation-models but not in list-inference-profiles, you can still call it directly with the foundation model ID — but cross-region routing won't help if your home region runs out of capacity.

Region availability

Anthropic models on Bedrock are typically available in:

  • us-east-1 (Virginia) — broadest model coverage
  • us-west-2 (Oregon) — most models
  • eu-central-1 (Frankfurt) — Sonnet, Haiku
  • ap-northeast-1 (Tokyo) — Sonnet
  • ap-southeast-2 (Sydney) — limited

The us.anthropic.* cross-region inference profiles are only valid for US-region calls. For EU/APAC, use the bare anthropic.claude-... foundation model IDs without the us. prefix.

For non-US deployments your BEDROCK_MODEL would look like:

AWS_REGION=eu-central-1
BEDROCK_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0   # no us. prefix

Context windows

Bedrock-hosted Claude has the same context windows as Anthropic-direct:

Model familyContext
claude-sonnet-4-5, claude-sonnet-4-5200k tokens
claude-opus-4-1, claude-sonnet-4-5200k (1M for the 1M variant)
claude-haiku-4-5200k

InsightWorker caps tool output to ~10% of the context window — see permissions-and-safety/loop-detector.md for the cap math, or override with MAX_TOOL_OUTPUT_LENGTH.

VPC endpoint (PrivateLink) for regulated tenants

If your network team requires AWS traffic to stay off the public internet, set up a Bedrock VPC interface endpoint:

  1. VPC consoleEndpointsCreate endpoint
  2. Service name: com.amazonaws.<region>.bedrock-runtime
  3. Enable Private DNS
  4. Attach to the subnets where InsightWorker runs

With Private DNS enabled, the standard hostname (bedrock-runtime.us-east-1.amazonaws.com) resolves to the private endpoint when accessed from inside the VPC. InsightWorker requires no code change — the AWS SDK follows DNS.

For audit completeness, also enable VPC Flow Logs and CloudTrail data events on the endpoint.

Bedrock Guardrails (content filtering)

Bedrock supports server-side guardrails (PII redaction, content categories, denied topics). These are configured in the AWS console as a separate Guardrail resource and applied to model calls via guardrailIdentifier parameter.

InsightWorker doesn't currently set guardrailIdentifier on calls. If you have a Guardrail you want enforced, two options:

  1. Provider-side: configure the Guardrail's default model-association so it's enforced at the model resource level (no client change needed).
  2. Wait for the InsightWorker feature: passing a guardrail ID via env var is on the roadmap. File an issue if you'd like it prioritized.

Cost considerations

Bedrock prices match Anthropic-direct API pricing within a small margin. As of writing:

ModelInputOutput
Sonnet 4.5$3/M tokens$15/M tokens
Opus 4.1$15/M tokens$75/M tokens
Haiku 4.5$0.25/M tokens$1.25/M tokens

Verify against the Bedrock pricing page — these change.

For agent apps, the typical cost-driver is tool output volume, not the LLM call itself. Cap tool output aggressively (MAX_TOOL_OUTPUT_LENGTH=20000 for chatty apps).

Common gotchas

SymptomCauseFix
ResourceNotFoundException: model not foundWrong model IDRun aws bedrock list-inference-profiles --region <region> to see what's actually available
AccessDeniedException: bedrock:InvokeModelRole missing Bedrock permissionSee policy in aws-sso.md
ValidationException: model not enabledModel access not requested in the AWS consoleBedrock → Model access → request access for the Anthropic family in your region
Slow first call (5+s)Cold startSubsequent calls are ~1-2s; for latency-sensitive apps pre-warm with a tiny ping at daemon startup
Throttling under loadRegion capacityUse cross-region inference profiles (the us.anthropic.* form) — they failover automatically

See also


Source: docs/providers/bedrock.md in the public repo. Open a PR with corrections.