LLM Providers
InsightWorker is provider-agnostic. The agent's instructions and tool definitions are normalized into the Anthropic message shape, then translated to each provider's native format. Configure one provider explicitly, or let InsightWorker auto-detect it from whichever credentials are present.
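To make the translation step concrete, here is a minimal sketch that converts one Anthropic-style tool definition into the function-tool format used by OpenAI's chat-completions API. The helper name `anthropic_tool_to_openai` and the sample `run_query` tool are hypothetical, and InsightWorker's actual translation layer may work differently.

```python
from typing import Any


def anthropic_tool_to_openai(tool: dict[str, Any]) -> dict[str, Any]:
    """Convert an Anthropic-style tool definition to OpenAI's function-tool shape.

    Hypothetical helper for illustration; not InsightWorker's actual code.
    """
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # Anthropic names the JSON Schema "input_schema"; OpenAI names it "parameters".
            "parameters": tool["input_schema"],
        },
    }


# Hypothetical tool definition in the Anthropic shape.
run_query = {
    "name": "run_query",
    "description": "Run a read-only SQL query.",
    "input_schema": {
        "type": "object",
        "properties": {"sql": {"type": "string"}},
        "required": ["sql"],
    },
}

openai_tool = anthropic_tool_to_openai(run_query)
```

The JSON Schema itself passes through unchanged; only the wrapper keys differ. Providers with a stricter function-call schema may need extra clean-up of the schema before it is sent.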
Provider matrix
| Provider | Selector | Best for | Notes |
|---|---|---|---|
| AWS Bedrock | bedrock | Enterprise — uses your AWS account, no separate key | Recommended default for regulated industries |
| Anthropic API | anthropic | Direct access to latest Claude | When you don't have AWS Bedrock |
| OpenAI | openai | gpt-4o, o1 reasoning models | Tool calling works well; 128k context cap |
| Azure OpenAI | azure | Enterprise — prompts stay in your Azure tenant | Same OpenAI shape, routed through your endpoint |
| Google Gemini (AI Studio) | gemini | Long context (up to 2M tokens), cheapest per-token | Stricter function-call schema |
| Google Vertex AI | vertex | Enterprise — same Gemini models on GCP, IAM-governed, region-pinned | ADC or service-account auth |
| Custom (OpenAI-compatible) | custom | vLLM, Ollama, LM Studio, in-house GPU | Whatever endpoint speaks OpenAI's chat-completions API |
Auto-detection rules
When LLM_PROVIDER isn't set, InsightWorker picks the first provider whose credentials are present, checking in this order:
1. `vertex`: if `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` are set
2. `gemini`: if `GEMINI_API_KEY` is set
3. `azure`: if `AZURE_OPENAI_ENDPOINT` and `AZURE_OPENAI_API_KEY` are set
4. `custom`: if `CUSTOM_LLM_BASE_URL` is set
5. `openai`: if `OPENAI_API_KEY` is set
6. `anthropic`: if `ANTHROPIC_API_KEY` is set
7. `bedrock`: fallback (assumes an IAM role or `~/.aws/credentials`)
Set LLM_PROVIDER explicitly when you have multiple credentials configured and want deterministic selection.
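The detection order above can be sketched as a first-match scan over environment variables. This is an illustration of the documented rules, not InsightWorker's actual code: the names `DETECTION_ORDER` and `detect_provider` are hypothetical, and treating an empty value as "not set" is an assumption.

```python
import os
from collections.abc import Mapping

# (provider, environment variables that must all be set), in the documented order.
DETECTION_ORDER = [
    ("vertex", ("GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION")),
    ("gemini", ("GEMINI_API_KEY",)),
    ("azure", ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")),
    ("custom", ("CUSTOM_LLM_BASE_URL",)),
    ("openai", ("OPENAI_API_KEY",)),
    ("anthropic", ("ANTHROPIC_API_KEY",)),
]


def detect_provider(env: Mapping[str, str] = os.environ) -> str:
    """Return the explicit LLM_PROVIDER if set, else the first provider with credentials."""
    explicit = env.get("LLM_PROVIDER")
    if explicit:
        return explicit
    for provider, required in DETECTION_ORDER:
        # Assumption: an empty string counts as "not set".
        if all(env.get(name) for name in required):
            return provider
    # No credentials found: fall back to Bedrock (IAM role or ~/.aws/credentials).
    return "bedrock"
```

With both `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` present, this scan selects `openai` because it comes earlier in the list, which is the situation where setting LLM_PROVIDER explicitly is useful.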
Per-model context windows
The agent caps tool output to ~10% of the active model's context window so multi-turn loops don't blow past the limit.
| Model family | Context | Tool output cap |
|---|---|---|
| gpt-4o, gpt-4-turbo, gpt-4.1 | 128k | ~50k chars |
| o1, o3 (OpenAI reasoning) | 200k | ~80k chars |
| claude-sonnet-4-5/6, claude-opus-4-1, claude-haiku-4-5 | 200k | ~80k chars |
| claude-sonnet-4-5 (1M variant) | 1M | ~400k chars |
| gemini-2.0-flash | 1M | ~400k chars |
| gemini-2.5-pro, gemini-1.5-pro | 2M | ~800k chars |
| gpt-4 (legacy) | 8k | ~3k chars |
| Unknown / custom | 100k | ~40k chars |
Override with MAX_TOOL_OUTPUT_LENGTH if you want a hard cap regardless of model.
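The figures in the table are consistent with taking 10% of the context window in tokens and converting at roughly 4 characters per token. That ratio is inferred from the table (128k tokens gives ~50k chars) rather than stated anywhere in the docs, so treat the sketch below as an approximation. The function name `tool_output_cap` and its `override` parameter (standing in for MAX_TOOL_OUTPUT_LENGTH) are hypothetical.

```python
from typing import Optional

CHARS_PER_TOKEN = 4  # assumption: rough average, inferred from the table above
CAP_DIVISOR = 10     # cap is ~10% of the context window


def tool_output_cap(context_tokens: int, override: Optional[int] = None) -> int:
    """Return the tool-output cap in characters for a given context window.

    `override` stands in for MAX_TOOL_OUTPUT_LENGTH: when given, it wins
    regardless of the model's context size.
    """
    if override is not None:
        return override
    # Integer arithmetic avoids floating-point rounding surprises.
    return context_tokens * CHARS_PER_TOKEN // CAP_DIVISOR
```

For example, `tool_output_cap(128_000)` gives 51,200 characters (the table's "~50k"), and `tool_output_cap(200_000)` gives 80,000.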
How the agent describes itself
When asked "what model are you using", the agent reports the configured backend, not the underlying model identity. Example:
User: which LLM are you using?

Agent: I'm InsightWorker. Under the hood, I'm currently configured to use OpenAI with gpt-4o, but you can swap providers in your settings.
If the model leaks its pretrained identity ("I'm Claude"), update to the latest CLI, whose system prompt explicitly overrides the model's self-identification.
Recommended models per provider
AWS Bedrock (default)
- Sonnet 4.5: `us.anthropic.claude-sonnet-4-5-20250929-v1:0` (general apps, tool use)
- Opus 4.1: `us.anthropic.claude-opus-4-1-20250805-v1:0` (hard reasoning, complex translations)
- Haiku 4.5: `us.anthropic.claude-haiku-4-5-20251001-v1:0` (bulk classification, cheap calls)
Anthropic direct
- `claude-sonnet-4-5`: general
- `claude-sonnet-4-5`: complex reasoning, 1M context
- `claude-haiku-4-5`: cheap
OpenAI
- `gpt-4o`: general, tool use
- `gpt-4o-mini`: cheap
- `o1-preview` / `o1-mini`: hard reasoning
Gemini
- `gemini-2.0-flash-001`: general
- `gemini-2.5-pro`: long context, hard reasoning
- `gemini-2.0-flash-lite`: cheap classification
Switching providers
You can switch providers any time without re-installing — just update ~/.insightworker/.env and restart the agent. Conversation history isn't portable across providers (different message shapes), so the next session starts fresh.
In the Desktop app: Settings → LLM Provider tab.
Privacy / data residency
| Provider | Data path |
|---|---|
| Bedrock | Your AWS account, your region. Anthropic doesn't see your prompts. |
| Vertex | Your GCP project, your region. Google's enterprise data terms apply. |
| Azure | Your Azure tenant, your region. Microsoft's enterprise data terms apply. |
| Custom (self-hosted) | Whatever endpoint you point at. |
| Anthropic / OpenAI / Gemini direct APIs | Vendor's cloud. Read their data-use terms. |
For regulated workloads (insurance, healthcare, finance), the recommended providers are Bedrock, Vertex, Azure, or a self-hosted custom endpoint.
Source: docs/providers/overview.md in the public repo. Open a PR with corrections.
