Deploy InsightWorker Workers to Production

Promote an app from your dev workstation to a managed production worker — with central policy, signed skill versions, service-account credentials, and full audit. Three deployment modes cover scheduled jobs, event-triggered automations, and long-running agents.

Why deploy via Enterprise

An app on your laptop isn't a deployment.

The CLI is great for building and testing. Production is a different problem: who owns the credentials, which skill version is running, who can trigger it, who sees the audit trail when it does something. The Enterprise control plane gives you all of it from one place — so your workers are governed, not just running.

Pinned, signed skill versions

Every production worker pulls a specific version of a signed skill bundle on startup. No drift, no surprise behavior changes — and rollback is a single config update in the control plane.
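
The deployment examples on this page write a pin as name@version (for example weekly-sprint-digest@1.4.0). As a rough sketch of what "pinned" implies, the helper below accepts only an exact three-part version and rejects floating references such as latest. The names parse_skill_pin and SkillPin are hypothetical and are not part of the InsightWorker CLI; the naming rules in the regular expression are also an assumption.

```python
import re
from typing import NamedTuple


# Hypothetical helper for illustration only; not an InsightWorker API.
class SkillPin(NamedTuple):
    name: str
    version: str


# Exact name@MAJOR.MINOR.PATCH only: no ranges, no "latest".
# The allowed characters in the name are an assumption.
_PIN_RE = re.compile(r"^(?P<name>[a-z0-9][a-z0-9-]*)@(?P<version>\d+\.\d+\.\d+)$")


def parse_skill_pin(spec: str) -> SkillPin:
    """Split 'name@1.2.3' into its parts, or raise if it is not an exact pin."""
    match = _PIN_RE.match(spec)
    if match is None:
        raise ValueError(f"not an exact skill pin: {spec!r}")
    return SkillPin(match["name"], match["version"])
```

Rejecting anything that is not an exact version is what keeps two starts of the same deployment from silently running different code.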

Service-account credentials

Each deployment gets its own scoped Personal Access Token. Runs are attributed to the service account, not a human. Revoke from one place, instantly.

Per-environment policy

Dev workers can use any model and any tool. Production workers inherit a stricter policy (model allowlist, tool blocklist, token quotas, off-hours windows), enforced server-side on every policy fetch.
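
To make the four policy dimensions concrete, here is a minimal sketch of a policy check. Everything in it is an assumption for illustration: the Policy fields, the violations function, and in particular the reading of an off-hours window as a period in which runs are refused. The real policy bundle format is not described on this page.

```python
from dataclasses import dataclass
from datetime import time
from typing import FrozenSet, List


# Hypothetical policy shape; field names are illustrative assumptions.
@dataclass(frozen=True)
class Policy:
    allowed_models: FrozenSet[str]
    blocked_tools: FrozenSet[str]
    daily_token_quota: int
    blocked_from: time   # start of the window in which runs are refused
    blocked_until: time  # end of that window (may wrap past midnight)


def in_window(now: time, start: time, end: time) -> bool:
    """True if now falls in [start, end), handling windows that cross midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def violations(policy: Policy, model: str, tool: str,
               tokens_used_today: int, now: time) -> List[str]:
    """Return every rule the requested run would break (empty list = allowed)."""
    problems = []
    if model not in policy.allowed_models:
        problems.append("model not on allowlist")
    if tool in policy.blocked_tools:
        problems.append("tool is blocked")
    if tokens_used_today >= policy.daily_token_quota:
        problems.append("token quota exhausted")
    if in_window(now, policy.blocked_from, policy.blocked_until):
        problems.append("inside blocked window")
    return problems
```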

Approval policy for unattended runs

An unattended production run can't pause and wait for someone to type /app done. Define rules once: auto-approve read-only tools, escalate writes to Slack or PagerDuty, hard-block destructive ones.
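
The three-way rule above could be sketched as a small decision function. The tool names, the TOOL_EFFECTS table, and the choice to block unknown tools are all hypothetical; how InsightWorker actually classifies tools is not covered here.

```python
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    ESCALATE = "escalate"   # e.g. ask a human in Slack or PagerDuty
    BLOCK = "block"


# Hypothetical classification of tools by their side effects.
TOOL_EFFECTS = {
    "jira.read": "read",
    "sql.select": "read",
    "jira.update": "write",
    "sql.drop_table": "destructive",
}


def decide(tool: str) -> Decision:
    """Map a tool call to an approval decision; unknown tools fail closed."""
    effect = TOOL_EFFECTS.get(tool)
    if effect == "read":
        return Decision.AUTO_APPROVE
    if effect == "write":
        return Decision.ESCALATE
    # Destructive tools and anything unclassified are blocked outright.
    return Decision.BLOCK
```

Failing closed on unclassified tools is a design choice of this sketch: a newly added tool stays blocked until someone decides which bucket it belongs in.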

Usage & cost attribution

Every production run shows up in the same Dashboard / Usage / Activity views as developer runs — filterable by deployment, team, or skill. No separate observability stack.

Full audit trail

Every prompt, tool call, model response, and approval decision is logged with the deployment ID, signed skill version, and policy bundle signature in force at that moment. Auditable end-to-end.
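
As an illustration of what one audit entry might carry, the sketch below models a record with the three identifiers named above and serializes it as one JSON line. The AuditRecord class, its field names, and the JSON-lines format are assumptions, not the product's actual log schema.

```python
import json
from dataclasses import asdict, dataclass


# Hypothetical record shape; the real audit schema may differ.
@dataclass(frozen=True)
class AuditRecord:
    deployment_id: str
    skill_version: str      # signed skill version in force for the run
    policy_signature: str   # signature of the policy bundle in force
    event_type: str         # "prompt", "tool_call", "model_response", or "approval"
    payload: dict


def to_log_line(record: AuditRecord) -> str:
    """Serialize one record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)
```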

How it fits together

Workers run in your environment. Control runs in ours (or yours).

A production worker is just a containerized InsightWorker runtime configured to run unattended. On startup it fetches its effective policy bundle and pinned skill version from the control plane. It reports usage back. Triggers come from your existing infrastructure: cron, queue, webhook, or chat channel.

┌──────────────────────────────────────────────────────────────────────┐
│                InsightWorker Enterprise Control Plane                │
│                                                                      │
│ policy · skills marketplace · deployments · audit · usage analytics  │
└──────────┬─────────────────────────────┬─────────────────────────────┘
           │ policy + skill bundle       │ usage events + audit
           │ (signed, ETag-cached)       │ (run-id, tokens, cost)
           ▼                             ▲
┌──────────────────────────────────────────────────────────────────────┐
│                Production Workers (your environment)                 │
│                                                                      │
│   ┌──────────┐    ┌──────────┐    ┌──────────────┐                   │
│   │ Scheduled│    │ Event    │    │ Long-running │                   │
│   │ job      │    │ triggered│    │ daemon       │                   │
│   │ (cron)   │    │ (webhook)│    │ (Slack bot)  │                   │
│   └────┬─────┘    └────┬─────┘    └──────┬───────┘                   │
│        └───────────────┴─────────────────┘                           │
│                        │                                             │
│              insightworker worker                                    │
│                        │                                             │
│       ┌────────────────┼────────────────────┐                        │
│       ▼                ▼                    ▼                        │
│   your LLM         your data           your tools                    │
│   (Bedrock/        (SharePoint/        (JIRA/SQL/                    │
│   OpenAI/...)      S3/DBs/...)         shell/...)                    │
└──────────────────────────────────────────────────────────────────────┘
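
The diagram labels the policy download as "ETag-cached". One way such caching can work is a conditional fetch: the worker sends the ETag it already holds, and a 304 response means the cached bundle is still current. The sketch below shows only that logic. PolicyCache and the transport callable are hypothetical stand-ins for a real HTTP client, and signature verification is deliberately left out.

```python
from typing import Callable, Optional, Tuple

# transport(etag) -> (status, new_etag, body). It stands in for an HTTP GET
# that sends an If-None-Match header whenever etag is not None.
Transport = Callable[[Optional[str]], Tuple[int, Optional[str], Optional[bytes]]]


class PolicyCache:
    """Hypothetical client-side cache for a policy bundle, keyed by ETag."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        self._etag: Optional[str] = None
        self._body: Optional[bytes] = None

    def fetch(self) -> bytes:
        status, etag, body = self._transport(self._etag)
        if status == 304 and self._body is not None:
            # Server confirmed our copy is current; reuse it.
            return self._body
        if status == 200 and body is not None:
            # New or changed bundle: remember it with its ETag.
            self._etag, self._body = etag, body
            return body
        raise RuntimeError(f"policy fetch failed with status {status}")
```

In this sketch a revoked credential would surface as a non-200, non-304 status, so the worker stops instead of continuing on a stale bundle.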

Pick your trigger

Three deployment modes cover almost everything

Same worker runtime, same skill bundles, same control plane. The only thing that changes is what wakes the worker up.

Mode 1 · Scheduled

Run on a clock

A container that runs insightworker worker on a schedule, executes one app, reports usage, exits. Best for batch and recurring work.

Triggered by

Kubernetes CronJob
Airflow DAG / Dagster job
AWS EventBridge / GCP Cloud Scheduler
System cron

Fits best for

Weekly sprint digests
Nightly broker submission OCR batches
Daily SOV reconciliation
Quarterly compliance reports

# k8s CronJob — runs the weekly sprint digest every Monday at 9am
apiVersion: batch/v1
kind: CronJob
metadata:
  name: iw-weekly-digest
spec:
  schedule: "0 9 * * 1"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: iw-worker
            image: ghcr.io/verticalserve/insightworker-worker:2.1.11
            args: ["worker", "--skill=weekly-sprint-digest@1.4.0"]
            env:
            - name: IW_API_URL
              value: https://iw.your-org.com
            - name: IW_PAT
              valueFrom: { secretKeyRef: { name: iw-pat, key: token } }
          restartPolicy: OnFailure

Mode 2 · Event-triggered

Run when something happens

A container that starts on an event (webhook, queue message, file drop), runs one app, posts back, exits. Best for fast-reaction automations.

Triggered by

Webhook (PagerDuty, JIRA, GitHub, Slack)
SQS / Pub/Sub / Kafka message
S3 / GCS file drop notification
Inbound email (SendGrid / SES inbound)

Fits best for

L1 incident response from PagerDuty
JIRA ticket triage on creation
New broker submission auto-processing
Inbound claims FNOL email intake

# Cloud Run service — handles a PagerDuty webhook
# PagerDuty fires → Cloud Run starts container → runs L1 skill → exits

$ gcloud run deploy iw-l1-responder \
    --image=ghcr.io/verticalserve/insightworker-worker:2.1.11 \
    --args="worker,--mode=webhook,--skill=l1-incident-response@2.1.0" \
    --set-env-vars=IW_API_URL=https://iw.your-org.com \
    --set-secrets=IW_PAT=iw-l1-pat:latest \
    --region=us-east1 --no-allow-unauthenticated

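# Then point PagerDuty at the Cloud Run URL:
# https://iw-l1-responder-xxx.run.app/webhook
# Note: --no-allow-unauthenticated makes Cloud Run reject callers that do not
# present a Google-signed identity token, so PagerDuty needs an authenticating
# gateway or proxy in front of this URL.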

# Worker:
#   1. Reads the PagerDuty payload (incident.triggered)
#   2. Pulls effective policy + l1-incident-response@2.1.0 skill bundle
#   3. Runs the app: ack alert → tail logs → pull dashboards → draft update
#   4. Posts the update to PagerDuty notes + #incidents Slack channel
#   5. Reports usage event back to the control plane and exits
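
Step 1 in the comments above (reading the payload and reacting only to incident.triggered) might look like the sketch below. It assumes a payload shaped like {"event": {"event_type": ..., "data": {"id": ...}}}; treat that shape and the function name as assumptions and check the webhook provider's schema before relying on them.

```python
import json
from typing import Optional


def incident_id_if_triggered(raw_body: bytes) -> Optional[str]:
    """Return the incident id for 'incident.triggered' events, else None.

    The payload layout is an assumption made for this sketch.
    """
    event = json.loads(raw_body).get("event", {})
    if event.get("event_type") != "incident.triggered":
        # Acknowledgements, resolutions, and so on are ignored here.
        return None
    return event.get("data", {}).get("id")
```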

Mode 3 · Long-running daemon

Stay listening

A persistent container connected to a chat channel or a queue, handling many requests over a session. Best for conversational agents.

Triggered by

Slack / MS Teams socket connection
Long-polling a ticket queue
WebSocket from your internal app
Telegram / WhatsApp business bot

Fits best for

L1 support agent in #support Slack
Underwriting Q&A bot for brokers
DevOps runbook assistant in #ops
HR onboarding bot in Teams

# docker-compose.yml — Slack-attached support agent
services:
  iw-support-bot:
    image: ghcr.io/verticalserve/insightworker-worker:2.1.11
    command: ["worker", "--mode=slack", "--channel=#support",
              "--skill=l1-support-agent@3.0.0"]
    restart: always
    environment:
      IW_API_URL: https://iw.your-org.com
      IW_PAT: ${IW_SUPPORT_PAT}
      SLACK_BOT_TOKEN: ${SLACK_BOT_TOKEN}
      SLACK_APP_TOKEN: ${SLACK_APP_TOKEN}
    healthcheck:
      test: ["CMD", "insightworker", "healthcheck"]
      interval: 30s

# Bot listens on #support, threads each user request as its own run,
# reports each run as a usage event with thread_id as the correlation key.
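
The thread-to-run mapping described in those comments can be pictured as a small registry: the first message in a thread opens a run, and later messages in the same thread reuse it. RunRegistry, usage_event, and the event fields are hypothetical, and treating follow-up messages as part of the same run is an assumption of this sketch.

```python
import uuid
from typing import Dict


class RunRegistry:
    """Hypothetical in-memory map from chat thread id to run id."""

    def __init__(self) -> None:
        self._runs: Dict[str, str] = {}

    def run_for(self, thread_id: str) -> str:
        # First message in a thread starts a new run; later ones reuse it.
        if thread_id not in self._runs:
            self._runs[thread_id] = uuid.uuid4().hex
        return self._runs[thread_id]


def usage_event(registry: RunRegistry, thread_id: str, tokens: int) -> dict:
    """Build a usage event that carries thread_id as the correlation key."""
    return {
        "run_id": registry.run_for(thread_id),
        "thread_id": thread_id,
        "tokens": tokens,
    }
```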

Centralized management

One dashboard for every deployment

Once a worker is deployed, you manage it from the Enterprise UI alongside your dev devices and human users. Same audit, same usage analytics, same policy editor.

Deployments page

Lists every production worker with its skill version, environment, last-run status, and error rate. One row per deployment, with a sparkline.

Skill version pinning

Each deployment points at one signed skill version. Promote v2.1.0 → v2.2.0 with an audit-logged config change; if it misbehaves, click "rollback to last good".
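
A minimal model of promote and rollback, assuming "last good" simply means the version that was running before the most recent promotion. The Deployment class and its audit tuples are illustrative only; the control plane's real data model is not shown on this page.

```python
from typing import List, Optional, Tuple


class Deployment:
    """Hypothetical deployment record with one pinned skill version."""

    def __init__(self, name: str, version: str) -> None:
        self.name = name
        self.version = version
        self.last_good: Optional[str] = None
        # Each entry: (action, from_version, to_version, actor)
        self.audit: List[Tuple[str, str, str, str]] = []

    def promote(self, new_version: str, actor: str) -> None:
        self.audit.append(("promote", self.version, new_version, actor))
        self.last_good, self.version = self.version, new_version

    def rollback(self, actor: str) -> None:
        if self.last_good is None:
            raise RuntimeError("no earlier version to roll back to")
        self.audit.append(("rollback", self.version, self.last_good, actor))
        self.version, self.last_good = self.last_good, None
```

Both operations append to the audit list before changing the pin, so every version change has a matching entry that names who made it.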

Alerts on failure and drift

Notify on error spikes, quota breaches, or unexpected model usage. Email, Slack, or PagerDuty — your channel of choice.
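
An error-spike rule can be as simple as a failure-rate threshold over recent runs. The sketch below is one possible rule, not the product's alerting logic; the function name, the 20% threshold, and the ten-run minimum are arbitrary assumptions.

```python
from typing import Sequence


def should_alert(recent_outcomes: Sequence[bool],
                 threshold: float = 0.2,
                 min_runs: int = 10) -> bool:
    """Alert when the failure rate over recent runs exceeds the threshold.

    recent_outcomes holds True for a successful run and False for a failure.
    Below min_runs there is too little data, so no alert is raised.
    """
    if len(recent_outcomes) < min_runs:
        return False
    failures = sum(1 for ok in recent_outcomes if not ok)
    return failures / len(recent_outcomes) > threshold
```

The minimum-run guard keeps a single failed run on a brand-new deployment from paging anyone.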

Per-deployment usage

Filter the Usage and Activity views by deployment. See tokens, cost, top skills, error mix, and recent runs for any worker — same UI your team already uses.
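
The per-deployment view amounts to grouping usage events by deployment and summing them. A small sketch, assuming each event is a dict with deployment, tokens, and cost_usd keys (hypothetical field names, not the documented event schema):

```python
from collections import defaultdict
from typing import Dict, Iterable


def usage_by_deployment(events: Iterable[dict]) -> Dict[str, dict]:
    """Sum runs, tokens, and cost for each deployment."""
    totals = defaultdict(lambda: {"runs": 0, "tokens": 0, "cost_usd": 0.0})
    for event in events:
        bucket = totals[event["deployment"]]
        bucket["runs"] += 1
        bucket["tokens"] += event["tokens"]
        bucket["cost_usd"] += event["cost_usd"]
    return dict(totals)
```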

Approval policy editor

Define which tool calls run unattended and which escalate. "Read-only tools auto-approve, writes to prod systems escalate to #ops, destructive operations hard-block."

Service-account credentials

One scoped PAT per deployment, rotated on a schedule. Revoke from the Credentials page; the worker is locked out on its next policy fetch.

From dev to prod

A typical rollout — five steps

# 1. Build the app on your laptop (CLI / Desktop / VS Code)
$ insightworker /app create "process new broker submissions in inbox folder"
$ insightworker /app run process-broker-submission

# 2. When it works end-to-end, package it as a skill
$ insightworker /app publish broker-submission-triage

# 3. Submit to the Enterprise marketplace for approval (admin signs it)
#    Publishes via `iw app publish` to the configured S3 bucket (IW_APP_STORE)

# 4. Create the deployment in the Enterprise UI
#    Settings → Deployments → New → mode=Event-triggered, skill=broker-submission-triage@1.0.0
#    Pick environment (prod), service account, approval policy, alert channels.

# 5. Point your trigger at the deployment endpoint and watch runs flow in
#    Dashboard → Usage → filter by deployment → see tokens, cost, runs in real time.

Ready to put a worker in production?

Production deployment, central management, signed skill versions, and full audit are part of InsightWorker Enterprise. Start a conversation — we can have your first deployment running in a week.
