Google Cloud — Vertex AI auth

InsightWorker can call Google's Gemini models in two ways: through Vertex AI in your own GCP project (recommended for enterprise use, because data stays in your project and access is governed by IAM), or through Google AI Studio (simpler, needing only an API key, but served from the consumer-tier endpoint).

This page covers Vertex AI auth. For AI Studio, see providers/gemini-ai-studio.md.

Two auth paths

Path	When to use	Setup
Application Default Credentials (ADC) via `gcloud auth application-default login`	Developer machine	One command
Service account JSON key	CI / unattended servers	Generate JSON, set env var

Path 1 — ADC (developer machine)

Install the Google Cloud SDK if you haven't already, then run:

gcloud auth application-default login

A browser window opens. After you sign in, the ADC credentials are cached at ~/.config/gcloud/application_default_credentials.json. Client libraries refresh the access token silently before it expires.

~/.insightworker/.env:

LLM_PROVIDER=vertex
GOOGLE_CLOUD_PROJECT=your-gcp-project-id
GOOGLE_CLOUD_LOCATION=us-central1
VERTEX_MODEL=gemini-2.0-flash-001

That's it — the Vertex SDK picks up ADC automatically.
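Before starting InsightWorker, it can help to confirm that the file defines every variable listed above. The following is a minimal Python sketch, not part of InsightWorker: `parse_env` and `missing_vertex_keys` are hypothetical helpers, and the parser only understands plain KEY=VALUE lines, so InsightWorker's own .env loader may accept syntax this one does not.

```python
# Variables this page lists for the ADC path.
REQUIRED_KEYS = (
    "LLM_PROVIDER",
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "VERTEX_MODEL",
)


def parse_env(text: str) -> dict:
    """Parse plain KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_vertex_keys(env: dict) -> list:
    """Return the required keys that are absent or empty, in a stable order."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Example: a file that forgot VERTEX_MODEL.
sample = """LLM_PROVIDER=vertex
GOOGLE_CLOUD_PROJECT=your-gcp-project-id
GOOGLE_CLOUD_LOCATION=us-central1
"""
print(missing_vertex_keys(parse_env(sample)))  # ['VERTEX_MODEL']
```

To check the real file, read ~/.insightworker/.env and pass its text to `parse_env` instead of the sample string.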

Path 2 — Service account JSON (unattended)

For server / CI environments where there's no browser flow:

GCP Console → IAM & Admin → Service Accounts → Create Service Account
Name: insightworker-vertex
Grant role: Vertex AI User (roles/aiplatform.user)
Keys tab → Add Key → JSON → download
Move the JSON to a safe location on your server (e.g. /etc/insightworker/vertex-sa.json)
Restrict the file to owner read/write: chmod 600 /etc/insightworker/vertex-sa.json

~/.insightworker/.env:

LLM_PROVIDER=vertex
GOOGLE_CLOUD_PROJECT=your-gcp-project-id
GOOGLE_CLOUD_LOCATION=us-central1
GOOGLE_APPLICATION_CREDENTIALS=/etc/insightworker/vertex-sa.json
VERTEX_MODEL=gemini-2.0-flash-001

The Google SDK reads GOOGLE_APPLICATION_CREDENTIALS and authenticates as the service account.
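Most failures on this path come from the key file itself: a relative path, loose permissions, or the wrong kind of JSON. As a rough pre-flight check, the sketch below inspects the file that GOOGLE_APPLICATION_CREDENTIALS points to. `check_service_account_key` is a hypothetical helper, not an InsightWorker or Google API, and the permission test assumes a POSIX system.

```python
import json
import os
import stat


def check_service_account_key(path: str) -> list:
    """Return a list of problems found with a service account key file."""
    problems = []
    if not os.path.isabs(path):
        problems.append("path is not absolute")
    if not os.path.isfile(path):
        problems.append("file does not exist")
        return problems
    # After chmod 600, no group/other permission bits should be set.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        problems.append(f"permissions are {oct(mode)}, expected 0o600")
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError) as exc:
        problems.append(f"cannot read JSON: {exc}")
        return problems
    # Downloaded service account keys carry this type marker.
    if not isinstance(data, dict) or data.get("type") != "service_account":
        problems.append('"type" is not "service_account"')
    return problems


if __name__ == "__main__":
    key_path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    if not key_path:
        print("GOOGLE_APPLICATION_CREDENTIALS is not set")
    else:
        print(check_service_account_key(key_path) or "key file looks usable")
```

An empty list means the file passed these local checks only. It does not prove that the key is still valid or that the role has been granted.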

Required IAM role

roles/aiplatform.user is the minimum predefined role. Specifically, the principal needs these permissions:

aiplatform.endpoints.predict
aiplatform.models.predict
aiplatform.locations.list

The predefined roles/aiplatform.user role covers all of these. For a tighter scope, create a custom role with only those three permissions.
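One way to express such a custom role is a definition file passed to `gcloud iam roles create --file`. The sketch below is an illustration rather than a tested recipe: it prints a definition as JSON using the IAM role field names. The title and description are hypothetical, and the permission names are copied from the list above, so confirm them against the current IAM permissions reference before creating the role.

```python
import json

# Permission names copied from this page; verify them before use.
PERMISSIONS = [
    "aiplatform.endpoints.predict",
    "aiplatform.models.predict",
    "aiplatform.locations.list",
]


def build_role_definition(permissions: list) -> dict:
    """Build a custom role body; duplicates are dropped and names sorted."""
    return {
        "title": "InsightWorker Vertex Predict",  # hypothetical title
        "description": "Minimal permissions for InsightWorker Vertex AI calls",
        "stage": "GA",
        "includedPermissions": sorted(set(permissions)),
    }


if __name__ == "__main__":
    # Redirect this output to a file to use it with gcloud.
    print(json.dumps(build_role_definition(PERMISSIONS), indent=2))
```

Saved to a file such as insightworker-role.json, the output could then be supplied as `gcloud iam roles create insightworkerVertex --project=your-gcp-project-id --file=insightworker-role.json`. The role ID and filename here are placeholders.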

Region selection

Gemini models are not available in every Vertex region. Common choices:

us-central1 — most models, lowest latency from US East/Central
us-east5 — newer; some models only
europe-west4 — Europe-based teams
asia-northeast1 — Tokyo

Check the Vertex AI locations documentation to see which models are available in which regions.
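The region matters because it appears in both the hostname and the resource path of every request. As an illustration, this sketch builds the REST URL that a generateContent call for a Google-published model targets in a regional location. The function name is hypothetical, and this page does not say whether InsightWorker builds URLs itself or leaves that to the SDK.

```python
def vertex_generate_url(project: str, location: str, model: str) -> str:
    """Build the regional Vertex AI REST URL for a generateContent call."""
    for name, value in (("project", project), ("location", location), ("model", model)):
        if not value or "/" in value:
            raise ValueError(f"invalid {name}: {value!r}")
    # Regional endpoints prefix the API hostname with the location.
    host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}:generateContent"
    )


print(vertex_generate_url("your-gcp-project-id", "us-central1", "gemini-2.0-flash-001"))
```

Changing GOOGLE_CLOUD_LOCATION therefore changes which regional endpoint receives the request, which is why a model that is missing from that region fails even when credentials are correct.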

Verify

In the chat:

Which LLM provider and model are you using?

The agent should respond with Google Vertex AI and the model you set in VERTEX_MODEL.

A real query test:

Use perplexity_search to find any news on Apple. Then summarize.

The agent should run the tool and synthesize the answer using your Vertex Gemini model. Check the latency in the response: Vertex calls usually take 1-3 s. If you see 10 s or more, your region/model combination may be cold-starting.

Common gotchas

Symptom	Cause	Fix
`Could not load default credentials`	ADC not run or JSON path wrong	Re-run `gcloud auth application-default login`, or check `GOOGLE_APPLICATION_CREDENTIALS` path is absolute and readable
`Permission denied: aiplatform.endpoints.predict`	Role not granted	Add `roles/aiplatform.user` to the principal
`Model not found in location`	Region / model mismatch	Check the locations doc; pick a region where your model is GA
Works locally, fails in CI	ADC isn't on the CI runner	Use service account JSON path instead of ADC
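If you collect InsightWorker logs centrally, the table above can be turned into a small lookup for triage. This is a sketch only: `suggest_fix` is a hypothetical helper, and matching on lowercase substrings is an assumption, since the exact wording of SDK errors can vary between versions.

```python
from typing import Optional

# Error fragments and fixes transcribed from the table above.
KNOWN_ISSUES = [
    (
        "could not load default credentials",
        "Re-run `gcloud auth application-default login`, or check that "
        "GOOGLE_APPLICATION_CREDENTIALS is an absolute, readable path.",
    ),
    (
        "permission denied",
        "Add roles/aiplatform.user to the principal.",
    ),
    (
        "model not found",
        "Check the locations doc and pick a region where your model is GA.",
    ),
]


def suggest_fix(error_message: str) -> Optional[str]:
    """Return the fix for the first known fragment in the message, else None."""
    lowered = error_message.lower()
    for fragment, fix in KNOWN_ISSUES:
        if fragment in lowered:
            return fix
    return None


print(suggest_fix("Permission denied: aiplatform.endpoints.predict"))
```

A result of None means the message matched none of the listed symptoms, not that the setup is healthy.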

VPC Service Controls (regulated environments)

If your GCP project has VPC Service Controls enabled, make sure the perimeter allows InsightWorker's traffic to aiplatform.googleapis.com. If requests are blocked, add the InsightWorker service account to the access policy.