Google Cloud — Vertex AI auth
InsightWorker can call Google's Gemini models through Vertex AI in your GCP project (recommended for enterprise: data stays in your project, IAM-governed) or through Google AI Studio (simpler, just an API key, but consumer-tier endpoint).
This page covers Vertex AI auth. For AI Studio, see providers/gemini-ai-studio.md.
Two auth paths
| Path | When to use | Setup |
|---|---|---|
| Application Default Credentials (ADC) via gcloud auth application-default login | Developer machine | One command |
| Service account JSON key | CI / unattended servers | Generate JSON, set env var |
Path 1 — ADC (developer machine)
Install the Google Cloud SDK if you haven't already, then run:
gcloud auth application-default login
A browser opens and you sign in; the ADC credentials are then cached at ~/.config/gcloud/application_default_credentials.json. The access token is refreshed silently before it expires.
~/.insightworker/.env:
LLM_PROVIDER=vertex
GOOGLE_CLOUD_PROJECT=your-gcp-project-id
GOOGLE_CLOUD_LOCATION=us-central1
VERTEX_MODEL=gemini-2.0-flash-001
That's it — the Vertex SDK picks up ADC automatically.
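Before starting the app, it can help to confirm that the four settings above are actually present. The following is a minimal sketch, not InsightWorker's own loader: `parse_env` and `check_vertex_env` are hypothetical helper names, and the parsing rules (plain KEY=VALUE lines, `#` comments) are an assumption about the .env format.

```python
# Hypothetical pre-flight check for the Vertex settings shown above.
# This is NOT InsightWorker's loader; it only assumes plain KEY=VALUE lines.
REQUIRED_KEYS = (
    "LLM_PROVIDER",
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "VERTEX_MODEL",
)


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_vertex_env(values):
    """Return a list of problems; an empty list means the settings look complete."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not values.get(key)]
    provider = values.get("LLM_PROVIDER")
    if provider and provider != "vertex":
        problems.append(f"LLM_PROVIDER is {provider!r}, expected 'vertex'")
    return problems


SAMPLE = """\
LLM_PROVIDER=vertex
GOOGLE_CLOUD_PROJECT=your-gcp-project-id
GOOGLE_CLOUD_LOCATION=us-central1
VERTEX_MODEL=gemini-2.0-flash-001
"""

if __name__ == "__main__":
    print(check_vertex_env(parse_env(SAMPLE)) or "settings look complete")
```

To check a real file, read ~/.insightworker/.env and pass its text to `parse_env` in place of `SAMPLE`.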
Path 2 — Service account JSON (unattended)
For server / CI environments where there's no browser flow:
- GCP Console → IAM & Admin → Service Accounts → Create Service Account
- Name: insightworker-vertex
- Grant role: Vertex AI User (roles/aiplatform.user)
- Keys tab → Add Key → JSON → download
- Move the JSON to a safe location on your server (e.g. /etc/insightworker/vertex-sa.json)
- chmod 600 it
~/.insightworker/.env:
LLM_PROVIDER=vertex
GOOGLE_CLOUD_PROJECT=your-gcp-project-id
GOOGLE_CLOUD_LOCATION=us-central1
GOOGLE_APPLICATION_CREDENTIALS=/etc/insightworker/vertex-sa.json
VERTEX_MODEL=gemini-2.0-flash-001
The Google SDK reads GOOGLE_APPLICATION_CREDENTIALS and authenticates as the service account.
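A quick way to catch a bad key path before the SDK does is to inspect the file directly. This is a rough sketch using only the standard library; `check_key_file` is a hypothetical helper, and the only key-file detail it relies on is that a service account key is JSON with `"type": "service_account"`. The permission check applies to POSIX systems only.

```python
import json
import os
import stat


def check_key_file(path):
    """Return a list of problems with a service account key file (empty = looks fine)."""
    problems = []
    if not os.path.isabs(path):
        problems.append("path is not absolute")
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError) as exc:  # unreadable file or invalid JSON
        return problems + [f"cannot read key file: {exc}"]
    if not isinstance(data, dict) or data.get("type") != "service_account":
        problems.append("JSON 'type' is not 'service_account'")
    if os.name == "posix":
        mode = stat.S_IMODE(os.stat(path).st_mode)
        if mode & 0o077:  # any group/other permission bit is set
            problems.append(f"permissions {oct(mode)} are too open; run chmod 600")
    return problems


if __name__ == "__main__":
    key_path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        print(check_key_file(key_path) or "key file looks fine")
    else:
        print("GOOGLE_APPLICATION_CREDENTIALS is not set")
```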
Required IAM role
roles/aiplatform.user is the minimum predefined role. Specifically, the principal needs:
- aiplatform.endpoints.predict
- aiplatform.models.predict
- aiplatform.locations.list
The pre-built aiplatform.user role covers all of these. For tighter scope, build a custom role with just those three permissions.
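As a sketch of the custom-role option, the script below writes a role definition file that `gcloud iam roles create` can read through its `--file` flag. The role title, description, file name, and role ID are made-up placeholders, and the permission list is copied from this page rather than verified against IAM; if a permission is not supported in custom roles, gcloud should reject it when the role is created.

```python
import json

# Hypothetical role definition; only the three permissions come from this page.
role = {
    "title": "InsightWorker Vertex Predict",  # placeholder title
    "description": "Minimal permissions for InsightWorker to call Gemini on Vertex AI",
    "stage": "GA",
    "includedPermissions": [
        "aiplatform.endpoints.predict",
        "aiplatform.models.predict",
        "aiplatform.locations.list",
    ],
}

if __name__ == "__main__":
    with open("insightworker-role.json", "w", encoding="utf-8") as fh:
        json.dump(role, fh, indent=2)
    print(json.dumps(role, indent=2))
```

Assuming the file was written to the current directory, the role could then be created with something like `gcloud iam roles create insightworkerVertexPredict --project=your-gcp-project-id --file=insightworker-role.json` and granted to the service account in place of roles/aiplatform.user.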
Region selection
Gemini models are not available in every Vertex region. Common choices:
- us-central1: most models, lowest latency from US East/Central
- us-east5: newer; some models only
- europe-west4: Europe-based teams
- asia-northeast1: Tokyo
Check the Vertex AI locations doc for which models are available where.
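To see why the region and model have to match, it helps to look at how they end up in the request URL. The sketch below builds the public Vertex AI REST URL for a Gemini `generateContent` call against a regional endpoint. Whether InsightWorker calls REST directly or goes through an SDK is not stated here, so treat `vertex_generate_url` as an illustrative helper only.

```python
def vertex_generate_url(project, location, model):
    """Build the regional Vertex AI REST URL for a Gemini generateContent call."""
    host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}:generateContent"
    )


if __name__ == "__main__":
    # Values mirror the .env examples on this page.
    print(vertex_generate_url("your-gcp-project-id", "us-central1", "gemini-2.0-flash-001"))
```

The location appears both in the hostname and in the resource path, so a model that is not served in the chosen region cannot be resolved at that URL.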
Verify
In the chat:
Which LLM provider and model are you using?
The agent should respond with Google Vertex AI and your VERTEX_MODEL.
A real query test:
Use perplexity_search to find any news on Apple. Then summarize.
The agent should run the tool and synthesize using your Vertex Gemini model. Check the latency in the response — Vertex calls usually take 1-3s; if you're seeing 10+s, your region / model combo may be cold-starting.
Common gotchas
| Symptom | Cause | Fix |
|---|---|---|
| Could not load default credentials | ADC not run or JSON path wrong | Re-run gcloud auth application-default login, or check GOOGLE_APPLICATION_CREDENTIALS path is absolute and readable |
| Permission denied: aiplatform.endpoints.predict | Role not granted | Add roles/aiplatform.user to the principal |
| Model not found in location | Region / model mismatch | Check the locations doc; pick a region where your model is GA |
| Works locally, fails in CI | ADC isn't on the CI runner | Use service account JSON path instead of ADC |
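
When debugging the first and last rows of the table, it is useful to know which credential source a machine would offer. The sketch below mirrors the first two steps of the ADC lookup order (the GOOGLE_APPLICATION_CREDENTIALS variable, then the gcloud ADC file) using only the standard library. `adc_source` is a hypothetical helper; it assumes the Linux/macOS file location given earlier on this page and does not check for credentials attached to a GCP resource.

```python
import os
from pathlib import Path


def adc_source(env=None, home=None):
    """Report which local credential source ADC would find first."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Step 1: an explicit key file path takes precedence.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        suffix = "" if os.path.isfile(key_path) else " (file not found)"
        return f"GOOGLE_APPLICATION_CREDENTIALS={key_path}{suffix}"

    # Step 2: the file written by `gcloud auth application-default login`.
    well_known = home / ".config" / "gcloud" / "application_default_credentials.json"
    if well_known.is_file():
        return f"gcloud ADC file: {well_known}"

    return "no local credentials found"


if __name__ == "__main__":
    print(adc_source())
```

Running this on a CI runner that reports "no local credentials found" is consistent with the last row of the table: the runner has no ADC login, so a service account key path is needed there.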
VPC Service Controls (regulated environments)
If your GCP project has VPC Service Controls enabled, ensure InsightWorker's traffic to aiplatform.googleapis.com is permitted in the perimeter. Add the InsightWorker service account to the access policy if needed.
See also
- providers/vertex-ai.md — model catalog, request/response shape
- providers/gemini-ai-studio.md — the consumer-tier alternative if Vertex is overkill
Source: docs/authentication/google-cloud.md in the public repo. Open a PR with corrections.
