Google Vertex AI
Gemini models hosted in your GCP project, with IAM-governed access, region-pinned data, and VPC-SC support. The recommended Google path for enterprise / regulated workloads.
For auth setup, see authentication/google-cloud.md.
Configuration
~/.insightworker/.env:
LLM_PROVIDER=vertex
GOOGLE_CLOUD_PROJECT=your-gcp-project-id
GOOGLE_CLOUD_LOCATION=us-central1
VERTEX_MODEL=gemini-2.0-flash-001
# Optional — falls back to ADC if unset:
# GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
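As a minimal sketch of how the settings above might be read and validated, the following Python parses `.env`-style text and checks that the keys the Vertex provider needs are present. The names `parse_env` and `vertex_settings` are hypothetical and not part of InsightWorker; the parser is an assumption that handles only simple `KEY=value` lines and `#` comments.

```python
from typing import Dict

# Keys shown as required in the configuration above.
REQUIRED_KEYS = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION", "VERTEX_MODEL")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def vertex_settings(env: Dict[str, str]) -> Dict[str, str]:
    """Return the Vertex settings, raising ValueError if anything required is missing."""
    if env.get("LLM_PROVIDER") != "vertex":
        raise ValueError("LLM_PROVIDER must be 'vertex'")
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    settings = {key: env[key] for key in REQUIRED_KEYS}
    # Optional: when this is absent, authentication falls back to ADC.
    creds = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if creds:
        settings["GOOGLE_APPLICATION_CREDENTIALS"] = creds
    return settings
```

Failing early on a missing key keeps a misconfigured project or region from surfacing later as a less obvious API error.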
Recommended models
Same Gemini family as AI Studio. Vertex doesn't add or remove models — it changes where they're hosted.
| Use case | Model |
|---|---|
| General apps | gemini-2.0-flash-001 |
| Long context, hard reasoning | gemini-2.5-pro (1M-token context) |
| Cheap | gemini-2.0-flash-lite |
Region availability
Some Gemini models are only available in a subset of regions. Common choices:
- us-central1: broadest availability
- us-east5: newer, some models
- europe-west4: Europe
- asia-northeast1: Tokyo
Check the Vertex AI locations documentation to see which models are generally available (GA) in each region. If you request a model in a region where it isn't available, the call fails with a 404.
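To illustrate why a model/region mismatch shows up as a 404: Vertex AI requests go to a region-specific host, and the location also appears in the resource path, so the model is looked up only in that region. The sketch below builds the URL for the `generateContent` REST method. The helper name `vertex_endpoint` is hypothetical, and the URL shape is an assumption based on the public Vertex AI REST convention; verify it against the current API reference before relying on it.

```python
def vertex_endpoint(project: str, location: str, model: str) -> str:
    """Build the regional generateContent URL for a Google-published model."""
    # The region appears twice: in the hostname and in the resource path.
    host = f"{location}-aiplatform.googleapis.com"
    path = (
        f"/v1/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}:generateContent"
    )
    return f"https://{host}{path}"


url = vertex_endpoint("your-gcp-project-id", "us-central1", "gemini-2.0-flash-001")
print(url)
```

Changing GOOGLE_CLOUD_LOCATION therefore changes both the host and the path, which is why switching regions is the usual fix when a model is not found.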
VPC Service Controls
If your GCP project is inside a VPC-SC perimeter, ensure aiplatform.googleapis.com is permitted and the InsightWorker service account is in the access policy. See GCP VPC-SC docs.
Where data goes
Vertex AI: prompts and responses stay in your GCP project, in the region you configured. Google's enterprise data terms apply (cloud.google.com/terms). No training on customer data.
This is the Google path suitable for regulated customers.
Common gotchas
| Symptom | Cause | Fix |
|---|---|---|
| Could not load default credentials | ADC not configured | Run gcloud auth application-default login, or set GOOGLE_APPLICATION_CREDENTIALS to a service-account JSON |
| Permission denied | Principal lacks the aiplatform.user role | Grant roles/aiplatform.user to the principal |
| Model not found in location | Model / region mismatch | Check the locations doc; switch to a region where the model is GA |
| Slow first call | Cold start | Later calls are faster. For latency-sensitive apps, pre-warm by sending a request at startup |
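The table above can be turned into a small lookup for logs or error handlers. This is a sketch only: `suggest_fix` and `KNOWN_ISSUES` are hypothetical names, and matching on lowercase substrings is an assumption, since the exact error text returned by the client libraries may differ from the symptoms listed here.

```python
from typing import Optional

# (symptom substring, suggested fix) pairs taken from the gotchas table.
KNOWN_ISSUES = [
    (
        "could not load default credentials",
        "Run gcloud auth application-default login, or set "
        "GOOGLE_APPLICATION_CREDENTIALS to a service-account JSON.",
    ),
    ("permission denied", "Grant roles/aiplatform.user to the principal."),
    (
        "model not found",
        "Check the locations doc; switch to a region where the model is GA.",
    ),
]


def suggest_fix(error_message: str) -> Optional[str]:
    """Return the fix for the first known symptom found in the message, else None."""
    lowered = error_message.lower()
    for needle, fix in KNOWN_ISSUES:
        if needle in lowered:
            return fix
    return None


print(suggest_fix("Permission denied on resource project your-gcp-project-id"))
```

Unrecognized messages return None, so callers can fall back to showing the raw error.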
See also
- authentication/google-cloud.md — auth (ADC vs. service account)
- gemini-ai-studio.md — consumer-tier Gemini (no GCP needed)
- overview.md — full provider matrix
Source: docs/providers/vertex-ai.md in the public repo. Open a PR with corrections.
