Vertex AI SDK

Pass-through endpoints for Vertex AI: call provider-specific endpoints in their native format (no translation).

Feature	Supported	Notes
Cost Tracking	✅	supports all models on `/generateContent` endpoint
Logging	✅	works across all integrations
End-user Tracking	❌	Tell us if you need this
Streaming	✅

Just replace https://REGION-aiplatform.googleapis.com with LITELLM_PROXY_BASE_URL/vertex_ai
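The host swap described above can be sketched as a small function. This is only an illustration: `toProxyUrl` is a hypothetical helper (not part of LiteLLM or the Vertex SDK), and the project ID and proxy address are placeholder values.

```javascript
// Hypothetical helper: swap the regional Vertex AI host for <proxy>/vertex_ai.
function toProxyUrl(vertexUrl, proxyBaseUrl) {
  // Matches https://<region>-aiplatform.googleapis.com at the start of the URL.
  const vertexHost = /^https:\/\/[a-z0-9-]+-aiplatform\.googleapis\.com/;
  if (!vertexHost.test(vertexUrl)) {
    throw new Error(`Not a regional Vertex AI URL: ${vertexUrl}`);
  }
  const proxyBase = proxyBaseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return vertexUrl.replace(vertexHost, `${proxyBase}/vertex_ai`);
}

// Placeholder project and proxy address, for illustration only.
const rewritten = toProxyUrl(
  'https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent',
  'http://localhost:4000'
);
console.log(rewritten);
```

Everything after the host is left untouched, so the request body and path stay in Vertex AI's native format.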

Example Usage

curl

curl http://localhost:4000/vertex_ai/publishers/google/models/gemini-1.0-pro:generateContent \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{
    "contents":[{
      "role": "user",
      "parts":[{"text": "How are you doing today?"}]
    }]
  }'

Vertex Node.js SDK

const { VertexAI } = require('@google-cloud/vertexai');

const vertexAI = new VertexAI({
    project: 'your-project-id', // enter your vertex project id
    location: 'us-central1', // enter your vertex region
    apiEndpoint: "localhost:4000/vertex_ai" // <proxy-server-url>/vertex_ai (do not include 'https://' in the URL)
});

const model = vertexAI.getGenerativeModel({
    model: 'gemini-1.0-pro'
}, {
    customHeaders: {
        "x-litellm-api-key": "sk-1234" // Your litellm Virtual Key
    }
});

async function generateContent() {
    try {
        const prompt = {
            contents: [{
                role: 'user',
                parts: [{ text: 'How are you doing today?' }]
            }]
        };

        const response = await model.generateContent(prompt);
        console.log('Response:', response);
    } catch (error) {
        console.error('Error:', error);
    }
}

generateContent();
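The SDK example above logs the whole result object. A minimal sketch for pulling out just the generated text follows, assuming the usual Vertex AI generateContent response shape (`candidates[].content.parts[].text`). `extractText` is a hypothetical helper, not part of the SDK or LiteLLM, and the sample object is hand-written rather than real model output.

```javascript
// Hypothetical helper: join the text parts of the first candidate.
function extractText(result) {
  // The Node.js SDK wraps the body in `response`; the raw JSON from curl does not.
  const body = result && result.response ? result.response : result;
  const candidate = body && body.candidates && body.candidates[0];
  if (!candidate || !candidate.content || !candidate.content.parts) {
    return '';
  }
  return candidate.content.parts
    .map((part) => part.text || '') // non-text parts contribute nothing
    .join('');
}

// Hand-written sample in the assumed response shape.
const sample = {
  candidates: [
    { content: { role: 'model', parts: [{ text: 'Doing well, ' }, { text: 'thanks!' }] } },
  ],
};
console.log(extractText(sample));               // raw JSON body
console.log(extractText({ response: sample })); // SDK-style wrapper
```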

Quick Start

Let's call the Vertex AI /generateContent endpoint

Add Vertex AI Credentials to your environment

export DEFAULT_VERTEXAI_PROJECT="" # "adroit-crow-413218"
export DEFAULT_VERTEXAI_LOCATION="" # "us-central1"
export DEFAULT_GOOGLE_APPLICATION_CREDENTIALS="" # "/Users/Downloads/adroit-crow-413218-a956eef1a2a8.json"
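Before starting the proxy, it can help to confirm that these variables are actually set. The check below is a sketch only: `missingVertexEnv` is a hypothetical helper, and treating all three variables as required is an assumption based on the export lines above.

```javascript
// Variable names taken from the export lines above.
const REQUIRED_VERTEX_ENV = [
  'DEFAULT_VERTEXAI_PROJECT',
  'DEFAULT_VERTEXAI_LOCATION',
  'DEFAULT_GOOGLE_APPLICATION_CREDENTIALS',
];

// Hypothetical helper: list the variables that are unset or blank.
function missingVertexEnv(env) {
  return REQUIRED_VERTEX_ENV.filter(
    (name) => !env[name] || env[name].trim() === ''
  );
}

const missing = missingVertexEnv(process.env);
if (missing.length > 0) {
  console.warn(`Set these before starting the proxy: ${missing.join(', ')}`);
}
```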

Start LiteLLM Proxy

litellm

# RUNNING on http://0.0.0.0:4000

Test it!

Call the Vertex AI generateContent endpoint through the proxy

curl http://localhost:4000/vertex_ai/publishers/google/models/gemini-1.0-pro:generateContent \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "contents":[{
      "role": "user",
      "parts":[{"text": "How are you doing today?"}]
    }]
  }'

Supported API Endpoints

Gemini API
Embeddings API
Imagen API
Code Completion API
Batch prediction API
Tuning API
CountTokens API

Authentication to Vertex AI

LiteLLM Proxy Server supports two methods of authentication to Vertex AI:

Pass Vertex credentials client-side to the proxy server
Set Vertex AI credentials on the proxy server

Usage Examples

Gemini API (Generate Content)

curl http://localhost:4000/vertex_ai/publishers/google/models/gemini-1.5-flash-001:generateContent \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{"contents":[{"role": "user", "parts":[{"text": "hi"}]}]}'

Embeddings API

curl http://localhost:4000/vertex_ai/publishers/google/models/textembedding-gecko@001:predict \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{"instances":[{"content": "gm"}]}'

Imagen API

curl http://localhost:4000/vertex_ai/publishers/google/models/imagen-3.0-generate-001:predict \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{"instances":[{"prompt": "make an otter"}], "parameters": {"sampleCount": 1}}'

Count Tokens API

curl http://localhost:4000/vertex_ai/publishers/google/models/gemini-1.5-flash-001:countTokens \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{"contents":[{"role": "user", "parts":[{"text": "hi"}]}]}'
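The Gemini, Embeddings, Imagen and Count Tokens examples above all share one URL shape: `/vertex_ai/publishers/google/models/<model>:<action>`. A small sketch of that pattern, where `modelUrl` is a hypothetical helper and the action list covers only the suffixes used on this page:

```javascript
// Hypothetical helper: build the proxy URL for a publisher-model call.
function modelUrl(proxyBaseUrl, model, action) {
  // Only the action suffixes that appear in the examples on this page.
  const knownActions = ['generateContent', 'countTokens', 'predict'];
  if (!knownActions.includes(action)) {
    throw new Error(`Unexpected action: ${action}`);
  }
  const proxyBase = proxyBaseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return `${proxyBase}/vertex_ai/publishers/google/models/${model}:${action}`;
}

const countTokensUrl = modelUrl('http://localhost:4000', 'gemini-1.5-flash-001', 'countTokens');
const embeddingsUrl = modelUrl('http://localhost:4000', 'textembedding-gecko@001', 'predict');
console.log(countTokensUrl);
console.log(embeddingsUrl);
```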

Tuning API

Create Fine Tuning Job

curl http://localhost:4000/vertex_ai/tuningJobs \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -d '{
    "baseModel": "gemini-1.0-pro-002",
    "supervisedTuningSpec": {
      "training_dataset_uri": "gs://cloud-samples-data/ai-platform/generative_ai/sft_train_data.jsonl"
    }
  }'

Advanced

Pre-requisites

Set up proxy with DB

Use this to avoid giving developers the raw Vertex AI credentials while still letting them use Vertex AI endpoints.

Use with Virtual Keys

Set up environment

export DATABASE_URL=""
export LITELLM_MASTER_KEY=""

# vertex ai credentials
export DEFAULT_VERTEXAI_PROJECT="" # "adroit-crow-413218"
export DEFAULT_VERTEXAI_LOCATION="" # "us-central1"
export DEFAULT_GOOGLE_APPLICATION_CREDENTIALS="" # "/Users/Downloads/adroit-crow-413218-a956eef1a2a8.json"

litellm

# RUNNING on http://0.0.0.0:4000

Generate virtual key

curl -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'x-litellm-api-key: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{}'

Expected Response

{
    ...
    "key": "sk-1234ewknldferwedojwojw"
}
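Once the key comes back, it goes into the `x-litellm-api-key` header of later requests. A minimal sketch of that step, where `headersFromKeyResponse` is a hypothetical helper and only the `key` field shown in the expected response is assumed to exist:

```javascript
// Hypothetical helper: turn a /key/generate response into request headers.
function headersFromKeyResponse(responseJson) {
  const parsed =
    typeof responseJson === 'string' ? JSON.parse(responseJson) : responseJson;
  if (!parsed || typeof parsed.key !== 'string' || parsed.key === '') {
    throw new Error('No "key" field in /key/generate response');
  }
  return {
    'Content-Type': 'application/json',
    'x-litellm-api-key': `Bearer ${parsed.key}`, // same header format as the curl examples
  };
}

// Sample value copied from the expected response above.
const headers = headersFromKeyResponse('{"key": "sk-1234ewknldferwedojwojw"}');
console.log(headers['x-litellm-api-key']);
```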

Test it!

curl http://localhost:4000/vertex_ai/publishers/google/models/gemini-1.0-pro:generateContent \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234ewknldferwedojwojw" \
  -d '{
    "contents":[{
      "role": "user", 
      "parts":[{"text": "How are you doing today?"}]
    }]
  }'

Send `tags` in request headers

Use this if you want tags to be tracked in the LiteLLM DB and on logging callbacks.

Pass tags in request headers as a comma-separated list. In the example below, the following tags will be tracked:

tags: ["vertex-js-sdk", "pass-through-endpoint"]
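If the tags start out as an array in application code, they need to be joined into the comma-separated header value. A small sketch, where `tagsHeader` is a hypothetical helper; rejecting tags that contain a comma is an assumption made here so the list stays unambiguous.

```javascript
// Hypothetical helper: format an array of tags for the `tags` request header.
function tagsHeader(tags) {
  const cleaned = tags.map((tag) => tag.trim()).filter((tag) => tag !== '');
  const withComma = cleaned.find((tag) => tag.includes(','));
  if (withComma !== undefined) {
    // Assumption: a comma inside a tag would split it into two tags.
    throw new Error(`Tag must not contain a comma: ${withComma}`);
  }
  return cleaned.join(',');
}

const requestHeaders = {
  'x-litellm-api-key': 'Bearer sk-1234',
  tags: tagsHeader(['vertex-js-sdk', 'pass-through-endpoint']),
};
console.log(requestHeaders.tags);
```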

curl

curl http://localhost:4000/vertex_ai/publishers/google/models/gemini-1.0-pro:generateContent \
  -H "Content-Type: application/json" \
  -H "x-litellm-api-key: Bearer sk-1234" \
  -H "tags: vertex-js-sdk,pass-through-endpoint" \
  -d '{
    "contents":[{
      "role": "user",
      "parts":[{"text": "How are you doing today?"}]
    }]
  }'

Vertex Node.js SDK

const { VertexAI } = require('@google-cloud/vertexai');

const vertexAI = new VertexAI({
    project: 'your-project-id', // enter your vertex project id
    location: 'us-central1', // enter your vertex region
    apiEndpoint: "localhost:4000/vertex_ai" // <proxy-server-url>/vertex_ai (do not include 'https://' in the URL)
});

const model = vertexAI.getGenerativeModel({
    model: 'gemini-1.0-pro'
}, {
    customHeaders: {
        "x-litellm-api-key": "sk-1234", // Your litellm Virtual Key
        "tags": "vertex-js-sdk,pass-through-endpoint"
    }
});

async function generateContent() {
    try {
        const prompt = {
            contents: [{
                role: 'user',
                parts: [{ text: 'How are you doing today?' }]
            }]
        };

        const response = await model.generateContent(prompt);
        console.log('Response:', response);
    } catch (error) {
        console.error('Error:', error);
    }
}

generateContent();
