Concept #123 · Medium · extended-ai-concepts

What is the use of Vertex AI in Google?

#gen-ai #mlops

Answer

What is Vertex AI in Google?

Vertex AI is Google Cloud's unified machine learning platform. It brings Google's AI/ML tools and services together in one place so you can build, deploy, scale, and monitor AI models and applications in production.

What Vertex AI Provides

| Service | Description |
| --- | --- |
| Model Garden | Access 150+ foundation models (Gemini, Llama, Claude, etc.) |
| Gemini API | Use Gemini 1.5 Pro/Flash via API |
| Training | Custom ML model training at scale |
| AutoML | No-code model training |
| Feature Store | Centralized ML feature management |
| Pipelines | MLOps pipelines (like Kubeflow) |
| Endpoints | Deploy and serve models |
| Evaluation | Model performance monitoring |
| Vector Search | Managed vector database (Matching Engine) |
| Agent Builder | Build and deploy AI agents |

Using Vertex AI for Gen AI

python
import vertexai
from vertexai.generative_models import GenerativeModel

# Initialize with your Google Cloud project
vertexai.init(project="your-project-id", location="us-central1")

# Use Gemini via Vertex AI
model = GenerativeModel("gemini-1.5-pro")

response = model.generate_content(
    "Explain RAG in simple terms with a Python example"
)
print(response.text)

RAG with Vertex AI Vector Search

python
import vertexai
from vertexai.language_models import TextEmbeddingModel

# Initialize with your Google Cloud project
vertexai.init(project="your-project-id", location="us-central1")

# Load a Vertex AI text embedding model
embedding_model = TextEmbeddingModel.from_pretrained("text-embedding-004")

# Embed documents: returns one vector per input text
def embed_text(texts: list[str]) -> list[list[float]]:
    embeddings = embedding_model.get_embeddings(texts)
    return [e.values for e in embeddings]

# Next step (not shown): load these vectors into a Vertex AI Vector Search
# index (Matching Engine), which handles indexing, search, and scaling as a managed service
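To make the retrieval step of RAG concrete, here is a minimal in-memory sketch of what a vector index does: rank stored vectors by similarity to a query vector, then paste the best matches into the prompt. It is not the Vector Search API. The helper names (`cosine_similarity`, `top_k`, `build_prompt`), the sample documents, and the tiny 3-dimensional vectors are all illustrative assumptions standing in for real embeddings from `embed_text`.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Return the indices of the k document vectors most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), i) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble a grounded prompt from the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical documents and hand-made vectors (real embeddings have hundreds of dimensions)
documents = [
    "Vertex AI is Google Cloud's unified ML platform.",
    "Feature Store centralizes ML feature management.",
    "Cloud Run runs stateless containers.",
]
doc_vectors = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.2, 0.9]]
query_vector = [0.8, 0.2, 0.1]  # stand-in for the embedded question

hits = top_k(query_vector, doc_vectors, k=2)
prompt = build_prompt("What is Vertex AI?", [documents[i] for i in hits])
print(prompt)
```

This brute-force loop compares the query against every stored vector, which is fine for a few thousand documents. A managed index replaces it with approximate nearest-neighbor search so lookups stay fast as the corpus grows.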

Vertex AI vs Direct API

| Aspect | Direct Gemini API | Vertex AI |
| --- | --- | --- |
| Auth | API key | Google Cloud IAM |
| Scale | Limited | Enterprise-grade |
| Security | Basic | VPC, CMEK, audit logs |
| Monitoring | None | Cloud Monitoring + Logging |
| SLAs | No | Yes |
| Cost | Pay per token | Pay per token + infra |
| Models available | Gemini only | Gemini + 3rd party |
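The cost row can be read as simple arithmetic: token charges scale with traffic, while infrastructure (for example an always-on endpoint or index) adds a charge that accrues per hour regardless of traffic. The sketch below works through one day under that assumption. Every number in it, including the prices, is a hypothetical placeholder and not Google's actual pricing, and `estimate_token_cost_usd` is an illustrative helper.

```python
def estimate_token_cost_usd(
    input_tokens: int,
    output_tokens: int,
    price_in_per_million: float,
    price_out_per_million: float,
) -> float:
    """Token cost in USD, given per-million-token prices for input and output."""
    return (
        (input_tokens / 1_000_000) * price_in_per_million
        + (output_tokens / 1_000_000) * price_out_per_million
    )


# Hypothetical workload: 10,000 requests/day, 1,500 input and 300 output tokens each
requests_per_day = 10_000
daily_input_tokens = requests_per_day * 1_500   # 15,000,000
daily_output_tokens = requests_per_day * 300    # 3,000,000

# Hypothetical prices (placeholders only): $1.00 / 1M input tokens, $4.00 / 1M output tokens
daily_token_cost = estimate_token_cost_usd(
    daily_input_tokens, daily_output_tokens, 1.00, 4.00
)

# Hypothetical always-on infrastructure at $0.50 per hour
daily_infra_cost = 0.50 * 24

daily_total = daily_token_cost + daily_infra_cost
print(f"tokens: ${daily_token_cost:.2f}, infra: ${daily_infra_cost:.2f}, total: ${daily_total:.2f}")
```

With these placeholder figures, infrastructure is a large share of the bill at low traffic and a shrinking share as request volume grows, which is the trade-off the table's "+ infra" entry points at.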

Key Vertex AI Services for Gen AI Engineers

1. Model Garden — Access pre-trained models:

bash
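# Lists the models registered in your own project for this region.
# Third-party models (Claude, Llama, Mistral, etc.) are browsed and enabled in Model Garden.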
gcloud ai models list --region=us-central1

2. Grounding — Connect to real-time data:

python
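import vertexai
from vertexai.preview.generative_models import GenerativeModel, Tool, grounding

vertexai.init(project="your-project-id", location="us-central1")

# Ground Gemini responses with Google Search
# (a retrieval tool over your own data is the alternative)
search_tool = Tool.from_google_search_retrieval(
    google_search_retrieval=grounding.GoogleSearchRetrieval()
)
model = GenerativeModel("gemini-1.5-pro", tools=[search_tool])

# The model can now consult search results when answering
response = model.generate_content("What is new in Google Cloud this week?")
print(response.text)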

3. Agent Builder — Deploy production agents:

  • Visual agent builder UI
  • Connect to data stores (BigQuery, GCS, websites)
  • Deploy with auto-scaling

When to Use Vertex AI

  • Enterprise — need IAM, VPC, compliance, SLAs
  • Google Cloud — already in GCP ecosystem
  • Large scale — need auto-scaling inference
  • MLOps — full ML pipeline (train → deploy → monitor)
  • Multi-model — want access to many foundation models
  • Tight Google integration — BigQuery, GCS, Cloud Run