Vertex AI

GCP AI & ML · Free tier available

Unified platform to build, deploy, and scale ML models with AutoML, custom training on TPUs and GPUs, model registry, pipelines, feature store, and generative AI studio

Jurisdictional exposure

Provider HQ

Mountain View, USA

Subject to CLOUD Act, FISA-702, DPF

Region locations

APAC, CN, EEA, EU, UK, US, Other: 44 regions across 7 jurisdictions

Sovereign option

Yes — 2 sovereign-flagged regions available

Attributes

GPU Support: Yes
AutoML: Yes
Model Registry: Yes

Sub-services (4)

Custom Training

Distributed training for custom ML models

Online Prediction

Low-latency model serving endpoints

Vertex AI Pipelines

Serverless ML workflow orchestration

Feature Store

Centralized repository for ML features

Compliance & Certifications

This service is attested for the following frameworks. Always verify with the provider before relying on a specific compliance posture.

GDPR, SOC 2, ISO 27001, HIPAA, PCI DSS, FedRAMP, C5, TISAX, IRAP, ENS High, CCCS Medium, ISMAP, MTCS L3

Where this runs

44 regions

28 countries

2 sovereign

Sovereign regions (2)

T-Systems Sovereign Cloud · Frankfurt: T-Systems Sovereign Cloud powered by Google Cloud
S3NS Sovereign Cloud · Paris: S3NS, a Google Cloud + Thales joint venture

Commercial regions (42)

Europe (13)

Belgium
Finland
Paris
Berlin
Frankfurt
Milan
Turin
Netherlands
Warsaw
Madrid
Stockholm
Zurich
London

North America (12)

Montréal
Toronto
Querétaro
Northern Virginia
Columbus
Iowa
Dallas
Las Vegas
Los Angeles
South Carolina
Salt Lake City
Oregon

South America (2)

São Paulo
Santiago

Asia (9)

Hong Kong
Delhi
Mumbai
Jakarta
Osaka
Tokyo
Singapore
Seoul
Taiwan

Oceania (2)

Melbourne
Sydney

Middle East (3)

Tel Aviv
Doha
Dammam

Africa (1)

Johannesburg
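
The region counts above can be cross-checked with a short tally. This is a minimal sketch, assuming the region names are copied exactly as listed on this page; the names `commercial`, `sovereign`, and `tally` are hypothetical and not part of any provider API.

```python
# Region names copied from the listing above, grouped as the page groups them.
commercial = {
    "Europe": ["Belgium", "Finland", "Paris", "Berlin", "Frankfurt", "Milan",
               "Turin", "Netherlands", "Warsaw", "Madrid", "Stockholm",
               "Zurich", "London"],
    "North America": ["Montréal", "Toronto", "Querétaro", "Northern Virginia",
                      "Columbus", "Iowa", "Dallas", "Las Vegas", "Los Angeles",
                      "South Carolina", "Salt Lake City", "Oregon"],
    "South America": ["São Paulo", "Santiago"],
    "Asia": ["Hong Kong", "Delhi", "Mumbai", "Jakarta", "Osaka", "Tokyo",
             "Singapore", "Seoul", "Taiwan"],
    "Oceania": ["Melbourne", "Sydney"],
    "Middle East": ["Tel Aviv", "Doha", "Dammam"],
    "Africa": ["Johannesburg"],
}

# The two sovereign-flagged regions listed on this page.
sovereign = ["T-Systems Sovereign Cloud (Frankfurt)",
             "S3NS Sovereign Cloud (Paris)"]


def tally(commercial, sovereign):
    """Return per-continent counts, the commercial total, and the overall total."""
    per_continent = {name: len(regions) for name, regions in commercial.items()}
    commercial_total = sum(per_continent.values())
    return per_continent, commercial_total, commercial_total + len(sovereign)


per_continent, commercial_total, total = tally(commercial, sovereign)
print(per_continent)
print(commercial_total, total)
```

The per-continent counts should match the figures in each heading, and the two totals should match the "Commercial regions" and "44 regions" figures stated on this page.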

Equivalent services on other platforms

Alibaba Platform for AI (PAI) · Alibaba

Enterprise ML and AI platform covering PAI-Studio visual workflow builder, PAI-DSW Jupyter notebooks, PAI-EAS elastic inference serving, PAI-Blade inference optimisation, and integration with Alibaba's Qwen foundation models

Alibaba Qwen (Tongyi Qianwen) · Alibaba

Alibaba's flagship open-source foundation model family covering Qwen (text), Qwen-VL (vision-language), Qwen-Audio, and Qwen-Coder — accessible via the DashScope API with chat, completion, embeddings, and function-calling endpoints

Aruba AI Stack · Aruba

Dedicated AI infrastructure stack combining GPU-on-Demand compute, Object Storage, and managed model hosting for end-to-end AI workloads on Italian-sovereign infrastructure

Amazon SageMaker · AWS

Next-generation SageMaker (rebranded SageMaker AI) unifying data, analytics, and AI in one workspace: Studio notebooks, HyperPod for foundation-model training at scale, Lakehouse with QuickSight + S3 Tables integration, Autopilot AutoML, managed training jobs, hosted inference endpoints, and Feature Store. The unified SageMaker AI workspace was introduced at re:Invent 2024, and 2025 Summit additions extended it with lakehouse auto-onboarding

Amazon Bedrock · AWS

Build generative AI applications with foundation models from Anthropic (Claude Opus 4.7 from April 2026), Cohere, Meta, Mistral, Stability AI, TwelveLabs (video understanding), and Amazon's own Nova family — accessed via a single API with fine-tuning, knowledge bases, agents, and a model marketplace for discovery and easy onboarding

Amazon Nova · AWS

AWS-built foundation model family covering text (Micro, Lite, Pro, Premier), image generation (Canvas), and video generation (Reel) — accessed through the Bedrock runtime with tight pricing and low-latency streaming, launched at re:Invent 2024

Amazon Q · AWS

Generative AI assistant family spanning software development (Q Developer, formerly CodeWhisperer), enterprise knowledge retrieval (Q Business), low-code app generation (Q Apps), and contact-centre augmentation (Q in Connect) with grounded answers against your own data

Amazon Bedrock AgentCore · AWS

Production runtime for AI agents — managed memory, identity, gateway, observability, and tool integration so teams can ship agentic workflows on top of any framework (Strands Agents, LangGraph, CrewAI, vendor-direct) without rebuilding the operational substrate

Amazon S3 Vectors · AWS

Native vector storage in S3 — up to 2 billion vectors per index, sub-100 ms query latency, S3-native durability, and pricing claimed up to 90 percent lower than dedicated vector databases for retrieval-augmented generation and embedding-heavy workloads

Amazon Kendra · AWS

Managed enterprise search — natural-language question-answering over documents in S3, SharePoint, Confluence, Salesforce, RDS, and 40+ other sources with connectors, semantic ranking, and no ML expertise required

Amazon Personalize · AWS

Managed real-time personalisation ML — recommendations, similar-items, and user-segmentation models trained on customer interaction data with no ML expertise, delivered via low-latency HTTP APIs

Azure OpenAI Service · Azure

Enterprise access to OpenAI models including GPT-4, GPT-3.5, and DALL-E with Azure security, private networking, regional deployments, and pay-as-you-go or provisioned throughput

Azure Machine Learning · Azure

End-to-end platform for building and deploying ML models with automated ML, designer (drag-and-drop), managed compute clusters, MLflow tracking, and responsible AI dashboards

Azure AI Search · Azure

Enterprise search-as-a-service (formerly Azure Cognitive Search) with vector, hybrid, and semantic ranking, built-in AI skills for OCR and NLP enrichment, first-class integration with Azure OpenAI for RAG workloads, and 90+ data-source connectors including SharePoint, OneDrive, and Salesforce

Azure Health Bot · Azure

HIPAA-compliant conversational AI platform for healthcare with a built-in clinical knowledge graph, triage scenarios, symptom checker, and compliance tooling for building patient-facing chat experiences grounded in medical ontologies

Azure AI Foundry · Azure

Unified AI agent platform announced at Microsoft Build 2025 — covers agent authoring (AI Foundry Agent Service), multi-agent orchestration with MCP support, model catalogue across OpenAI / Mistral / Meta / Cohere, and operational tooling for governance, evaluation, and monitoring of production AI systems

Cloudflare Stream · Cloudflare

End-to-end video platform — ingestion via TUS or RTMP, automatic encoding to adaptive HLS/DASH, signed-token playback, integrated player SDK, and per-minute-streamed pricing with no separate egress charges

Cloudflare Vectorize · Cloudflare

Globally-distributed vector database for RAG, similarity search, and recommendations with native Workers AI integration, up to 5M vectors per index, metadata filtering, and cosine / Euclidean / dot-product similarity

Cloudflare Workers AI · Cloudflare

Serverless GPU-backed AI inference at the edge running a catalogue of open-source text, image, speech, and embedding models (Llama, Mistral, Stable Diffusion, Whisper, BGE) with pay-per-Neuron pricing and direct binding from Workers code

Cloudflare AI Search · Cloudflare

Managed retrieval-augmented-generation service — index your content (R2 buckets, websites, Workers KV) and query it with natural language from a Workers binding, REST API, or MCP server. Originally launched as AutoRAG and renamed AI Search in 2026.

Cloud Temple LLM-as-a-Service · Cloud Temple

Hosted large-language-model inference endpoints on SecNumCloud-qualified GPUs. Positioned as the French sovereign alternative to hosted OpenAI / Anthropic APIs for regulated workloads — health data (HDS-qualified), public-sector, defence-adjacent

Mosaic AI · Databricks

End-to-end AI platform (formerly MLflow + MosaicML) for training, fine-tuning, deploying, and monitoring foundation models and custom ML models on the Lakehouse

Vector Search · Databricks

Serverless vector database built into the Lakehouse for similarity search, RAG applications, and recommendation systems with automatic embedding sync from Delta tables

Managed MLflow · Databricks

Hosted MLflow with managed tracking server, model registry, and deployment integration — open-source ML lifecycle tooling tightly coupled to Databricks Unity Catalog and Mosaic AI

Exoscale AI Cloud Infrastructure · Exoscale

Dedicated AI/ML infrastructure combining GPU compute, Object Storage for datasets, and managed model hosting on Swiss-resident infrastructure — positioned for regulated EU customers needing AI workloads outside US jurisdiction

Gcore Inference at the Edge · Gcore

AI inference runtime that deploys models to Gcore's edge POPs and routes requests to the nearest GPU-backed endpoint, with support for open-source LLMs and custom model containers

ModelArts · Huawei

End-to-end AI development platform with AutoML, data labelling, distributed training on Ascend and GPU clusters, and one-click deployment to cloud or edge

Watson Assistant · IBM

Conversational AI for building chatbots and virtual agents with visual dialogue builder, intent and entity detection, voice integration via phone, and multi-channel deployment

IBM watsonx.ai · IBM

Enterprise AI studio for training, validating, tuning, and deploying foundation models and traditional ML models, with IBM's Granite model family, Hugging Face integration, prompt lab, synthetic data generation, and governance via watsonx.governance

Infomaniak AI Tools · Infomaniak

Sovereign Swiss AI suite: managed inference endpoints for open-source LLMs (Llama, Mistral), AI Studio chat interface (Kchat), and document-analysis APIs — positioned as the Swiss-resident alternative to hosted OpenAI / Anthropic APIs

IONOS AI Model Hub · IONOS

Managed hosting and inference endpoints for open-source large language models (Llama, Mistral, DeepSeek) running on EU-resident GPU infrastructure with OpenAI-compatible API, positioned as the GDPR-resident alternative to OpenAI / Anthropic hosted APIs

Kanana AI · Kakao

Kakao's Korean-first foundation-model family (Kanana Flash / Essence / Nano) for chat, code, and embeddings — multilingual but tuned for Korean conversational performance

CLOVA Studio · Naver

Naver's HyperCLOVA X foundation-model platform for Korean-language LLM workloads — chat completion, embeddings, function calling, RAG over Korean text with strong native-language performance

Nscale Serverless Inference · Nscale

Managed serverless inference endpoints for open-source large language models hosted on Nscale's GPU infrastructure, with OpenAI-compatible API and per-million-tokens pricing

Red Hat OpenShift AI · OpenShift

Managed MLOps platform (formerly Open Data Hub) for training, serving, and monitoring ML models on OpenShift with JupyterHub, KServe, Kubeflow, and PyTorch operators

OCI Generative AI · Oracle

Managed inference service hosting Cohere Command and Embed plus Meta Llama large language models — pay-per-token chat / completion / embedding APIs, plus fine-tuning on customer datasets via dedicated AI clusters

OCI Enterprise AI · Oracle

End-to-end platform for building, deploying, and governing production AI workloads on OCI — unifies Gen AI models, agent orchestration, retrieval-augmented generation, and policy-based governance controls in a single managed service so enterprises don't have to assemble them from primitives.

OCI Data Science · Oracle

Fully managed machine learning platform with JupyterLab notebooks, conda-environment library, job orchestration, model deployments as HTTPS endpoints, feature store, and model catalog — integrated with Autonomous Database and Object Storage for end-to-end ML workflows

OCI AI Forecasting · Oracle

Managed time-series forecasting service that automatically selects between classical statistical models (ARIMA, ETS) and modern ML approaches based on data shape, with support for related covariates, holidays, and multi-horizon predictions

OCI AI Anomaly Detection · Oracle

Managed multivariate anomaly-detection service for IoT / IT / OT telemetry — trains models on healthy baselines then flags deviations across many signals simultaneously (root-cause candidates included), with sync + async detection endpoints

Outscale AI Studio · Outscale

Managed AI platform hosting Mistral AI models (including Le Chat Enterprise) on Outscale's SecNumCloud-eligible French infrastructure — positioned as the strictest-sovereignty alternative to hosted OpenAI / Anthropic / Bedrock APIs for European public-sector and regulated workloads

OVHcloud AI Endpoints · OVHcloud

Managed serverless inference endpoints for open-source large language models (Llama, Mistral, DeepSeek), with OpenAI-compatible API and per-million-tokens pricing — positioned as the EU-resident alternative to hosted OpenAI / Anthropic APIs

OVHcloud AI Training · OVHcloud

Managed GPU-backed training jobs for custom ML models, with H100 / H200 / L40s / L4 GPU options, distributed-training orchestration, and Jupyter integration

OVHcloud AI Deploy · OVHcloud

Managed serverless deployment of containerised AI/ML inference workloads with autoscaling and per-second billing, for hosting custom models built outside the LLM-focused AI Endpoints product

OVHcloud AI Notebooks · OVHcloud

Managed Jupyter / VS Code notebook environments with GPU access, pre-installed ML frameworks (PyTorch, TensorFlow, JAX), and persistent workspaces for experimentation

OVHcloud MCP Server · OVHcloud

Hosted Model Context Protocol server letting AI agents (Claude / GPT / custom) discover and interact with OVHcloud resources (PostgreSQL, Managed Kubernetes, Object Storage) via the MCP spec — positioned as the EU-resident bridge between AI agents and infrastructure operations

Einstein AI · Salesforce

AI layer integrated across Salesforce products: predictive lead scoring, opportunity scoring, generative AI for emails and summaries

Salesforce Agentforce · Salesforce

Salesforce platform for building, deploying, and governing autonomous AI agents grounded in CRM data, with Atlas Reasoning Engine, Agent Studio, and Data Cloud retrieval

Generative APIs · Scaleway

Managed inference for open-source LLMs (Llama, Mistral, DeepSeek) hosted in EU datacentres

Scaleway Generative APIs · Scaleway

Managed serverless inference endpoints for hosted open-source LLMs (Llama, Mistral, DeepSeek) with OpenAI-compatible API and per-million-tokens pricing — distinct from Scaleway's full AI Platform (training + custom-model hosting)

Snowpark · Snowflake

Developer framework for building data applications in Python, Java, and Scala that run inside Snowflake

Snowflake Cortex · Snowflake

Fully managed AI and ML service offering hosted LLMs, vector search, and ML functions inside Snowflake SQL

STACKIT AI Model Serving · STACKIT

Managed serverless inference endpoints for hosted open-source LLMs (Llama, Mistral) with OpenAI-compatible API and per-million-tokens pricing — positioned as the German-sovereign alternative to hosted OpenAI / Anthropic APIs

Tencent Hunyuan · Tencent

Tencent's in-house family of large language models (Hunyuan-Pro, Standard, Lite, plus multimodal Hunyuan-Vision) accessible via the Hunyuan API, with enterprise-grade context windows up to 256K, function calling, embeddings, and tuning

Pricing

Pricing model: pay-as-you-go