Cloudflare Workers AI
Cloudflare
AI & ML
Free tier available
Serverless, GPU-backed AI inference at the edge, running a catalogue of open-source text, image, speech, and embedding models (Llama, Mistral, Stable Diffusion, Whisper, BGE) with pay-per-Neuron pricing and direct binding from Workers code
Attributes
- Served From: Cloudflare GPU edge
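As an illustration of the "direct binding from Workers code" mentioned in the summary, the following is a minimal sketch rather than a definitive implementation. It assumes a Workers AI binding named `AI` has been configured for the Worker, and it assumes the model ID `@cf/meta/llama-3.1-8b-instruct` is available in the current catalogue. The `AiBinding` and `Env` interfaces and the `generate` helper are hypothetical names declared here only to keep the sketch self-contained; in a real project the binding type would normally come from Cloudflare's published type definitions.

```typescript
// Minimal structural stand-in for the Workers AI binding (hypothetical type,
// declared locally so this sketch does not depend on any package).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

// Assumption: the Worker is configured with an AI binding named "AI".
interface Env {
  AI: AiBinding;
}

// Assumed model ID; verify it against the current model catalogue.
const MODEL = "@cf/meta/llama-3.1-8b-instruct";

// Sends a chat-style prompt through the binding and returns the generated text.
// Falls back to an empty string if the result has no `response` field.
async function generate(env: Env, prompt: string): Promise<string> {
  const result = (await env.AI.run(MODEL, {
    messages: [
      { role: "system", content: "You are a concise assistant." },
      { role: "user", content: prompt },
    ],
  })) as { response?: string };
  return result.response ?? "";
}
```

Inside a Worker's `fetch` handler, a call such as `await generate(env, "Summarise this page")` would run inference through the binding, with no API key handling in the Worker code itself.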
Sub-services (5)
Text Generation
Llama, Mistral, Gemma, Phi and other open-source LLMs served from edge GPUs
Image Generation
Stable Diffusion, Flux, and partner image models for text-to-image workflows
Speech Models
Whisper speech-to-text and partner voice models for real-time audio workloads
Embeddings
BGE and partner embedding models optimised for Vectorize upserts
AI Gateway
Managed proxy with caching, rate limits, and analytics across model providers
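To make the AI Gateway entry concrete, here is a hedged sketch of how a request to a Workers AI model might be routed through a gateway over HTTPS. It assumes the gateway endpoint follows the pattern `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/{model}` and that a per-request cache lifetime can be set with a `cf-aig-cache-ttl` header; both should be checked against the current AI Gateway documentation. `GatewayConfig`, `gatewayUrl`, and `gatewayRequestInit` are hypothetical names introduced for illustration.

```typescript
// Hypothetical configuration holder; all three values come from your own account.
interface GatewayConfig {
  accountId: string;
  gatewayId: string;
  apiToken: string;
}

// Builds the gateway URL for a Workers AI model (assumed URL pattern, see lead-in).
function gatewayUrl(cfg: GatewayConfig, model: string): string {
  return `https://gateway.ai.cloudflare.com/v1/${cfg.accountId}/${cfg.gatewayId}/workers-ai/${model}`;
}

// Builds the options object for fetch(): bearer auth, JSON body, and an
// optional cache TTL header (assumed header name, value in seconds).
function gatewayRequestInit(
  cfg: GatewayConfig,
  prompt: string,
  cacheTtlSeconds?: number,
): { method: string; headers: Record<string, string>; body: string } {
  const headers: Record<string, string> = {
    Authorization: `Bearer ${cfg.apiToken}`,
    "Content-Type": "application/json",
  };
  if (cacheTtlSeconds !== undefined) {
    headers["cf-aig-cache-ttl"] = String(cacheTtlSeconds);
  }
  return { method: "POST", headers, body: JSON.stringify({ prompt }) };
}
```

A caller would pass both results to `fetch`, for example `fetch(gatewayUrl(cfg, model), gatewayRequestInit(cfg, "Hello", 3600))`. Keeping URL and header construction in pure functions makes the routing easy to unit-test without network access.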
Compliance & Certifications
This service is attested for the following frameworks. Always verify with the provider before relying on a specific compliance posture.
Where this runs
Commercial regions (29)
Europe (10)
- Paris
- Frankfurt
- Dublin
- Milan
- Amsterdam
- Warsaw
- Madrid
- Stockholm
- Zurich
- London
North America (4)
- Toronto
- Ashburn
- Chicago
- San Jose
South America (2)
- Buenos Aires
- São Paulo
Asia (6)
- Hong Kong
- Mumbai
- Tokyo
- Singapore
- Seoul
- Taipei
Oceania (2)
- Sydney
- Auckland
Middle East (2)
- Tel Aviv
- Dubai
Africa (3)
- Lagos
- Cape Town
- Johannesburg
Equivalent services on other platforms
Amazon Bedrock
Build generative AI applications with foundation models from Anthropic, Cohere, Meta, Mistral, Stability AI, and Amazon, accessed via a single API with fine-tuning and agents
Azure OpenAI Service
Enterprise access to OpenAI models including GPT-4, GPT-3.5, and DALL-E with Azure security, private networking, regional deployments, and pay-as-you-go or provisioned throughput
Google Cloud Vertex AI
Unified platform to build, deploy, and scale ML models with AutoML, custom training on TPUs and GPUs, model registry, pipelines, feature store, and generative AI studio