Cloudflare Workers AI

Cloudflare | AI & ML | Free tier available

Serverless, GPU-backed AI inference at the edge. Runs a catalogue of open-source text, image, speech, and embedding models (Llama, Mistral, Stable Diffusion, Whisper, BGE), with pay-per-Neuron pricing and direct binding from Workers code.
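As a rough illustration of the direct binding from Workers code, the sketch below calls a text model through the binding's `run(model, inputs)` method. It assumes the binding is exposed as `env.AI`, that the model ID `@cf/meta/llama-3.1-8b-instruct` is available in the catalogue, and that text models return an object with a `response` field. The `AiBinding` interface and the `generate` helper are hypothetical names declared locally so the example stands alone; a real project would normally use the types from `@cloudflare/workers-types`.

```typescript
// Minimal structural stand-in for the Workers AI binding (assumption:
// the real binding exposes run(model, inputs) and resolves to model output).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface Env {
  AI: AiBinding; // assumption: the binding is named "AI" in the Wrangler config
}

// Assumption: this model ID exists in the catalogue; verify before use.
const MODEL = "@cf/meta/llama-3.1-8b-instruct";

// Hypothetical helper: sends a prompt to the model and returns the text.
async function generate(env: Env, prompt: string): Promise<string> {
  const result = (await env.AI.run(MODEL, { prompt })) as { response?: string };
  return result.response ?? "";
}

// Fetch handler shape used by module Workers. In a real Worker this object
// would be the module's default export.
const worker = {
  async fetch(request: Request, env: Env): Promise<Response> {
    const prompt = new URL(request.url).searchParams.get("prompt") ?? "Hello";
    const text = await generate(env, prompt);
    return new Response(JSON.stringify({ text }), {
      headers: { "content-type": "application/json" },
    });
  },
};
```

Because the binding is injected through `env`, no API key appears in the code, and `generate` can be exercised in tests with a stub object that implements `run`.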

Attributes

Served From
Cloudflare GPU edge

Sub-services (5)

Text Generation

Llama, Mistral, Gemma, Phi and other open-source LLMs served from edge GPUs

Image Generation

Stable Diffusion, Flux, and partner image models for text-to-image workflows

Speech Models

Whisper speech-to-text and partner voice models for real-time audio workloads

Embeddings

BGE and partner embedding models optimised for Vectorize upserts
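A minimal sketch of the embedding-to-Vectorize flow described above, under several assumptions: the model ID `@cf/baai/bge-base-en-v1.5`, an input of the form `{ text: string[] }`, an output carrying a `data` array with one vector per input, and a Vectorize index that accepts `upsert` with `{ id, values, metadata }` records. The interfaces and the helpers `toVectorRecords` and `embedAndUpsert` are hypothetical names declared locally for illustration.

```typescript
// Local structural stand-ins for the two bindings (assumptions, not the
// official type definitions).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface VectorRecord {
  id: string;
  values: number[];
  metadata?: Record<string, string>;
}

interface VectorIndex {
  upsert(vectors: VectorRecord[]): Promise<unknown>;
}

interface Doc {
  id: string;
  text: string;
}

// Assumption: this BGE model ID exists in the catalogue; verify before use.
const EMBEDDING_MODEL = "@cf/baai/bge-base-en-v1.5";

// Pairs each document with its embedding, keeping the source text as metadata.
function toVectorRecords(docs: Doc[], embeddings: number[][]): VectorRecord[] {
  if (docs.length !== embeddings.length) {
    throw new Error("embedding count does not match document count");
  }
  return docs.map((doc, i) => ({
    id: doc.id,
    values: embeddings[i],
    metadata: { text: doc.text },
  }));
}

// Embeds all documents in one call, upserts them, and returns the record count.
async function embedAndUpsert(
  ai: AiBinding,
  index: VectorIndex,
  docs: Doc[],
): Promise<number> {
  const result = (await ai.run(EMBEDDING_MODEL, {
    text: docs.map((d) => d.text),
  })) as { data: number[][] };
  const records = toVectorRecords(docs, result.data);
  await index.upsert(records);
  return records.length;
}
```

Sending the texts as one batch keeps the embeddings in input order, which is what lets `toVectorRecords` match them to document IDs by position.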

AI Gateway

Managed proxy with caching, rate limits, and analytics across model providers
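To show how traffic might be routed through the gateway instead of calling a provider directly, the sketch below builds an HTTP request for a Workers AI model. It assumes the URL layout `https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/workers-ai/{model}` and bearer-token authentication; confirm both against the current AI Gateway documentation. `GatewayTarget`, `gatewayUrl`, and `buildGatewayRequest` are hypothetical names, and the account, gateway, and token values in the usage note are placeholders.

```typescript
interface GatewayTarget {
  accountId: string; // Cloudflare account ID (placeholder in examples)
  gatewayId: string; // name of the gateway created in the dashboard
  model: string;     // assumption: a Workers AI model ID such as "@cf/meta/llama-3.1-8b-instruct"
}

interface GatewayRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Assumed URL layout for the Workers AI provider behind AI Gateway.
function gatewayUrl(target: GatewayTarget): string {
  const base = "https://gateway.ai.cloudflare.com/v1";
  return `${base}/${target.accountId}/${target.gatewayId}/workers-ai/${target.model}`;
}

// Builds the URL and fetch options without sending anything, so the request
// can be inspected or logged before it goes out.
function buildGatewayRequest(
  target: GatewayTarget,
  apiToken: string,
  prompt: string,
): GatewayRequest {
  return {
    url: gatewayUrl(target),
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ prompt }),
    },
  };
}
```

A caller would pass the result to `fetch(req.url, req.init)`. Since every request shares the same gateway prefix, caching, rate limits, and analytics can then be applied in one place.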

Compliance & Certifications

This service is attested for the following frameworks. Always verify with the provider before relying on a specific compliance posture.

GDPR, SOC 2, ISO 27001, HIPAA, PCI DSS

Where this runs

29 regions
26 countries
Commercial regions (29)

Europe (10)

  • Paris
  • Frankfurt
  • Dublin
  • Milan
  • Amsterdam
  • Warsaw
  • Madrid
  • Stockholm
  • Zurich
  • London

North America (4)

  • Toronto
  • Ashburn
  • Chicago
  • San Jose

South America (2)

  • Buenos Aires
  • São Paulo

Asia (6)

  • Hong Kong
  • Mumbai
  • Tokyo
  • Singapore
  • Seoul
  • Taipei

Oceania (2)

  • Sydney
  • Auckland

Middle East (2)

  • Tel Aviv
  • Dubai

Africa (3)

  • Lagos
  • Cape Town
  • Johannesburg

Pricing

Pricing model: pay-per-request