Cloud Speech-to-Text

GCP · AI & ML · Free tier available

Automatic speech recognition with the Chirp universal speech model, supporting 125+ languages, real-time streaming and batch APIs, speaker diarisation, word-level timestamps, and phrase hints that bias recognition toward domain-specific vocabulary.


Jurisdictional exposure

Provider HQ

US · Mountain View, USA

Subject to CLOUD Act, FISA-702, DPF

Region locations

APAC, CN, EEA, EU, UK, US, Other: 44 regions across 7 jurisdictions

Sovereign option

Yes — 2 sovereign-flagged regions available


Attributes

SLA Uptime: 99.9%
Streaming: Yes
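
As a rough illustration of what a 99.9% uptime figure allows, the sketch below converts the percentage into a downtime budget. It assumes uptime is measured against wall-clock time over a fixed period; how the provider actually measures and credits downtime is set by its SLA terms, which this page does not reproduce. The function name is hypothetical.

```python
def downtime_budget_minutes(sla_percent: float, period_days: float) -> float:
    """Minutes of downtime permitted by an uptime percentage over a period."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (1 - sla_percent / 100)


# Assumption: a 30-day month and a 365-day year as the measurement periods.
monthly = downtime_budget_minutes(99.9, 30)
yearly = downtime_budget_minutes(99.9, 365)

print(f"{monthly:.1f} minutes per 30-day month")
print(f"{yearly / 60:.2f} hours per 365-day year")
```

Under those assumptions the budget works out to roughly 43 minutes per month, or a little under 9 hours per year.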

Sub-services (4)

Chirp Universal Model

2B-parameter speech model supporting 125+ languages with unified weights

Streaming Recognition

Low-latency bidirectional streaming API for real-time transcription
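
A streaming client typically sends audio in small, fixed-duration chunks rather than as one payload. The sketch below shows only that chunking step, using the standard library. The 16 kHz, 16-bit mono format and the 100 ms chunk length are assumptions chosen for illustration, and `chunk_pcm` is a hypothetical helper, not part of any Google client library. Sending the chunks over the streaming API is not shown.

```python
from typing import Iterator


def chunk_pcm(
    audio: bytes,
    sample_rate_hz: int = 16000,   # assumed sample rate
    bytes_per_sample: int = 2,     # assumed 16-bit mono PCM
    chunk_ms: int = 100,           # assumed chunk duration
) -> Iterator[bytes]:
    """Yield consecutive slices of raw PCM audio, each about chunk_ms long."""
    chunk_bytes = sample_rate_hz * bytes_per_sample * chunk_ms // 1000
    for start in range(0, len(audio), chunk_bytes):
        yield audio[start:start + chunk_bytes]


# One second of silence at the assumed format.
one_second = bytes(16000 * 2)
chunks = list(chunk_pcm(one_second))
print(len(chunks), "chunks of", len(chunks[0]), "bytes")
```

The final chunk is simply shorter when the audio length is not a multiple of the chunk size.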

Batch Recognition

Asynchronous long-form transcription for audio files up to 8 hours
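
Before submitting a file for batch transcription, it can be worth checking its duration against the 8-hour limit quoted above. The sketch below does this for WAV input with Python's standard `wave` module. The helper names are hypothetical, and other audio formats would need a different duration probe.

```python
import io
import wave

MAX_BATCH_SECONDS = 8 * 60 * 60  # the 8-hour limit quoted on this page


def wav_duration_seconds(fileobj) -> float:
    """Return the duration of a WAV file (path or file-like object) in seconds."""
    with wave.open(fileobj, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def fits_batch_limit(duration_seconds: float) -> bool:
    """True if the audio is short enough for the quoted batch limit."""
    return duration_seconds <= MAX_BATCH_SECONDS


# Demo input: two seconds of 16 kHz, 16-bit mono silence built in memory.
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(16000)
    out.writeframes(bytes(16000 * 2 * 2))
buf.seek(0)

duration = wav_duration_seconds(buf)
print(duration, fits_batch_limit(duration))
```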

Speech Adaptation

Phrase hints and boost values to bias recognition to domain terms
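
To make the idea of phrase hints concrete, the sketch below builds a recognition config as a plain dictionary. The field names (`languageCode`, `enableWordTimeOffsets`, `speechContexts`, `phrases`, `boost`) follow the v1 REST `RecognitionConfig` shape as best understood here; newer API versions structure adaptation differently, so check the current reference before relying on them. The helper name, the default boost value, and the example phrases are illustrative assumptions.

```python
import json
from typing import Iterable


def build_recognition_config(
    phrases: Iterable[str],
    boost: float = 10.0,            # assumed value; tune per the provider's guidance
    language_code: str = "en-US",   # assumed language
) -> dict:
    """Build a v1-style recognition config that biases toward the given phrases."""
    return {
        "languageCode": language_code,
        "enableWordTimeOffsets": True,  # request word-level timestamps
        "speechContexts": [
            {"phrases": list(phrases), "boost": boost},
        ],
    }


# Hypothetical domain terms used purely as an example.
config = build_recognition_config(["FluffyStack", "diarisation"])
print(json.dumps(config, indent=2))
```

Only the request body is constructed here; authentication and the actual API call are left out.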

Compliance & Certifications

This service is attested for the following frameworks. Always verify with the provider before relying on a specific compliance posture.

GDPR, SOC 2, ISO 27001, HIPAA, PCI DSS, FedRAMP, C5, TISAX, IRAP, ENS High, CCCS Medium, ISMAP, MTCS L3

Where this runs

44 regions

28 countries

2 sovereign

Sovereign regions (2)

T-Systems Sovereign Cloud · Frankfurt: T-Systems Sovereign Cloud powered by Google Cloud
S3NS Sovereign Cloud · Paris: S3NS, a Google Cloud + Thales joint venture

Commercial regions (42)

Europe (13)

Belgium
Finland
Paris
Berlin
Frankfurt
Milan
Turin
Netherlands
Warsaw
Madrid
Stockholm
Zurich
London

North America (12)

Montréal
Toronto
Querétaro
Northern Virginia
Columbus
Iowa
Dallas
Las Vegas
Los Angeles
South Carolina
Salt Lake City
Oregon

South America (2)

São Paulo
Santiago

Asia (9)

Hong Kong
Delhi
Mumbai
Jakarta
Osaka
Tokyo
Singapore
Seoul
Taiwan

Oceania (2)

Melbourne
Sydney

Middle East (3)

Tel Aviv
Doha
Dammam

Africa (1)

Johannesburg

Equivalent services on other platforms

Amazon Polly (AWS)

Neural text-to-speech service that converts text into lifelike speech in 40+ languages with dozens of voices, including expressive long-form, generative, and newscaster speaking styles plus SSML markup and phoneme control

Amazon Transcribe (AWS)

Automatic speech recognition service with real-time streaming and batch transcription across 100+ languages, speaker diarisation, custom vocabulary, PII redaction, and specialised Transcribe Medical and Transcribe Call Analytics flavours

Azure AI Services (Azure)

Pre-built AI APIs covering vision (Computer Vision, Custom Vision), speech (Speech-to-Text, Text-to-Speech, Translator), language (Language Understanding, Sentiment), and decision (Anomaly Detector, Content Moderator) — pay-per-call REST with Azure AD auth

Azure AI Speech (Azure)

Unified speech service covering real-time and batch speech-to-text in 100+ languages, neural text-to-speech with 600+ voices and custom cloning, speaker recognition, and speech translation — the speech pillar of the rebranded Azure AI Services family

Cloud Text-to-Speech (GCP)

Neural text-to-speech with 380+ voices in 50+ languages, including premium Journey voices, Studio voices for long-form narration, and Custom Voice for cloning an organisation's brand voice, plus full SSML and phoneme control

Pricing

Pricing model: pay-per-request