Cloud Speech-to-Text
GCP · AI & ML · Free tier available
Automatic speech recognition with the Chirp universal speech model, supporting 125+ languages, real-time streaming and batch APIs, speaker diarisation, word-level timestamps, and phrase hints that bias recognition toward domain-specific vocabulary.
Attributes
- SLA Uptime: 99.9%
- Streaming: Yes
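To put the 99.9% uptime figure in concrete terms, here is a minimal sketch that converts an uptime percentage into a downtime budget. It assumes a simple time-based reading over a 30-day month; the provider's actual SLA may define uptime differently (for example by error rate), so treat the result as a rough guide. The function name `allowed_downtime_minutes` is made up for this illustration.

```python
def allowed_downtime_minutes(uptime_percent, period_days=30):
    """Return the minutes of downtime permitted in the period.

    Assumption: uptime is measured as plain wall-clock availability,
    which may not match how the provider's SLA measures it.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


budget = allowed_downtime_minutes(99.9)
print(f"{budget:.1f} minutes of downtime per 30-day month")
```

Under these assumptions, 99.9% leaves a budget of roughly 43 minutes per month.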
Sub-services (4)
Chirp Universal Model
2B-parameter speech model supporting 125+ languages with unified weights
Streaming Recognition
Low-latency bidirectional streaming API for real-time transcription
Batch Recognition
Async long-form transcription for audio files up to 8 hours
Speech Adaptation
Phrase hints and boost values to bias recognition to domain terms
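The phrase hints and boost values described under Speech Adaptation can be sketched as a request body. This is a minimal illustration, assuming the JSON field names of the v1 REST `speech:recognize` method (`languageCode`, `speechContexts`, `phrases`, `boost`); Chirp and the v2 API may expect a different adaptation shape, so check the current API reference before relying on it. The bucket URI, the phrases, and the helper name `build_recognize_request` are hypothetical, and the sketch only builds the payload without calling the service.

```python
import json


def build_recognize_request(gcs_uri, phrases, boost, language_code="en-US"):
    """Build a JSON-serialisable body that biases recognition toward `phrases`.

    Assumption: boost must be positive; the accepted range should be
    confirmed against the provider's documentation.
    """
    if boost <= 0:
        raise ValueError("boost is assumed to be a positive number")
    return {
        "config": {
            "languageCode": language_code,
            # One speech context: the domain terms plus their boost value.
            "speechContexts": [{"phrases": list(phrases), "boost": boost}],
        },
        "audio": {"uri": gcs_uri},
    }


request = build_recognize_request(
    "gs://example-bucket/call.flac",  # hypothetical audio location
    ["tachycardia", "metoprolol"],    # hypothetical domain vocabulary
    boost=10.0,
)
print(json.dumps(request, indent=2))
```

Keeping the payload as a plain dictionary makes it easy to inspect before sending it with whichever HTTP client or SDK is in use.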
Compliance & Certifications
This service is attested for the following frameworks. Always verify with the provider before relying on a specific compliance posture.
Where this runs
Sovereign regions (2)
- T-Systems Sovereign Cloud · Frankfurt: T-Systems Sovereign Cloud powered by Google Cloud
- S3NS Sovereign Cloud · Paris: S3NS, a Google Cloud + Thales joint venture
Commercial regions (42)
Europe (13)
- Belgium
- Finland
- Paris
- Berlin
- Frankfurt
- Milan
- Turin
- Netherlands
- Warsaw
- Madrid
- Stockholm
- Zurich
- London
North America (12)
- Montréal
- Toronto
- Querétaro
- Northern Virginia
- Columbus
- Iowa
- Dallas
- Las Vegas
- Los Angeles
- South Carolina
- Salt Lake City
- Oregon
South America (2)
- São Paulo
- Santiago
Asia (9)
- Hong Kong
- Delhi
- Mumbai
- Jakarta
- Osaka
- Tokyo
- Singapore
- Seoul
- Taiwan
Oceania (2)
- Melbourne
- Sydney
Middle East (3)
- Tel Aviv
- Doha
- Dammam
Africa (1)
- Johannesburg
Equivalent services on other platforms
- Amazon Transcribe (AWS): Automatic speech recognition service with real-time streaming and batch transcription across 100+ languages, speaker diarisation, custom vocabulary, PII redaction, and specialised Transcribe Medical and Transcribe Call Analytics flavours
- Azure AI Speech (Azure): Unified speech service covering real-time and batch speech-to-text in 100+ languages, neural text-to-speech with 600+ voices and custom cloning, speaker recognition, and speech translation; it is the speech pillar of the rebranded Azure AI Services family