Google Speech-to-Text lets developers convert audio to text by applying neural network models through an easy-to-use API. The API recognizes more than 120 languages and variants to support a global user base. You can use it to enable voice command-and-control, transcribe audio from call centers, and more. It processes both real-time streaming and prerecorded audio using Google's machine learning technology.
Required Ruby Version
>= 2.4
Authors
Google LLC
Versions
- 2.2.0 June 11, 2026 (14.5 KB)
- 2.1.0 March 20, 2026 (14.5 KB)
- 2.0.4 September 12, 2025 (14.5 KB)
- 2.0.3 August 29, 2025 (14 KB)
- 2.0.2 May 27, 2025 (14 KB)
- 1.1.3 February 03, 2021 (17.5 KB)