Parakeet RNNT 1.1B

nvidia/parakeet-rnnt-1.1b

published Dec 2023 · updated Jun 2026

Parakeet RNNT 1.1B is an automatic speech recognition model that transcribes English speech into lower-case text using a FastConformer Transducer architecture with 1.1 billion parameters.

status: coming soon
API providers: 0
downloads / mo: 2.4K
license: cc-by-4.0

specs

Task: Automatic Speech Recognition (ASR)
Architecture: FastConformer Transducer (RNNT)
Parameters: 1.1B
Language: English
License: CC-BY-4.0

about this model

Architecture and Performance

The model is built on the FastConformer architecture, which is reported to be 2.8x faster than the original Conformer and supports transcription of long-form speech of up to 11 hours by using limited context attention with a global token. The paper describing the architecture was accepted at ASRU 2023.
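To make "limited context attention with a global token" concrete, here is a minimal sketch of the attention pattern in plain Python. It is an illustration only, not the FastConformer or NeMo implementation: the function name, the window sizes, and the choice of position 0 as the global token are all assumptions made for the example.

```python
def limited_context_mask(seq_len: int, left: int, right: int,
                         global_index: int = 0) -> list[list[bool]]:
    """Return mask[i][j], True when query position i may attend to key position j.

    Each position sees a local window of `left` frames behind it and `right`
    frames ahead of it. The position `global_index` (hypothetical choice) acts
    as a global token: it attends to every position and every position attends
    to it.
    """
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            in_window = -left <= j - i <= right
            is_global = i == global_index or j == global_index
            row.append(in_window or is_global)
        mask.append(row)
    return mask


def allowed_pairs(mask: list[list[bool]]) -> int:
    """Count the (query, key) pairs that the mask permits."""
    return sum(sum(row) for row in mask)


# Illustrative sizes only; the real model's context sizes are not given here.
mask = limited_context_mask(seq_len=8, left=2, right=2)
print(allowed_pairs(mask), "of", 8 * 8, "pairs are attended")
```

With a fixed window, the number of permitted pairs grows roughly linearly with sequence length instead of quadratically, which is what makes very long recordings tractable.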

It was trained on over 64,000 hours of English speech (40,000 hours from private data and 24,000 hours from public sources including LibriSpeech, Fisher, Switchboard, WSJ, VCTK, VoxPopuli, Europarl-ASR, MLS, Common Voice, and People’s Speech). The model uses a SentencePiece Unigram tokenizer (vocabulary size 1024).

Word Error Rate (WER) with greedy decoding on standard benchmarks:

Dataset                  WER (%)
LibriSpeech test-clean   1.46
GigaSpeech               9.96
SPGI Speech              2.47
TED-LIUM v3              3.11
VoxPopuli                3.92
Common Voice             5.79
Earnings-22              14.11
AMI                      17.10
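For readers unfamiliar with the metric, WER is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below shows that calculation in plain Python. It is not the scoring pipeline behind the figures above, which may apply additional text normalization; the function name is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substituted word out of six reference words.
print(round(word_error_rate("the cat sat on the mat", "the cat sat on a mat"), 2))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.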

Licensing

This model is released under the CC-BY-4.0 license.

FAQ

What is the model's architecture?

FastConformer Transducer (RNNT) with 1.1B parameters, designed for efficient streaming and long-form inference.

What input format does the model accept?

The model accepts 16 kHz mono-channel WAV audio files as input.
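A quick way to confirm that a file matches this format before submitting it is to read the WAV header. The sketch below uses only Python's standard `wave` module; the function name `check_wav_format` is a hypothetical helper, not part of any gigarouter or NVIDIA tooling.

```python
import wave


def check_wav_format(path: str, expected_rate: int = 16000,
                     expected_channels: int = 1) -> list[str]:
    """Return a list of format problems; an empty list means the file looks fine."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()

    problems = []
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != expected_channels:
        problems.append(f"file has {channels} channel(s), expected {expected_channels}")
    return problems
```

Calling `check_wav_format("meeting.wav")` on a 44.1 kHz stereo recording would report both mismatches, signalling that the audio needs resampling and downmixing first.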

What output does the model produce?

It outputs transcribed text as a string in lower-case English without punctuation.
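Because the output is lower-case and unpunctuated, reference transcripts usually need the same treatment before they are compared with it. The following is a minimal sketch of such a normalizer. Keeping apostrophes and digits is an assumption made for this example, and the function name is hypothetical.

```python
import re


def normalize_reference(text: str) -> str:
    """Lower-case the text and strip punctuation so it resembles the model's output style."""
    text = text.lower()
    # Replace anything that is not a letter, digit, apostrophe, or space.
    # Keeping apostrophes and digits is an assumption, not documented behavior.
    text = re.sub(r"[^a-z0-9' ]+", " ", text)
    # Collapse the runs of whitespace left behind by removed punctuation.
    return " ".join(text.split())


print(normalize_reference("Hello, World! It's 9 AM."))  # hello world it's 9 am
```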

What is the license for this model?

It is released under the CC-BY-4.0 license.

How can I use this model via the gigarouter API?

Once the model is live, call the OpenAI-compatible endpoint on gigarouter with an API key; no local installation is required.
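As a rough idea of what such a call could look like, the sketch below builds (but does not send) a multipart transcription request using only the standard library. The model is not yet live, so everything specific here is an assumption: the base URL is a placeholder, and the `/audio/transcriptions` path, the `file` and `model` form fields, and the model id simply follow the usual OpenAI-style convention.

```python
import urllib.request
import uuid

# Placeholder values: this page does not give a real base URL or API model id.
BASE_URL = "https://api.example.invalid/v1"
MODEL_ID = "nvidia/parakeet-rnnt-1.1b"


def build_transcription_request(wav_bytes: bytes, api_key: str,
                                base_url: str = BASE_URL,
                                model: str = MODEL_ID) -> urllib.request.Request:
    """Build an OpenAI-style multipart POST carrying the audio and the model name."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="file"; filename="audio.wav"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")

    return urllib.request.Request(
        url=f"{base_url}/audio/transcriptions",  # assumed path
        data=head + wav_bytes + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out because the real endpoint details are not yet published.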

not yet live

We're benchmarking and onboarding Parakeet RNNT 1.1B as a hosted, OpenAI-compatible API.
