
wav2vec2-large-xlsr-53-telugu

anuragshas/wav2vec2-large-xlsr-53-telugu

A popular open speech-to-text model, with 2.8M downloads a month. gigarouter is benchmarking it and plans to host it as an OpenAI-compatible API.

status: coming soon
API providers: 0
downloads / mo: 2.8M
license: apache-2.0

about this model

anuragshas/wav2vec2-large-xlsr-53-telugu is an automatic speech recognition (ASR) model fine-tuned for Telugu. It is based on Facebook’s XLSR-53 multilingual speech representation model and further trained on 70% of the OpenSLR SLR66 Telugu dataset. The model accepts 16 kHz mono audio input and produces transcribed text.

Key strengths

  • Designed specifically for Telugu ASR, leveraging cross-lingual pretraining from 53 languages.
  • Trained and evaluated on a standardized open dataset (OpenSLR SLR66), enabling reproducible comparison.
  • Usable directly, without a separate language model: transcripts come from greedy decoding of the CTC head's output (the highest-scoring token per audio frame).
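
To make the greedy decoding point concrete, here is a minimal sketch of greedy CTC decoding in plain Python. It is an illustration of the general technique, not the model's own code: the three-symbol vocabulary, the blank id of 0, and the frame scores are made-up assumptions.

```python
from typing import List, Optional, Sequence


def greedy_ctc_decode(
    frame_scores: Sequence[Sequence[float]],
    vocab: Sequence[str],
    blank_id: int = 0,  # assumption: index 0 is the CTC blank token
) -> str:
    """Pick the best token per frame, merge repeats, drop blanks."""
    # Step 1: argmax over the vocabulary for every time frame.
    best_ids = [max(range(len(row)), key=row.__getitem__) for row in frame_scores]

    # Step 2: keep a token only when it differs from the previous frame's
    # token and is not the blank. A blank between two identical tokens is
    # what lets CTC emit a genuinely doubled character.
    out: List[str] = []
    prev: Optional[int] = None
    for idx in best_ids:
        if idx != prev and idx != blank_id:
            out.append(vocab[idx])
        prev = idx
    return "".join(out)


# Hypothetical toy vocabulary and per-frame scores (not the real model's).
toy_vocab = ["_", "a", "b"]
toy_frames = [
    [0.1, 0.8, 0.1],    # a
    [0.2, 0.7, 0.1],    # a (repeat, merged)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.8, 0.1],    # a (new, because a blank came before it)
    [0.1, 0.2, 0.7],    # b
    [0.1, 0.1, 0.8],    # b (repeat, merged)
    [0.9, 0.05, 0.05],  # blank
]
print(greedy_ctc_decode(toy_frames, toy_vocab))  # prints "aab"
```

Greedy decoding needs no extra components. The trade-off is that it cannot use word-level context the way a decoder backed by a language model can.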

Best for

  • Transcribing Telugu speech in applications such as media captioning, voice commands, and conversational analytics.
  • Scenarios where a dedicated Telugu model is preferred over general multilingual alternatives.

Benchmark result

The model achieves a Word Error Rate (WER) of 44.98% on the test split of OpenSLR SLR66 Telugu.
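
For readers unfamiliar with the metric: WER is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. A WER of 44.98% therefore means roughly 45 word errors per 100 reference words. The sketch below shows the standard calculation on a made-up English sentence pair. It is not the evaluation script behind the reported figure.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so
    # far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences only: one substitution plus one deletion over six
# reference words gives 2 / 6.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{wer:.2%}")  # prints "33.33%"
```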

Additional details

  • Fine-tuned on Telugu only; not intended for other languages.
  • Input audio should be sampled at 16 kHz; resample recordings at other rates before inference.
  • Text normalization (removal of punctuation, English characters, etc.) was applied during evaluation; similar preprocessing is recommended for production use.
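
The normalization step can be sketched as below. This is an assumed rule set, not the exact rules used in the evaluation: it keeps only characters in the Unicode Telugu block (U+0C00 to U+0C7F) plus spaces. That removes punctuation and English letters, but it also drops ASCII digits and zero-width joiners, which may or may not match the original preprocessing.

```python
import re

# Anything that is neither in the Telugu block nor whitespace.
_NON_TELUGU = re.compile(r"[^\u0C00-\u0C7F\s]")
_WHITESPACE = re.compile(r"\s+")


def normalize_telugu(text: str) -> str:
    """Keep Telugu characters and single spaces; drop everything else."""
    # Replace with a space rather than nothing, so that words separated
    # only by punctuation do not get glued together.
    text = _NON_TELUGU.sub(" ", text)
    return _WHITESPACE.sub(" ", text).strip()


print(normalize_telugu("నమస్తే, world! ఎలా ఉన్నారు?"))  # prints "నమస్తే ఎలా ఉన్నారు"
```

Applying the same function to both reference text and model output keeps the comparison fair when measuring WER on your own data.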

not yet live

We're benchmarking and onboarding wav2vec2-large-xlsr-53-telugu as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.