
wav2vec2-large-xlsr-53-persian

jonatasgrosman/wav2vec2-large-xlsr-53-persian

A popular open speech-to-text model for Persian, with 2.5M downloads a month. gigarouter is benchmarking it and plans to host it as an OpenAI-compatible API.

status: coming soon
API providers: 0
downloads / mo: 2.5M
license: apache-2.0

about this model

jonatasgrosman/wav2vec2-large-xlsr-53-persian is an automatic speech recognition (ASR) model fine-tuned from Facebook’s wav2vec2-large-xlsr-53 on Persian speech. It is intended for transcribing Persian-language audio sampled at 16 kHz, and it reaches a 30.12% Word Error Rate (WER) and a 7.37% Character Error Rate (CER) on the Persian test set of Common Voice 6.1.
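Because the model expects 16 kHz input, it can help to check a recording's sampling rate before sending it for transcription. The following is a minimal sketch using only Python's standard `wave` module; the function names and the `TARGET_RATE_HZ` constant are illustrative and not part of any gigarouter or model API, and it assumes uncompressed PCM WAV files.

```python
import wave

# Sampling rate the model description says the audio should have.
TARGET_RATE_HZ = 16_000


def wav_sample_rate(path: str) -> int:
    """Return the sampling rate (in Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path: str, target_rate_hz: int = TARGET_RATE_HZ) -> bool:
    """Return True when the file's sampling rate differs from the target."""
    return wav_sample_rate(path) != target_rate_hz
```

A file for which `needs_resampling` returns True would have to be converted to 16 kHz with an audio tool of your choice before transcription; the sketch only detects the mismatch and does not resample.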

Key strengths

  • Fine-tuned exclusively on Persian data (train and validation splits of Common Voice 6.1).
  • Requires no external language model for inference; can be used directly for transcription.
  • Delivers a Character Error Rate (CER) of 7.37% on the Common Voice 6.1 Persian test set, meaning roughly 7 character-level edits per 100 reference characters.
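Transcribing without an external language model usually means greedy decoding of the model's per-frame outputs. wav2vec2 fine-tunes of this kind are trained with a CTC objective, so a plausible reading of "direct transcription" is: pick the highest-scoring token in each frame, merge consecutive repeats, then drop the blank token. The sketch below illustrates that idea on toy data. The vocabulary, the blank index, and the `one_hot` helper are assumptions made for illustration; the real model's vocabulary and blank/padding token come from its tokenizer configuration, and word-delimiter handling is omitted.

```python
from itertools import groupby


def greedy_ctc_decode(frame_scores, vocab, blank_id=0):
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # 1. Index of the highest-scoring token in every frame.
    best_ids = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # 2. Merge runs of the same token into a single occurrence.
    collapsed = [token_id for token_id, _ in groupby(best_ids)]
    # 3. Remove the blank token and map the remaining ids to characters.
    return "".join(vocab[i] for i in collapsed if i != blank_id)


def one_hot(index, size):
    """Toy score vector with all the mass on one token (illustration only)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Hypothetical five-token vocabulary; index 0 plays the role of the CTC blank.
vocab = ["<blank>", "س", "ل", "ا", "م"]
frame_ids = [1, 1, 0, 2, 3, 3, 0, 4]
frames = [one_hot(i, len(vocab)) for i in frame_ids]

text = greedy_ctc_decode(frames, vocab)
print(text)  # سلام
```

The blank token is what lets the output contain a genuinely doubled character: two identical tokens separated by a blank survive the merge step, while an unbroken run collapses to one.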

Benchmark results

The following table reports Word Error Rate (WER) and Character Error Rate (CER) on the Persian test set of Common Voice, evaluated on 2021-04-22.

Model                                           WER      CER
jonatasgrosman/wav2vec2-large-xlsr-53-persian   30.12%   7.37%
m3hrdadfi/wav2vec2-large-xlsr-persian-v2        33.85%   8.79%
m3hrdadfi/wav2vec2-large-xlsr-persian           34.37%   8.98%
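For readers unfamiliar with the two metrics, both are edit distances normalized by the length of the reference: WER counts word-level insertions, deletions, and substitutions, while CER counts the same operations on characters. The sketch below shows the standard definitions in plain Python. It is not the evaluation script used to produce the numbers above, which may apply text normalization that this sketch omits; the function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, ref_item in enumerate(ref, start=1):
        curr = [i]  # deleting i reference items yields an empty hypothesis
        for j, hyp_item in enumerate(hyp, start=1):
            cost = 0 if ref_item == hyp_item else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

A single wrong character makes its whole word count as an error, which is why the WER figures in the table are several times higher than the corresponding CER figures.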

Best for

Persian speech-to-text applications where low character error and direct transcription (without a language model) are priorities. Once it is live, the model will be available on gigarouter as a managed, OpenAI-compatible API, with no installation or local dependencies required.

not yet live

We're benchmarking and onboarding wav2vec2-large-xlsr-53-persian as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.