
wav2vec2-large-xlsr-53-arabic

jonatasgrosman/wav2vec2-large-xlsr-53-arabic

An open Arabic speech-to-text model with about 3.5M downloads a month. gigarouter is benchmarking it and plans to host it as an OpenAI-compatible API; it is not yet live.

status

coming soon

downloads / mo

3.5M

license

apache-2.0

about this model

jonatasgrosman/wav2vec2-large-xlsr-53-arabic is an automatic speech recognition (ASR) model for Arabic that transcribes spoken audio into text. It is a fine-tuned version of Facebook’s wav2vec2-large-xlsr-53, trained on the train and validation splits of Common Voice 6.1 and the Arabic Speech Corpus. Input audio must be sampled at 16 kHz.
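
Since the model expects 16 kHz input, it can be worth checking a recording's sample rate before sending it for transcription. The snippet below is a minimal sketch using only Python's standard `wave` module. The helper name `is_16khz_wav` is hypothetical (it is not part of the model or of any gigarouter API), and it only handles uncompressed WAV files; other formats would need a separate decoder and, if the rate differs, a resampling step.

```python
import wave

TARGET_SAMPLE_RATE = 16_000  # Hz, the input rate the model description requires


def is_16khz_wav(source) -> bool:
    """Return True if the WAV audio in `source` is sampled at 16 kHz.

    `source` may be a file path or a binary file-like object;
    both are accepted by wave.open().
    """
    with wave.open(source, "rb") as wav_file:
        return wav_file.getframerate() == TARGET_SAMPLE_RATE
```

A file that fails this check should be resampled to 16 kHz before transcription rather than passed through unchanged.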

Key Strengths

Of the seven publicly available Arabic ASR models compared below, this model has the lowest Word Error Rate (WER) and Character Error Rate (CER) on the Common Voice Arabic test set; lower is better for both metrics. The evaluation was run on 2021-05-14.

Model	WER	CER
jonatasgrosman/wav2vec2-large-xlsr-53-arabic	39.59%	18.18%
bakrianoo/sinai-voice-ar-stt	45.30%	21.84%
othrif/wav2vec2-large-xlsr-arabic	45.93%	20.51%
kmfoda/wav2vec2-large-xlsr-arabic	54.14%	26.07%
mohammed/wav2vec2-large-xlsr-arabic	56.11%	26.79%
anas/wav2vec2-large-xlsr-arabic	62.02%	27.09%
elgeish/wav2vec2-large-xlsr-53-arabic	100.00%	100.56%
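
For readers unfamiliar with the two metrics, here is a minimal sketch of how WER and CER are conventionally defined: the edit distance (substitutions, insertions, deletions) between reference and hypothesis, divided by the reference length, counted in words for WER and in characters for CER. The function names are illustrative, and this is not the evaluation script that produced the table above; that script may normalize text differently (punctuation, diacritics, whitespace), so the numbers it yields need not match this sketch exactly.

```python
def edit_distance(reference, hypothesis) -> int:
    """Levenshtein distance between two sequences (lists of words or strings)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            cost = 0 if ref_item == hyp_item else 1
            current.append(min(
                previous[j] + 1,         # deletion: reference item has no match
                current[j - 1] + 1,      # insertion: extra hypothesis item
                previous[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        previous = current
    return previous[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance / reference length.

    Assumption: spaces count as characters here.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Because insertions count as errors, both rates can exceed 100% when a hypothesis is much longer than its reference: `wer("a b", "a b c d e")` is 3 insertions over 2 reference words, or 1.5. That is how a value such as the 100.56% CER in the last row can arise.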

Best For

This model suits Arabic speech-to-text pipelines that need the lowest error rate among the open models compared above. It handles Modern Standard Arabic and the dialects represented in Common Voice and the Arabic Speech Corpus. Once the model is live, gigarouter will host it as a managed OpenAI-compatible API, so you will not need to load the model or set up infrastructure yourself.
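
To give a sense of what "OpenAI-compatible" usually means for speech-to-text, the sketch below builds (but does not send) a request in the shape of the OpenAI audio transcription endpoint: a multipart POST to `/audio/transcriptions` with `model` and `file` fields and a bearer token. Everything specific to gigarouter here is an assumption: the base URL is a placeholder, the helper name is hypothetical, and the model is not yet live, so the final endpoint, model identifier, and authentication details may differ.

```python
import urllib.request
import uuid

# Placeholder only: the real base URL is not published on this page.
BASE_URL = "https://api.example.invalid/v1"
# Assumption: the hosted model is addressed by its Hugging Face identifier.
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-arabic"


def build_transcription_request(audio_bytes: bytes, filename: str, api_key: str,
                                base_url: str = BASE_URL,
                                model: str = MODEL_ID) -> urllib.request.Request:
    """Build a multipart/form-data POST in the OpenAI transcription format.

    The request is returned without being sent.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/audio/transcriptions",
        data=head + audio_bytes + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

With a real base URL and key, the returned object could be passed to `urllib.request.urlopen`; in the OpenAI format the JSON response carries the transcript in a `text` field.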

Citation

Jonatas Grosman. Fine-tuned XLSR-53 large model for speech recognition in Arabic. 2021. Available at https://huggingface.co/jonatasgrosman/wav2vec2-large-xlsr-53-arabic.

not yet live

We're benchmarking and onboarding wav2vec2-large-xlsr-53-arabic as a hosted, OpenAI-compatible API.