wav2vec2-large-xlsr-53-greek
jonatasgrosman/wav2vec2-large-xlsr-53-greek
A popular open speech-to-text model, with 3.9M downloads a month. gigarouter benchmarks and hosts it as an OpenAI-compatible API.
About this model
jonatasgrosman/wav2vec2-large-xlsr-53-greek is an automatic speech recognition (ASR) model for Greek, fine-tuned from Facebook's wav2vec2-large-xlsr-53 model. It was trained on the train and validation splits of Common Voice 6.1 and on CSS10, and it expects speech input sampled at 16 kHz.
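Because the model expects 16 kHz input, it can help to check a recording's sampling rate before sending it for transcription. The following is a minimal sketch using only Python's standard `wave` module. The helper names `wav_sample_rate` and `needs_resampling` are illustrative, not part of any library, and the sketch assumes the audio is an uncompressed WAV file.

```python
import wave

TARGET_RATE_HZ = 16_000  # sampling rate the model description says it expects


def wav_sample_rate(path: str) -> int:
    """Return the sampling rate stored in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path: str, target_rate_hz: int = TARGET_RATE_HZ) -> bool:
    """Return True when the file's rate differs from the expected rate."""
    return wav_sample_rate(path) != target_rate_hz
```

Files that are flagged need to be resampled before inference. One common option is `librosa.load(path, sr=16_000)`, which resamples while loading.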
Key capabilities
- Transcribes spoken Greek, building on the cross-lingual pre-training of XLSR-53.
- Can be used directly, without an external language model.
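Using a wav2vec2 model without an external language model usually means greedy CTC decoding: take the most likely token for each audio frame, collapse consecutive repeats, and drop the blank token. The sketch below illustrates that idea in plain Python. The toy vocabulary and frame IDs are made up for illustration, and the sketch assumes the wav2vec2 convention of `<pad>` as the CTC blank and `|` as the word delimiter.

```python
from itertools import groupby
from typing import Dict, Sequence


def greedy_ctc_decode(
    frame_ids: Sequence[int],
    id_to_token: Dict[int, str],
    blank_id: int,
    word_delimiter: str = "|",
) -> str:
    """Decode per-frame argmax token IDs into text without a language model."""
    # Step 1: collapse runs of identical consecutive IDs into a single ID.
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: drop the CTC blank token.
    tokens = [id_to_token[i] for i in collapsed if i != blank_id]
    # Step 3: turn word delimiters into spaces and normalize whitespace.
    text = "".join(tokens).replace(word_delimiter, " ")
    return " ".join(text.split())


# Hypothetical toy vocabulary; the real model ships its own vocabulary file.
TOY_VOCAB = {0: "<pad>", 1: "|", 2: "γ", 3: "ε", 4: "ι", 5: "α", 6: "σ", 7: "ο", 8: "υ"}

# Made-up argmax IDs, one per audio frame.
frames = [2, 2, 0, 3, 4, 4, 0, 5, 1, 1, 6, 0, 7, 7, 8]
print(greedy_ctc_decode(frames, TOY_VOCAB, blank_id=0))  # prints: γεια σου
```

A blank between two identical tokens keeps both, which is how CTC represents doubled letters.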
Benchmark performance
On the Common Voice 6.1 Greek test set, the model achieves a Word Error Rate (WER) of 11.62% and a Character Error Rate (CER) of 3.36%. The following table compares it with other Greek ASR models evaluated with the same script (lower is better on both metrics; this model ranks second of the four):
| Model | WER | CER |
|---|---|---|
| lighteternal/wav2vec2-large-xlsr-53-greek | 10.13% | 2.66% |
| jonatasgrosman/wav2vec2-large-xlsr-53-greek | 11.62% | 3.36% |
| vasilis/wav2vec2-large-xlsr-53-greek | 19.09% | 5.88% |
| PereLluis13/wav2vec2-large-xlsr-53-greek | 20.16% | 5.71% |
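For readers unfamiliar with the two metrics, WER and CER are both edit distances normalized by the length of the reference transcript, counted in words and in characters respectively. The following is a minimal sketch, not the evaluation script behind the table. It assumes whitespace tokenization, counts spaces as characters for CER, and applies no text normalization, any of which may differ from the actual script.

```python
from typing import Sequence


def edit_distance(reference: Sequence, hypothesis: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_item != hyp_item)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, with the reference "καλημέρα σε όλους" and the hypothesis "καλημέρα σε όλου", one of three words is wrong (WER of 1/3), while only one of 17 characters is missing (CER of 1/17). This is why CER is typically much lower than WER. Corpus-level scores are usually computed from total errors over total reference length rather than by averaging per-utterance rates.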
Best use cases
This model is well suited to transcribing Greek speech in applications such as voice-to-text, media captioning, and conversational AI. The benchmark figures above describe accuracy only; latency for real-time use depends on the hosting setup.
We're benchmarking and onboarding wav2vec2-large-xlsr-53-greek as a hosted, OpenAI-compatible API.
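As a rough illustration of what "OpenAI-compatible" could look like for a speech-to-text model, the sketch below builds (but does not send) a request shaped like OpenAI's audio transcription endpoint: a multipart `POST` to `/audio/transcriptions` with `model` and `file` fields. `BASE_URL` is a placeholder, the API key is made up, and `build_transcription_request` is a hypothetical helper. Whether the hosted service accepts exactly this shape is an assumption.

```python
import urllib.request
import uuid

BASE_URL = "https://api.example.com/v1"  # placeholder, not a real endpoint
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-greek"


def build_transcription_request(
    audio_bytes: bytes,
    filename: str,
    api_key: str,
    base_url: str = BASE_URL,
    model: str = MODEL_ID,
) -> urllib.request.Request:
    """Build a multipart/form-data POST in the style of OpenAI's transcription API."""
    boundary = uuid.uuid4().hex
    # Form field carrying the model identifier.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode("utf-8")
    # Form field header for the audio file; the raw bytes follow it.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = model_part + file_header + audio_bytes + closing
    return urllib.request.Request(
        url=f"{base_url}/audio/transcriptions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Sending the request would be a call to `urllib.request.urlopen(request)`. OpenAI's own endpoint returns JSON with a `text` field by default, so a compatible service would presumably do the same.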