wav2vec2-xls-r-300m-bengali
arijitx/wav2vec2-xls-r-300m-bengali
A popular open speech-to-text model, with 1.4M downloads a month. gigarouter benchmarks and hosts it as an OpenAI-compatible API.
About this model
Overview
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m for Bengali automatic speech recognition (ASR). It was trained on the OpenSLR SLR53 Bengali dataset and is hosted on gigarouter as a managed, OpenAI-compatible API. It transcribes Bengali speech on its own, and decoding with an external 5-gram language model lowers the word error rate from 0.2173 to 0.1532 in the evaluation reported under Performance.
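As a rough illustration of what "OpenAI-compatible" implies for a client, the sketch below builds a multipart POST to an `/audio/transcriptions` route in the style of the OpenAI transcription API, using only the Python standard library. The base URL is a placeholder, and addressing the model by its repository ID is an assumption; neither is confirmed by this page. The helper name `build_transcription_request` is hypothetical.

```python
import uuid
import urllib.request

# Assumption: placeholder base URL. The real gigarouter endpoint is not stated here.
BASE_URL = "https://api.gigarouter.example/v1"
# Assumption: the hosted model is addressed by its repository ID.
MODEL_ID = "arijitx/wav2vec2-xls-r-300m-bengali"


def build_transcription_request(audio_bytes, filename, api_key,
                                base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) an OpenAI-style multipart transcription request."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        # Form field "model": which hosted model should transcribe the audio.
        f"--{boundary}\r\n".encode(),
        b'Content-Disposition: form-data; name="model"\r\n\r\n',
        model.encode() + b"\r\n",
        # Form field "file": the raw audio payload.
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        audio_bytes + b"\r\n",
        # Closing boundary.
        f"--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url=f"{base_url}/audio/transcriptions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and parse the JSON response; OpenAI-style transcription endpoints typically return the transcript in a `text` field, though the exact response shape of this service is not documented here.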
Performance
Evaluation was conducted on a held-out set of Bengali speech samples. The table reports word error rate (WER) and character error rate (CER) as fractions, so 0.2173 corresponds to 21.73%; lower is better.
| Condition | WER | CER |
|---|---|---|
| Without language model | 0.2173 | 0.0473 |
| With 5-gram language model (trained on 30M sentences from AI4Bharat IndicCorp) | 0.1532 | 0.0341 |
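For readers who want to reproduce this kind of number, the following is a minimal sketch of how WER and CER are conventionally computed: the edit distance between reference and hypothesis, divided by the reference length. The text normalization used for the figures above is not stated, so this is a generic definition rather than the exact evaluation script. The function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(before, after):
    """Fraction by which an error rate dropped, relative to the starting value."""
    return (before - after) / before


# Relative improvement from adding the 5-gram language model (values from the table).
wer_gain = relative_reduction(0.2173, 0.1532)  # about 0.295
cer_gain = relative_reduction(0.0473, 0.0341)  # about 0.279
```

Applied to the table, the language model cuts WER by roughly 29.5% and CER by roughly 27.9% in relative terms.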