wav2vec2-large-xlsr-53-polish
jonatasgrosman/wav2vec2-large-xlsr-53-polish
A popular open speech-to-text model, with 4.7M downloads a month. gigarouter benchmarks and hosts it as an OpenAI-compatible API.
About This Model
jonatasgrosman/wav2vec2-large-xlsr-53-polish is an automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-large-xlsr-53 for the Polish language.
Key Strengths
- Fine-tuned on the train and validation splits of Common Voice 6.1.
- Can be used directly without a language model for decoding.
- Requires input speech sampled at 16 kHz.
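The points above can be sketched as a short local-inference example. This is a minimal sketch, not a definitive implementation: it assumes the `transformers`, `torch`, and `librosa` packages are installed, and the names `transcribe`, `MODEL_ID`, and `TARGET_SAMPLE_RATE` are illustrative choices, not part of the model's API. Audio is resampled to 16 kHz on load, and the transcript is obtained by greedy (argmax) CTC decoding, with no language model involved.

```python
# Illustrative constants (names are assumptions made for this sketch).
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-polish"
TARGET_SAMPLE_RATE = 16_000  # the model expects 16 kHz input speech


def transcribe(paths):
    """Transcribe audio files using greedy CTC decoding (no language model)."""
    # Heavy dependencies are imported lazily, so importing this module
    # does not require them to be installed.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # librosa.load resamples to the requested rate while reading the file.
    waveforms = [librosa.load(path, sr=TARGET_SAMPLE_RATE)[0] for path in paths]

    # Pad the batch to a common length and build the attention mask.
    inputs = processor(
        waveforms,
        sampling_rate=TARGET_SAMPLE_RATE,
        return_tensors="pt",
        padding=True,
    )

    with torch.no_grad():
        logits = model(
            inputs.input_values, attention_mask=inputs.attention_mask
        ).logits

    # Greedy decoding: take the most likely token at every frame; the
    # processor then collapses repeats and removes CTC blank tokens.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)
```

Calling `transcribe(["sample.wav"])` (with a hypothetical file path) would return a list with one transcript string per input file. The first call downloads the model weights from the Hugging Face Hub.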
Best For
Polish speech-to-text applications where a dedicated, fine-tuned wav2vec2 model is desired. Suitable for general dictation, transcription of read speech, and similar tasks within the Common Voice domain.
Model citation (BibTeX):
@misc{grosman2021xlsr53-large-polish,
title={Fine-tuned {XLSR}-53 large model for speech recognition in {P}olish},
author={Grosman, Jonatas},
howpublished={\url{https://huggingface.co/jonatasgrosman/wav2vec2-large-xlsr-53-polish}},
year={2021}
}
We're benchmarking and onboarding wav2vec2-large-xlsr-53-polish as a hosted, OpenAI-compatible API.