
wav2vec2-large-xlsr-53-polish

jonatasgrosman/wav2vec2-large-xlsr-53-polish

A popular open speech-to-text model with 4.7M downloads a month. gigarouter is benchmarking and onboarding it as a hosted, OpenAI-compatible API; it is not yet live.

status

coming soon

API providers

downloads / mo

4.7M

license

apache-2.0

about this model

jonatasgrosman/wav2vec2-large-xlsr-53-polish is an automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-large-xlsr-53 for the Polish language.

Key Strengths

Fine-tuned on the Polish train and validation splits of Common Voice 6.1.
Can be used directly, without a language model for decoding.
Requires input speech sampled at 16 kHz, so audio recorded at other rates must be resampled first.
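
The last two points above can be illustrated with a small sketch. It is not the model's own code: it assumes the model emits per-frame scores over a character vocabulary with a CTC blank token (as wav2vec2 CTC models do), and it uses a toy vocabulary and hand-written scores. The names `resample_linear` and `greedy_ctc_decode` are hypothetical helpers. The resampler is plain linear interpolation with no anti-aliasing filter, so treat it as a demonstration of the 16 kHz requirement rather than a production resampler.

```python
from itertools import groupby

TARGET_RATE = 16_000  # sample rate the model card says the input must have


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Resample a mono signal by linear interpolation (illustrative only)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # distance between output samples, in input samples
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), len(samples) - 1)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def greedy_ctc_decode(frame_scores, vocab, blank_id=0):
    """Decode without a language model: argmax per frame,
    collapse consecutive repeats, then drop blank tokens."""
    best_ids = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    collapsed = [token_id for token_id, _ in groupby(best_ids)]
    return "".join(vocab[i] for i in collapsed if i != blank_id)


# Toy example (assumed vocabulary and scores, not real model output).
toy_vocab = ["<blank>", "t", "a", "k"]
toy_scores = [
    [0.1, 0.9, 0.0, 0.0],  # t
    [0.1, 0.8, 0.0, 0.1],  # t (repeat, collapsed)
    [0.9, 0.0, 0.1, 0.0],  # blank
    [0.0, 0.0, 0.9, 0.1],  # a
    [0.0, 0.1, 0.8, 0.1],  # a (repeat, collapsed)
    [0.1, 0.0, 0.0, 0.9],  # k
]
transcript = greedy_ctc_decode(toy_scores, toy_vocab)
```

Greedy decoding of this kind is what "without a language model" amounts to in practice: each frame is decided on its own, with no rescoring of word sequences.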

Best For

Polish speech-to-text applications where a dedicated, fine-tuned wav2vec2 model is desired. Suitable for general dictation, transcription of read speech, and similar tasks within the Common Voice domain.

Model citation (BibTeX):

@misc{grosman2021xlsr53-large-polish,
  title={Fine-tuned {XLSR}-53 large model for speech recognition in {P}olish},
  author={Grosman, Jonatas},
  howpublished={\url{https://huggingface.co/jonatasgrosman/wav2vec2-large-xlsr-53-polish}},
  year={2021}
}

not yet live

We're benchmarking and onboarding wav2vec2-large-xlsr-53-polish as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
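
Since the hosted API is not live, its actual endpoint and model identifier are unknown. The sketch below only shows what a request might look like if the service follows the usual OpenAI convention of a multipart POST to `/audio/transcriptions` with `model` and `file` fields. The base URL is a placeholder, the model identifier is assumed to match the repository name, and `build_transcription_request` is a hypothetical helper. Nothing is sent over the network.

```python
import uuid

# Assumptions: placeholder base URL and a model id equal to the repository name.
BASE_URL = "https://api.example.com/v1"
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-polish"


def build_transcription_request(audio_bytes, filename, api_key, model=MODEL_ID):
    """Return (url, headers, body) for a multipart/form-data transcription request."""
    boundary = uuid.uuid4().hex
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode("utf-8")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    closing = f"--{boundary}--\r\n".encode("utf-8")
    body = model_part + file_header + audio_bytes + b"\r\n" + closing
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return f"{BASE_URL}/audio/transcriptions", headers, body


# Build (but do not send) a request for a dummy payload.
url, headers, body = build_transcription_request(b"RIFFdemo", "sample.wav", "YOUR_API_KEY")
```

Once a real endpoint is published, the tuple could be passed to `urllib.request.Request(url, data=body, headers=headers, method="POST")`, with the placeholder values replaced by the documented ones.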