romanian-wav2vec2

gigant/romanian-wav2vec2

A popular open speech-to-text model with 2.8M downloads a month. gigarouter is benchmarking it and plans to host it as an OpenAI-compatible API.

status: coming soon
API providers: 0
downloads / mo: 2.8M
license: apache-2.0

about this model

gigant/romanian-wav2vec2 is an automatic speech recognition (ASR) model for Romanian, fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice 8.0 Romanian dataset with additional data from Romanian Speech Synthesis.

The model ranked first for Romanian speech recognition in the Hugging Face Robust Speech Challenge (Speech Bench and Challenge Leaderboard). It includes a 5-gram language model, built with pyctcdecode and kenlm and trained on the Romanian Corpora Parliament dataset.

Key Benchmarks

On the Common Voice 8.0 Romanian test split, decoded without the 5-gram language model:

  • Word Error Rate (WER): 0.1174
  • Character Error Rate (CER): 0.0294
  • Loss: 0.1553
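
For readers unfamiliar with these metrics, the sketch below shows one common way to compute WER and CER: the edit distance between hypothesis and reference, divided by the reference length in words or characters. It is an illustration only, not the evaluation script behind the numbers above. The function names and the sample sentences are hypothetical, and conventions differ (for example, whether spaces count as characters in CER, and whether errors are pooled over the whole test set before dividing).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from the reference
                            curr[j - 1] + 1,      # insertion into the reference
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edits divided by reference length.
    Spaces are counted as characters here, which is an assumption."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one missing diacritic in a three-word reference.
reference = "bună ziua tuturor"
hypothesis = "buna ziua tuturor"
print(wer(reference, hypothesis))  # one wrong word out of three
print(cer(reference, hypothesis))  # one wrong character out of seventeen
```

Read this way, a WER of 0.1174 corresponds to roughly 12 word-level edits per 100 reference words.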

Intended Use

Best suited for Romanian speech recognition from audio sampled at 16 kHz. Output is lowercase text without punctuation. Once live, the hosted OpenAI-compatible API on gigarouter will require no local installation or dependency management.
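
Because the model expects 16 kHz audio, it can help to check a file's sample rate before sending it for transcription. The following is a minimal sketch using only Python's standard `wave` module. It handles uncompressed WAV data only, and the helper names (`wav_sample_rate`, `needs_resampling`, `make_silent_wav`) are hypothetical, not part of any gigarouter or model API.

```python
import io
import wave

EXPECTED_RATE = 16_000  # sample rate stated in the model description


def wav_sample_rate(data: bytes) -> int:
    """Read the sample rate from the header of in-memory WAV data."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return wav.getframerate()


def needs_resampling(data: bytes) -> bool:
    """True when the audio is not already at the expected 16 kHz."""
    return wav_sample_rate(data) != EXPECTED_RATE


def make_silent_wav(rate: int, seconds: float = 0.1) -> bytes:
    """Build a short mono 16-bit silent WAV, used here only for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds))
    return buf.getvalue()


print(needs_resampling(make_silent_wav(16_000)))
print(needs_resampling(make_silent_wav(44_100)))
```

Audio flagged by this check would need to be resampled with a separate tool first; resampling itself is outside the scope of this sketch.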

Training Details

Hyperparameter                 Value
Learning rate                  0.003
Batch size (train/eval)        16 / 8
Gradient accumulation steps    3
Optimizer                      Adam (betas=0.9,0.999; epsilon=1e-8)
LR scheduler                   Linear, warmup 500 steps
Epochs                         50
Mixed precision                Native AMP
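
To make the table concrete, the sketch below works out the effective batch size implied by gradient accumulation and shows what a linear schedule with 500 warmup steps typically looks like: the rate climbs linearly from 0 to the peak, then decays linearly to 0. Several things here are assumptions rather than facts from the table: that the train batch size of 16 is per device, that training ran on a single device, and the `total_steps` value, which is not stated and is chosen purely for illustration.

```python
PEAK_LR = 0.003              # learning rate from the table
WARMUP_STEPS = 500           # warmup steps from the table
PER_DEVICE_TRAIN_BATCH = 16  # assumed to be a per-device batch size
GRAD_ACCUM_STEPS = 3         # gradient accumulation steps from the table


def effective_batch_size(num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update (single device assumed by default)."""
    return PER_DEVICE_TRAIN_BATCH * GRAD_ACCUM_STEPS * num_devices


def linear_lr(step: int, total_steps: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to zero at total_steps."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / max(1, total_steps - WARMUP_STEPS)


TOTAL_STEPS = 10_000  # hypothetical; the actual number of optimizer steps is not given

print(effective_batch_size())
for step in (0, 250, 500, 5_250, 10_000):
    print(step, linear_lr(step, TOTAL_STEPS))
```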

Final training loss reached 0.0376 after 49.69 epochs. The model uses a CTC head with an added 5-gram language model decoder for improved accuracy.
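
As background on the CTC head: a CTC model emits one label per audio frame, including a special blank symbol, and the transcript is recovered by merging consecutive repeats and then dropping blanks. The sketch below shows this simplest (greedy) decoding rule on a hand-written frame sequence. It is an illustration only: the blank symbol `_` and the frame labels are hypothetical, and the 5-gram decoder mentioned above replaces this greedy rule with a beam search scored by the language model.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; real vocabularies define their own


def ctc_greedy_collapse(frame_labels, blank=BLANK):
    """Merge consecutive repeated labels, then remove blanks."""
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != blank)


# Hand-written per-frame labels for illustration, not real model output.
frames = list("__bb_uu_n_ăă__")
print(ctc_greedy_collapse(frames))  # → bună

# A blank between two identical labels keeps both letters.
print(ctc_greedy_collapse(list("a_a")))  # → aa
print(ctc_greedy_collapse(list("aa")))   # → a
```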

not yet live

We're benchmarking and onboarding romanian-wav2vec2 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.