w2v-bert-2.0

facebook/w2v-bert-2.0

A popular open audio embeddings model with 3.7M downloads a month. gigarouter is benchmarking it and plans to host it as an OpenAI-compatible API.

est. price: ~$0.008 / 1M tokens (estimated, set at launch)
API providers: 0
downloads / mo: 3.7M
license: MIT
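
To get a feel for the estimated price above, the arithmetic can be sketched in a few lines of Python. The function name `estimate_cost_usd` is ours, not part of any gigarouter SDK, and the rate is the page's pre-launch estimate, so treat the result as a rough guide only.

```python
# Estimated rate taken from this page; the final price is set at launch.
EST_PRICE_PER_MILLION_TOKENS_USD = 0.008


def estimate_cost_usd(tokens: int,
                      price_per_million: float = EST_PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Return the approximate cost in USD for processing `tokens` tokens."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical workload: 250 million tokens at the estimated rate.
    print(f"${estimate_cost_usd(250_000_000):.2f}")
```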

about this model

W2v-BERT 2.0 is a Conformer-based speech encoder (600M parameters) that serves as the core of Meta’s Seamless communication models. It was pre-trained on 4.5 million hours of unlabeled audio covering more than 143 languages. The model outputs audio embeddings rather than task predictions, so it must be fine-tuned for downstream tasks such as automatic speech recognition (ASR) or audio classification.

Key Strengths

  • Large-scale multilingual pre-training (143+ languages) enables strong cross-lingual representation learning.
  • Conformer architecture combines convolution and self-attention for efficient sequence modeling.
  • Proven in Seamless models for speech-to-speech translation and other audio tasks.

Best For

Developers building custom speech recognition, speaker identification, language identification, or audio classification systems that benefit from a robust, pre-trained encoder. The model is particularly suited for multilingual or low-resource language scenarios due to its broad language coverage.
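
As a rough illustration of how encoder output might feed tasks like speaker or language identification, the sketch below averages per-frame embeddings into one utterance vector and compares utterances by cosine similarity. It assumes the encoder returns one vector per audio frame; the 3-dimensional frames are made-up toy values, not real model output, and mean pooling is just one common choice, not a method prescribed by the model authors.

```python
import math
from typing import List, Sequence


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average a sequence of frame embeddings into one fixed-size vector."""
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cannot compare a zero vector")
    return dot / norm


if __name__ == "__main__":
    # Toy frame embeddings (hypothetical values standing in for encoder output).
    utterance_a = [[1.0, 0.0, 1.0], [3.0, 0.0, 1.0]]
    utterance_b = [[2.0, 0.0, 1.0], [2.0, 0.0, 1.0]]
    utterance_c = [[0.0, 5.0, 0.0]]

    vec_a, vec_b, vec_c = (mean_pool(u) for u in (utterance_a, utterance_b, utterance_c))
    print("a vs b:", cosine_similarity(vec_a, vec_b))
    print("a vs c:", cosine_similarity(vec_a, vec_c))
```

In practice a fine-tuned classification head usually outperforms raw pooled embeddings, which fits the note above that the model requires fine-tuning for downstream tasks.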

Model Specifications

Model Name      #params    Checkpoint
W2v-BERT 2.0    600M       checkpoint

Once live, gigarouter will host this model as a managed, OpenAI-compatible API. No installation or local setup will be required; you call the endpoint to generate embeddings.
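
Since the endpoint is not live yet, the sketch below only shows how a client might read a response, assuming gigarouter follows the OpenAI embeddings response shape (a `data` list whose items carry `index` and `embedding`). The sample payload, its values, and the `model` string are hypothetical, and this page does not say how audio input will be submitted, so the request side is left out.

```python
import json
from typing import List

# Hypothetical response body in the OpenAI embeddings format; values are invented.
SAMPLE_RESPONSE = """
{
  "object": "list",
  "model": "facebook/w2v-bert-2.0",
  "data": [
    {"object": "embedding", "index": 1, "embedding": [0.3, 0.4]},
    {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]}
  ]
}
"""


def extract_embeddings(response_text: str) -> List[List[float]]:
    """Return embedding vectors ordered by their `index` field."""
    payload = json.loads(response_text)
    items = sorted(payload["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


if __name__ == "__main__":
    for vector in extract_embeddings(SAMPLE_RESPONSE):
        print(vector)
```

Sorting by `index` keeps each vector aligned with the input it came from, even if the items arrive out of order.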

not yet live

We're benchmarking and onboarding w2v-bert-2.0 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.