DeepSeek V4 Flash DSpark
deepseek-ai/DeepSeek-V4-Flash-DSpark
published Jun 2026 · updated Jun 2026
DeepSeek V4 Flash DSpark is a text-generation model that uses a Mixture-of-Experts architecture with 284B total parameters (13B activated) and supports a context length of one million tokens, enhanced with speculative decoding for faster inference.
specs
| Task | Text Generation |
| Architecture | Mixture-of-Experts (MoE) with Hybrid Attention |
| Parameters | 284B total, 13B activated |
| Context Length | 1,000,000 tokens |
| License | MIT |
about this model
best for
- ·Processing long documents up to one million tokens
- ·Complex reasoning and problem-solving with think modes
- ·Agentic workflows and tool calling
FAQ
It is a preview of the DeepSeek-V4 series with a Mixture-of-Experts architecture, 284B total parameters, 13B activated, supporting one-million-token contexts and speculative decoding for faster inference.
DSpark is the same checkpoint with an additional speculative decoding module attached to improve inference speed, not a new model.
It is released under the MIT license.
Use the gigarouter OpenAI-compatible endpoint with an API key; the model supports standard text generation and tool calling.
The model uses OpenAI-compatible chat messages; refer to the encoding folder in the model repository for encoding and decoding details.
We're benchmarking and onboarding DeepSeek V4 Flash DSpark as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.