Ornith 1.0 35B
deepreinforce-ai/Ornith-1.0-35B-FP8
published Jun 2026 · updated Jun 2026
Ornith 1.0 35B is a text-generation model for agentic coding, fine-tuned from Qwen 3.5 using reinforcement learning to jointly optimize solution rollouts and scaffolding.
specs
| Task | Text Generation (Agentic Coding) |
| Architecture | Mixture of Experts (MoE) |
| Parameters | 35B |
| License | MIT |
about this model
Ornith-1.0-35B is a text-generation model that serves as the lightweight single-GPU member of the Ornith family, a self-improving family of open-source models for agentic coding post-trained on Gemma 4 and Qwen 3.5 architectures.
Training Framework
The model employs a self-improving reinforcement learning framework that jointly optimizes both the scaffolding (search trajectories) and the resulting solution rollouts. By learning to generate better scaffolds alongside solutions, Ornith-1.0-35B discovers higher-quality search trajectories and produces improved coding solutions. The model is released under the MIT license.
Key Capabilities
Ornith-1.0-35B is a reasoning model: by default, the assistant response opens with a <think> … </think> block before delivering the final answer. It achieves strong results across multiple agentic coding benchmarks.
Benchmark Results
| Ornith-1.0-35B | Qwen3.5-35B | Qwen3.6-35B | Gemma4-31B | Qwen3.5-397B | |
|---|---|---|---|---|---|
| Agentic Coding | |||||
| Terminal-Bench 2.1 (Terminus-2) | 64.2 | 41.4 | 52.5 | 42.1 | 53.5 |
| Terminal-Bench 2.1 (Claude Code) | 62.8 | 38.9 | 49.2 | - | 48.6 |
| SWE-bench Verified | 75.6 | 70 | 73.4 | 52 | 76.4 |
| SWE-bench Pro | 50.4 | 44.6 | 49.5 | 35.7 | 51.6 |
| SWE-bench Multilingual | 69.3 | 60.3 | 67.2 | 51.7 | 69.3 |
| NL2Repo | 34.6 | 20.5 | 29.4 | 15.5 | 36.8 |
| Claw-eval Avg | 69.8 | 65.4 | 68.7 | 48.5 | 70.7 |
| SWE Atlas - QnA | 37.1 | 13.2 | 15.5 | - | 20.4 |
| SWE Atlas - RF | 29.7 | 10.2 | 11.4 | - | 18.4 |
| SWE Atlas - TW | 27.8 | 9.8 | 13.3 | - | 18.5 |
Bold numbers indicate the highest score in each row among the compared models. Evaluation details: Terminal-Bench 2.1 uses Harbor/Terminus-2 and Claude Code frameworks with temperature 1.0; SWE-Bench uses OpenHands harness; SWE Atlas uses mini SWE agent harness; NL2Repo uses 400K context; ClawEval uses temperature 0.6 with 256K context. All results averaged over 5 runs where noted.

best for
- ·Automated software engineering and bug fixing via SWE-Bench tasks
- ·Natural language to repository code generation (NL2Repo)
- ·Agentic coding in terminal environments (Terminal-Bench)
FAQ
It is designed for agentic coding tasks such as automated software engineering, repository-level code generation, and terminal-based coding benchmarks.
It is a 35B parameter Mixture of Experts (MoE) model, post-trained on top of Qwen 3.5.
It is MIT licensed, globally accessible with no regional restrictions.
Use the gigarouter OpenAI-compatible endpoint with your API key. The model is a reasoning model that outputs a <think> block before the final answer.
Ornith 1.0 35B outperforms Qwen 3.5 35B on all reported agentic coding benchmarks, including SWE-Bench Verified (75.6 vs 70) and Terminal-Bench 2.1 (64.2 vs 41.4).
We're benchmarking and onboarding Ornith 1.0 35B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.