Ornith 1.0 35B

deepreinforce-ai/Ornith-1.0-35B-FP8

published Jun 2026 · updated Jun 2026

Ornith 1.0 35B is a text-generation model for agentic coding, fine-tuned from Qwen 3.5 using reinforcement learning to jointly optimize solution rollouts and scaffolding.

status

coming soon

API providers

downloads / mo

61.6K

license

mit

specs

Task	Text Generation (Agentic Coding)
Architecture	Mixture of Experts (MoE)
Parameters	35B
License	MIT

about this model

Ornith-1.0-35B is a text-generation model that serves as the lightweight single-GPU member of the Ornith family, a self-improving family of open-source models for agentic coding post-trained on Gemma 4 and Qwen 3.5 architectures.

Training Framework

The model employs a self-improving reinforcement learning framework that jointly optimizes both the scaffolding (search trajectories) and the resulting solution rollouts. By learning to generate better scaffolds alongside solutions, Ornith-1.0-35B discovers higher-quality search trajectories and produces improved coding solutions. The model is released under the MIT license.

Key Capabilities

Ornith-1.0-35B is a reasoning model: by default, the assistant response opens with a <think> … </think> block before delivering the final answer. It achieves strong results across multiple agentic coding benchmarks.

Benchmark Results

	Ornith-1.0-35B	Qwen3.5-35B	Qwen3.6-35B	Gemma4-31B	Qwen3.5-397B
Agentic Coding
Terminal-Bench 2.1 (Terminus-2)	64.2	41.4	52.5	42.1	53.5
Terminal-Bench 2.1 (Claude Code)	62.8	38.9	49.2	-	48.6
SWE-bench Verified	75.6	70	73.4	52	76.4
SWE-bench Pro	50.4	44.6	49.5	35.7	51.6
SWE-bench Multilingual	69.3	60.3	67.2	51.7	69.3
NL2Repo	34.6	20.5	29.4	15.5	36.8
Claw-eval Avg	69.8	65.4	68.7	48.5	70.7
SWE Atlas - QnA	37.1	13.2	15.5	-	20.4
SWE Atlas - RF	29.7	10.2	11.4	-	18.4
SWE Atlas - TW	27.8	9.8	13.3	-	18.5

Bold numbers indicate the highest score in each row among the compared models. Evaluation details: Terminal-Bench 2.1 uses Harbor/Terminus-2 and Claude Code frameworks with temperature 1.0; SWE-Bench uses OpenHands harness; SWE Atlas uses mini SWE agent harness; NL2Repo uses 400K context; ClawEval uses temperature 0.6 with 256K context. All results averaged over 5 runs where noted.

best for

·Automated software engineering and bug fixing via SWE-Bench tasks
·Natural language to repository code generation (NL2Repo)
·Agentic coding in terminal environments (Terminal-Bench)

FAQ

What is Ornith 1.0 35B best used for?

It is designed for agentic coding tasks such as automated software engineering, repository-level code generation, and terminal-based coding benchmarks.

What architecture does Ornith 1.0 35B use?

It is a 35B parameter Mixture of Experts (MoE) model, post-trained on top of Qwen 3.5.

What is the license for Ornith 1.0 35B?

It is MIT licensed, globally accessible with no regional restrictions.

How do I call Ornith 1.0 35B via the API?

Use the gigarouter OpenAI-compatible endpoint with your API key. The model is a reasoning model that outputs a <think> block before the final answer.

How does Ornith 1.0 35B compare to Qwen 3.5 35B?

Ornith 1.0 35B outperforms Qwen 3.5 35B on all reported agentic coding benchmarks, including SWE-Bench Verified (75.6 vs 70) and Terminal-Bench 2.1 (64.2 vs 41.4).

not yet live

We're benchmarking and onboarding Ornith 1.0 35B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related text generation models

tiny-Qwen2ForCausalLM-2.5

9.2M dl/mo

deepseek-v4-gguf

6.4M dl/mo

Qwen3.6-35B-A3B-NVFP4

6.2M dl/mo

gemma-3-270m

5.1M dl/mo