Ornith 1.0 35B MTP APEX
SC117/Ornith-1.0-35B-MTP-APEX-GGUF
published Jun 2026 · updated Jul 2026
Ornith 1.0 35B MTP APEX is a self-improving agentic coding model that jointly optimizes scaffold generation and solution rollouts, with vision capabilities via a multimodal projector.
specs
| Task | Text Generation (Multi-Modal Agentic Coding) |
| Architecture | Qwen3.5 MoE (Mixture of Experts) with 256 routed experts, 8 active per token |
| Parameters | 35B total, 3B active per token |
| License | MIT |
about this model
Ornith-1.0-35B-MTP-APEX-GGUF is a text-generation model for self-improving agentic coding, post-trained on Qwen3.5 with reinforcement learning to jointly optimize scaffold generation and solution rollouts. It is hosted on gigarouter as a managed API with OpenAI-compatible endpoints.
Key Specifications
The model uses a Mixture-of-Experts (MoE) architecture with 35 billion total parameters and approximately 3 billion active parameters per token. It routes 256 experts, with 8 active per token, across 40 transformer layers plus 1 multi-token prediction (MTP) layer. The context window supports 262,144 tokens. A vision projector (mmproj-F16.gguf) enables multimodal image-and-text inputs. Licensed under MIT.
Benchmark Performance
Among open-source models of comparable size, Ornith achieves state-of-the-art results on Terminal-Bench 2.1, SWE-Bench Verified/Pro/Multilingual, NL2Repo, and OpenClaw. BenchLocal results for the quantized APEX-I-Compact variant (15.85 GB) are shown below.
| Mode | ToolCall-15 | BugFind-15 | HermesAgent-20 | Max | Eff. |
|---|---|---|---|---|---|
| Thinking | 100 | 93 | 89 | 93.5 | 75.5 |
| No Thinking | 100 | 92 | 89 | 93.2 | 85.2 |
No-thinking mode delivers higher practical efficiency (fewer retries).
Recommended Configuration
General and coding use: temperature 0.6, top_p 0.95, top_k 20.
best for
- ·Agentic coding and scaffold generation for software engineering tasks
- ·Multi-turn tool calling and bug finding in automated workflows
- ·Multimodal reasoning combining images and code instructions
FAQ
It is designed for agentic coding tasks such as scaffold generation, tool calling, bug finding, and software engineering workflows, with support for multimodal inputs.
It has 35B total parameters but only 3B active per token due to its MoE architecture, making it efficient and faster than dense models of comparable size.
It is released under the MIT license, allowing commercial use and modification.
Yes, it includes an mmproj-F16.gguf vision projector for multimodal (image + text) capabilities when used with llama.cpp.
Use the OpenAI-compatible endpoint on gigarouter with your API key, specifying the model name as Ornith 1.0 35B MTP APEX.
We're benchmarking and onboarding Ornith 1.0 35B MTP APEX as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.