Qwen3.5 9B Uncensored Aggressive
HauhauCS/Qwen3.5-9B-Uncensored-HauhauCS-Aggressive
published Mar 2026 · updated Jun 2026
Qwen3.5 9B Uncensored Aggressive is a other model that is a fully uncensored version of Qwen3.5-9B with aggressive refusal removal, maintaining all original capabilities.
specs
| Task | other |
| Architecture | Hybrid: Gated DeltaNet linear attention + full softmax attention (3:1 ratio) |
| Parameters | 9B |
| Context length | 262K native (up to 1M with YaRN) |
about this model
HauhauCS/Qwen3.5-9B-Uncensored-HauhauCS-Aggressive is a fully uncensored 9B-parameter multimodal model based on Qwen3.5-9B, trained to eliminate refusals while retaining all original capabilities. It delivers zero refusals across all tested prompts (0/465) without any loss of functionality or performance.
Key capabilities
- Lossless uncensoring – No changes to datasets or capabilities; the model responds to any prompt without refusal, though it may append a short disclaimer baked into the base model’s training.
- Hybrid architecture – 9B dense parameters, 32 layers, combining Gated DeltaNet linear attention with full softmax attention in a 3:1 ratio.
- Extended context – 262,144 tokens native (extendable to 1M with YaRN).
- Multimodal by default – Accepts text, image, and video inputs.
- Multi-token prediction (MTP) support and a 248,000-token vocabulary covering 201 languages.
Inference settings
The model supports two inference modes, configurable via API parameters:
- Thinking mode (default) –
temperature=0.6,top_p=0.95,top_k=20,min_p=0 - Non-thinking mode –
temperature=0.7,top_p=0.8,top_k=20,min_p=0
Maintain at least 128K context to preserve thinking capabilities. This model (released 2026-03-02) is based on Qwen3.5-9B.
best for
- ·Uncensored text generation without refusals
- ·Multimodal tasks using image/video inputs
- ·Long-context reasoning up to 1M tokens
FAQ
The aggressive variant applies stronger refusal removal, fully unlocking the model. It may still append short disclaimers baked into the base training but generates full content.
Yes, it is natively multimodal. You need the main GGUF file and the mmproj vision encoder to use image/video inputs with compatible runtimes.
For thinking mode use temperature=0.6, top_p=0.95, top_k=20, min_p=0. For non-thinking mode use temperature=0.7, top_p=0.8, top_k=20, min_p=0. Maintain at least 128K context to preserve thinking capabilities.
Use the OpenAI-compatible endpoint with your gigarouter API key. Refer to gigarouter documentation for the exact endpoint and model identifier.
Native context is 262K tokens, extendable to 1M tokens using YaRN.
We're benchmarking and onboarding Qwen3.5 9B Uncensored Aggressive as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.