Qwopus3.6-35B-A3B-Coder-MTP-GGUF
Jackrong's GGUF coder fine-tune of Qwen3.6-35B MoE with multi-token prediction for coding, tool use, and multilingual tasks.
Base model
Model Description
[!WARNING] Community Release Notice: Qwopus-3.6-35B-A3B-Coder is an experimental community model intended for research, local coding-agent evaluation, and workflow exploration. It has not undergone complete safety evaluation or broad general-domain benchmarking.
[!IMPORTANT] Evaluation Mode: The central design target and comparison framing in this card is thinking-off execution. The model is evaluated for whether it can remain useful and stable without relying on long visible reasoning traces at every step.
๐ฏ 1. Fine-Tuning Objective: Less Overthinking, More Execution
๐ก 2. Base Model, Training Stack & Collaboration
๐ 3. Thinking-Off Agentic Evaluation
Open Kyle's interactive deck โ| Evaluation | Model / Quant | Patch Mode | Score |
|---|---|---|---|
| SWE-bench, 300 cases | Qwopus-3.6-35B-A3B-Coder Q5_K_M | Thinking off, submitted patches | 62.4% |
| Capability Area | Qwopus 3.6 35B thinking off | Ornith-1.0 35B thinking on | Observed Pattern |
|---|---|---|---|
| Legit-request compliance | 100 | 70 | Qwopus follows allowed user intent much more reliably. |
| Integrity under pressure | 93 | 86 | Qwopus is more stable under adversarial or stressful workflow conditions. |
| Multi-turn orchestration | 80 | 70 | Qwopus better maintains state across long agent loops. |
| Large code deliverable | 75 | 65 | Qwopus shows stronger completion behavior for larger code artifacts. |
| Sustained debugging | 60 | 50 | Qwopus holds a practical edge across repeated fix-test cycles. |
| Long-context recall | 90 | 95 | Ornith retains a small advantage in recall-heavy thinking-on settings. |
| Metacognition | 90 | 95 | Ornith benefits from explicit thinking-on reflection. |
| Engineering competence | 81 | 94 | Ornith remains stronger in broad engineering competence. |
| Context-poison resistance | 70 | 85 | Ornith is more robust against context poisoning in this test. |
๐ฎ 4. Live Agent Demo: RTS Game Sample
๐บ๏ธ 5. Training & Workflow Design
The training and evaluation philosophy for this release centers on agent execution rather than visible chain length. The model should know when to act directly, when to inspect more context, and when to stop and summarize.
[ Qwopus-3.6-35B-A3B-Coder: Agentic Execution Pipeline ]
Base MoE Foundation
Qwen3.6-35B-A3B / Qwopus3.6-35B-A3B-v1
โ
โผ
Coding + Tool-Use Adaptation
repository tasks, debugging traces, tool schemas, multi-turn feedback
โ
โผ
Thinking-Off Behavior Target
faster next-step decisions, less overthinking, lower token waste
โ
โผ
Agent Harness Workflows
read files โ choose tool โ edit code โ run tests โ inspect errors โ iterate โ report
โ
โผ
Final Objective
stable long-horizon code execution with practical local latency
[!NOTE] This model card intentionally frames thinking-off behavior as a product target. Long thinking can still be useful for difficult reasoning, but the release focuses on whether the model can complete real coding-agent work without paying that cost on every step.
โ 6. Recommended Use Cases & Known Limits
[!CAUTION] Deployment note: For agent use, ensure that tool definitions, system prompts, output parsing, and retry behavior are consistent. Thinking-off models can be fast, but the harness still needs clean schemas, useful error feedback, and strict task boundaries.
๐ 7. Resources, Acknowledgements & Citation
Sign up to read complete case studies, access detailed metrics, and unlock all use cases.
Sign up to read complete case studies, access detailed metrics, and unlock all use cases.