Qwopus3.6-35B-A3B-Coder-MTP-GGUF
Ajuste fino GGUF de programación de Jackrong sobre Qwen3.6-35B MoE con predicción multitokens para codificación, uso de herramientas y tareas multilingües.
Modelo base
Descripción del Modelo
[!WARNING] Community Release Notice: Qwopus-3.6-35B-A3B-Coder is an experimental community model intended for research, local coding-agent evaluation, and workflow exploration. It has not undergone complete safety evaluation or broad general-domain benchmarking.
[!IMPORTANT] Evaluation Mode: The central design target and comparison framing in this card is thinking-off execution. The model is evaluated for whether it can remain useful and stable without relying on long visible reasoning traces at every step.
🎯 1. Fine-Tuning Objective: Less Overthinking, More Execution
💡 2. Base Model, Training Stack & Collaboration
📊 3. Thinking-Off Agentic Evaluation
Open Kyle's interactive deck →| Evaluation | Model / Quant | Patch Mode | Score |
|---|---|---|---|
| SWE-bench, 300 cases | Qwopus-3.6-35B-A3B-Coder Q5_K_M | Thinking off, submitted patches | 62.4% |
| Capability Area | Qwopus 3.6 35B thinking off | Ornith-1.0 35B thinking on | Observed Pattern |
|---|---|---|---|
| Legit-request compliance | 100 | 70 | Qwopus follows allowed user intent much more reliably. |
| Integrity under pressure | 93 | 86 | Qwopus is more stable under adversarial or stressful workflow conditions. |
| Multi-turn orchestration | 80 | 70 | Qwopus better maintains state across long agent loops. |
| Large code deliverable | 75 | 65 | Qwopus shows stronger completion behavior for larger code artifacts. |
| Sustained debugging | 60 | 50 | Qwopus holds a practical edge across repeated fix-test cycles. |
| Long-context recall | 90 | 95 | Ornith retains a small advantage in recall-heavy thinking-on settings. |
| Metacognition | 90 | 95 | Ornith benefits from explicit thinking-on reflection. |
| Engineering competence | 81 | 94 | Ornith remains stronger in broad engineering competence. |
| Context-poison resistance | 70 | 85 | Ornith is more robust against context poisoning in this test. |
🎮 4. Live Agent Demo: RTS Game Sample
🗺️ 5. Training & Workflow Design
The training and evaluation philosophy for this release centers on agent execution rather than visible chain length. The model should know when to act directly, when to inspect more context, and when to stop and summarize.
[ Qwopus-3.6-35B-A3B-Coder: Agentic Execution Pipeline ]
Base MoE Foundation
Qwen3.6-35B-A3B / Qwopus3.6-35B-A3B-v1
│
▼
Coding + Tool-Use Adaptation
repository tasks, debugging traces, tool schemas, multi-turn feedback
│
▼
Thinking-Off Behavior Target
faster next-step decisions, less overthinking, lower token waste
│
▼
Agent Harness Workflows
read files → choose tool → edit code → run tests → inspect errors → iterate → report
│
▼
Final Objective
stable long-horizon code execution with practical local latency
[!NOTE] This model card intentionally frames thinking-off behavior as a product target. Long thinking can still be useful for difficult reasoning, but the release focuses on whether the model can complete real coding-agent work without paying that cost on every step.
✅ 6. Recommended Use Cases & Known Limits
[!CAUTION] Deployment note: For agent use, ensure that tool definitions, system prompts, output parsing, and retry behavior are consistent. Thinking-off models can be fast, but the harness still needs clean schemas, useful error feedback, and strict task boundaries.
📚 7. Resources, Acknowledgements & Citation
Regístrate para leer casos de estudio completos, acceder a métricas detalladas y recibir todos los reportes.
Regístrate para leer casos de estudio completos, acceder a métricas detalladas y recibir todos los reportes.