Qwopus3.6-27B-v2-MTP-GGUF
Jackrong's GGUF 27B multimodal fine-tune of Qwen3.6-27B with multi-token prediction and vision for coding and agentic tasks.
Base model
Qwen/Qwen3.6-27B
Model Card
</div>
π‘ 1. Base Model, Training Library & Cooperation
</div>
</div>
</div>
[!WARNING] Community Release Notice: Qwopus3.6-27B-v2-MTP is an experimental community release intended for research, evaluation, and workflow exploration.
π 2. MTP Benchmark: Qwen3.6-27B vs Qwopus3.6-27B-v2-MTP
- Speed: Qwopus3.6-27B-v2-MTP reaches 10.46 overall tokens/sec, compared with 6.29 tokens/sec for Qwen3.6-27B.
- Latency: total evaluation time drops from 14,901.69s to 6,487.81s, saving 8,413.88s across the full run.
- Output shape: MTP produces 67,862 completion tokens versus 93,802 from Qwen3.6-27B, giving a more compact overall response profile.
βοΈ 3. Test Environment & Configuration
- Compute platform: GB10 dedicated server platform.
- Evaluation format: same local GGUF server stack for both models.
- llama-server total context:
49152. - Temperature / Top-p:
1.0 / 0.95. - Max generated tokens: no explicit cap; generation is bounded by the request budget.
- Request format:
/v1/chat/completionswith user content as text payload.
| Benchmark Summary: Qwen3.6-27B vs Qwopus3.6-27B-v2-MTP | |||||
|---|---|---|---|---|---|
| Model | Completed | Avg Speed | Overall T/s | Completion Tokens | Total Time |
| Qwen3.6-27B | 30 | 6.32 | 6.29 | 93,802 | 14,901.69s |
| Qwopus3.6-27B-v2-MTP | 30 | 10.66 | 10.46 | 67,862 | 6,487.81s |
| Domain-Level Performance | |||||||
|---|---|---|---|---|---|---|---|
| Domain | Questions | Qwen3.6-27B T/s | MTP T/s | Latency Gain | Qwen3.6-27B Time | MTP Time | Token Delta |
| Logic | 5 | 6.33 | 10.77 | 2.31x | 38.5 min | 16.7 min | -26.3% |
| Coding | 7 | 6.26 | 10.27 | 2.25x | 1.52 h | 40.6 min | -27.3% |
| DevOps | 6 | 6.29 | 10.39 | 2.31x | 47.4 min | 20.5 min | -28.5% |
| Math | 8 | 6.29 | 11.00 | 2.35x | 1.01 h | 25.8 min | -25.6% |
| Edge | 4 | 6.48 | 8.28 | 2.27x | 10.3 min | 4.5 min | -43.6% |
π 4. Full 30-Question Comparison
π§ 5. Domain Reading
π― 6. Recommended Use Cases
- Agentic coding and code review assistance.
- DevOps runbooks, configuration generation, and incident diagnosis.
- Multi-step math and probability derivations.
- Structured reasoning with explicit intermediate logic.
- Fast constrained output generation where latency matters.
Get the full context.
Sign up to read complete case studies, access detailed metrics, and unlock all use cases.
Get the full context.
Sign up to read complete case studies, access detailed metrics, and unlock all use cases.