
Gemma-4-26B-A4B-StyleTune

LLM by Gryphe Padar

Gryphe's Gemma 4 26B fine-tune for roleplay and creative writing.


Base model

google/gemma-4-26B-A4B-it

Model Card


Note: this version has been superseded by 26B-A4B V2, which I highly recommend you use instead.

Now available in 26B-A4B flavour upon popular request! The text below is mostly recycled from the 31B Style Tune, though statistics have been adjusted accordingly.

A happy accident in surgical finetuning - 54% fewer clichés, an entirely new writing style, and the same Gemma 4 26B-A4B you already know underneath. One tensor changed out of 659.

Also available in a 31B version!

What is a style tune?

Normally when I finetune a model I train as much of it as possible, loading every tensor and transforming it to better approximate whatever's in my data. Not this time. This time I trained precisely one tensor: the lm_head output projection - the layer that decides which token to emit. Literally the last stop before text appears on your screen.

This specific tensor has a massive influence on a model's writing style, something I first discovered building MythoMax years ago. Gemma 31B (the first style tune) is a VRAM-hungry monster, so the question became: how do I have the maximum impact with the minimum hardware requirements?

The answer: freeze everything else. All 30 transformer layers, all the attention heads, all the MLPs - completely untouched. Only lm_head trains, which means VRAM requirements drop dramatically, training completes in a single overnight run on consumer hardware, and every single one of Gemma's capabilities remains fully intact. The model hasn't changed. Only the voice has, and it's done so in the best way possible. (Obligatory disclaimer: I might be biased towards my own data.)
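A minimal sketch of the freezing step, assuming PyTorch. `ToyLM` and `freeze_all_but_lm_head` are hypothetical names made up for illustration, and the toy model is only a stand-in with a separate `lm_head` parameter; this is not the actual training script used for this release.

```python
import torch
from torch import nn


class ToyLM(nn.Module):
    """Hypothetical stand-in for a decoder-only LM: embeddings, a body, and lm_head."""

    def __init__(self, vocab_size=32, hidden=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.body = nn.Sequential(
            nn.Linear(hidden, hidden), nn.GELU(), nn.Linear(hidden, hidden)
        )
        self.lm_head = nn.Linear(hidden, vocab_size, bias=False)

    def forward(self, ids):
        return self.lm_head(self.body(self.embed(ids)))


def freeze_all_but_lm_head(model):
    """Disable gradients everywhere except lm_head; return the trainable names."""
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith("lm_head")
    return [n for n, p in model.named_parameters() if p.requires_grad]


torch.manual_seed(0)
model = ToyLM()
trainable = freeze_all_but_lm_head(model)

# Snapshots so we can check what a training step actually touches.
body_before = model.body[0].weight.detach().clone()
head_before = model.lm_head.weight.detach().clone()

# Only the trainable parameters are handed to the optimizer.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.1
)

# One next-token-prediction step on random token ids.
ids = torch.randint(0, 32, (4, 8))
logits = model(ids[:, :-1])
loss = nn.functional.cross_entropy(logits.reshape(-1, 32), ids[:, 1:].reshape(-1))
loss.backward()
optimizer.step()
```

Because the frozen parameters never receive gradients, no optimizer state is allocated for them, which is where most of the memory savings in this kind of setup would come from.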

I used the same data I had on me for my last Pantheon Reasoning release, with one notable exception - no instruct 24k set. 100% narrative data, certified cliché-free.

What changed?

Benchmarked on 200 diverse roleplay prompts against the base instruct model:

  • 54% fewer clichés per 100 words (1.141 → 0.528)
  • Only 18.3% shared trigram vocabulary - the model reaches for an almost entirely different set of phrases, with responses feeling much less sloppy as a result.

Considering we're talking about narrative data, it's hard to provide you with many other meaningful statistics - it's one of those "try it to understand it" kinda situations.
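For readers who want to run a similar comparison, here is a rough sketch of how the two statistics above could be computed. The card does not say how they were actually measured, so everything here is an assumption: the cliché phrase list is hypothetical, clichés are counted as exact phrase matches, and "shared trigram vocabulary" is interpreted as the Jaccard overlap of word trigram sets.

```python
import re


def words(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z']+", text.lower())


def cliches_per_100_words(text, cliche_phrases):
    """Exact-match cliché hits per 100 words (phrases must be lowercase)."""
    tokens = words(text)
    if not tokens:
        return 0.0
    joined = " ".join(tokens)
    hits = sum(
        len(re.findall(r"\b" + re.escape(phrase) + r"\b", joined))
        for phrase in cliche_phrases
    )
    return 100.0 * hits / len(tokens)


def trigrams(text):
    """Set of distinct word trigrams in the text."""
    t = words(text)
    return {tuple(t[i:i + 3]) for i in range(len(t) - 2)}


def shared_trigram_fraction(text_a, text_b):
    """Assumed definition: shared trigrams divided by all distinct trigrams."""
    a, b = trigrams(text_a), trigrams(text_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relative_reduction(before, after):
    """Fractional drop from `before` to `after`."""
    return (before - after) / before


# The headline figure follows from the two rates quoted in the list above.
reduction = relative_reduction(1.141, 0.528)
print(f"{reduction:.1%} fewer clichés per 100 words")
```

In practice both functions would be run over the concatenated responses of each model to the same prompt set, using a much longer phrase list than any toy example.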

What didn't change?

Everything else. All the reasoning capability, world knowledge, instruction following, and language understanding are completely intact - none of those live in lm_head. This isn't a full finetune. It's a targeted style replacement on a single tensor.

Inference

Use whatever you prefer; Gemma seems remarkably flexible in that regard. I run with temp 1.0, 0.10 MinP and the DRY sampler.
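To illustrate what those two numeric settings do, here is a small pure-Python sketch of temperature scaling followed by min-p filtering. It assumes the common definition of min-p (keep tokens whose probability is at least `min_p` times the top token's probability) and applies temperature first; real backends may order their samplers differently. The DRY sampler is left out entirely, and the function names are made up for illustration.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities after temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p=0.10):
    """Zero out tokens below min_p * (top probability), then renormalise."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


def sample(logits, temperature=1.0, min_p=0.10, rng=random):
    """Draw one token index using temperature plus min-p."""
    probs = min_p_filter(softmax(logits, temperature), min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# With a top probability of 0.6 the cutoff is 0.06, so the two 0.05 tokens drop out.
filtered = min_p_filter([0.6, 0.3, 0.05, 0.05], min_p=0.10)
print(filtered)
```

The cutoff scales with the model's confidence: when one token dominates, the long tail is pruned aggressively, while a flat distribution keeps many candidates alive.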

Prompt Format

Gemma 4's native chat template applies automatically.

Notes

For all I know this might only genuinely work for Gemma 4 specifically, but I'll certainly be poking other models if people enjoy this release. Feedback is, as always, very welcome!

Credits

  • Everyone from Anthracite! Hi, guys!
  • Latitude, for which I am still producing finetunes on a regular basis, helping me keep my skills sharp and up-to-date!
  • All the folks I chat with on a daily basis on Discord! You know who you are.
  • Anyone I forgot to mention, just in case!

Author

Gryphe Padar (user: Gryphe)

Details

  • Downloads: 432
  • Likes: 44
  • Access: Open Source
  • Task: text-generation
  • Parameters: 26.5B
  • Trending: 42
  • License: apache-2.0
  • Created: Jun 14, 2026
  • Updated: Jun 20, 2026
  • Languages: en