How many parameters does zephyr-7b-gemma-sft-v0.1 have?

zephyr-7b-gemma-sft-v0.1 has approximately 8.5 billion parameters.

Who created zephyr-7b-gemma-sft-v0.1?

zephyr-7b-gemma-sft-v0.1 was created by Hugging Face H4 and published on Hugging Face.

zephyr-7b-gemma-sft-v0.1

Name: zephyr-7b-gemma-sft-v0.1
Author: Hugging Face H4

LLM by Hugging Face H4

HuggingFaceH4's 8.5B Zephyr chat model built on Gemma 7B via supervised fine-tuning on the Deita 10k dataset.

Base model

google/gemma-7b

Model Description

This model is a fine-tuned version of google/gemma-7b on the HuggingFaceH4/deita-10k-v0-sft dataset. It achieves the following results on the evaluation set:

Loss: 0.9732

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
distributed_type: multi-GPU
num_devices: 16
gradient_accumulation_steps: 2
total_train_batch_size: 128
total_eval_batch_size: 64
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 3
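
The two batch-size totals in the list above follow from the per-device values, the device count, and the gradient accumulation steps. The sketch below reproduces that arithmetic and illustrates what a cosine schedule with a 0.1 warmup ratio might look like. The step total of 897 is taken from the training results table; the exact schedule formula (linear warmup to the peak, then cosine decay to zero) and the helper name `lr_at` are assumptions for illustration, not something the model card confirms.

```python
import math

# Values copied from the hyperparameter list.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 4    # per device
EVAL_BATCH_SIZE = 4     # per device
NUM_DEVICES = 16
GRAD_ACCUM_STEPS = 2
WARMUP_RATIO = 0.1
TOTAL_STEPS = 897       # last step listed in the training results table

# Effective batch sizes across all devices.
total_train_batch_size = TRAIN_BATCH_SIZE * NUM_DEVICES * GRAD_ACCUM_STEPS
total_eval_batch_size = EVAL_BATCH_SIZE * NUM_DEVICES


def lr_at(step, total_steps=TOTAL_STEPS, peak=LEARNING_RATE,
          warmup_ratio=WARMUP_RATIO):
    """Hypothetical helper: linear warmup to peak, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch_size, total_eval_batch_size)
```

Under these assumptions the learning rate starts at zero, reaches 2e-05 once warmup ends, and decays back to zero by the final step.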

Training results

Training Loss	Epoch	Step	Validation Loss
0.9482	1.0	299	0.9848
0.8139	2.0	599	0.9610
0.722	2.99	897	0.9732
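
A short sketch for reading the table above: it picks the row with the lowest validation loss and converts the final validation loss into a perplexity. The conversion assumes the reported loss is a mean per-token cross-entropy in nats, which the model card does not state explicitly; `RESULTS` and `perplexity` are illustrative names.

```python
import math

# (training loss, epoch, step, validation loss), copied from the table.
RESULTS = [
    (0.9482, 1.0, 299, 0.9848),
    (0.8139, 2.0, 599, 0.9610),
    (0.722, 2.99, 897, 0.9732),
]

# Row with the lowest validation loss, and the last logged row.
best = min(RESULTS, key=lambda row: row[3])
final = RESULTS[-1]


def perplexity(loss):
    # Assumes the loss is a mean per-token cross-entropy in nats.
    return math.exp(loss)


print(f"lowest validation loss: {best[3]} at epoch {best[1]}")
print(f"final validation perplexity: {perplexity(final[3]):.2f}")
```

The validation loss is lowest after epoch 2 and rises slightly by the end of epoch 3 while the training loss keeps falling, which may indicate mild overfitting in the last epoch. The evaluation loss reported in the model description (0.9732) matches the final row, not the lowest one.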

Framework versions

Transformers 4.39.0.dev0
PyTorch 2.1.2+cu121
Datasets 2.14.6
Tokenizers 0.15.1

Details

Author: Hugging Face H4
Organization: HuggingFaceH4
Downloads: 114
Likes: 13
Access: Open Source
Task: text-generation
Parameters: 8.5B
License: other
Library: transformers
Created: Mar 1, 2024
Updated: Mar 1, 2024

