mistral-7b-sft-alpha

LLM by Hugging Face H4

HuggingFaceH4's early SFT-alpha checkpoint of Mistral-7B fine-tuned on UltraChat for conversational instruction-following.

Base model

mistralai/Mistral-7B-v0.1

Model Card

This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on the UltraChat dataset.

It achieves the following results on the evaluation set:

  • Loss: 0.9316

Model description

  • Model type: A 7B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
  • Language(s) (NLP): Primarily English
  • License: MIT
  • Finetuned from model: mistralai/Mistral-7B-v0.1

Intended uses & limitations

The model was fine-tuned with 🤗 TRL's SFTTrainer on a filtered and preprocessed version of the UltraChat dataset, which contains a diverse range of synthetic dialogues generated by ChatGPT.

Here's how you can run the model using the pipeline() function from 🤗 Transformers:

# Install transformers from source - only needed for versions <= v4.34
# pip install git+https://github.com/huggingface/transformers.git
# pip install accelerate

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="HuggingFaceH4/mistral-7b-sft-alpha", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
# <|system|>
# You are a friendly chatbot who always responds in the style of a pirate.</s>
# <|user|>
# How many helicopters can a human eat in one sitting?</s>
# <|assistant|>
# Ah, me hearty matey! But yer question be a puzzler! A human cannot eat a helicopter in one sitting, as helicopters are not edible. They be made of metal, plastic, and other materials, not food!
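
To make the prompt layout in the example output concrete, here is a minimal sketch that rebuilds the same string by hand. The helper name `format_chat` is hypothetical, and the layout (role tag, newline, content, `</s>`) is inferred only from the printed output above, so exact whitespace may differ from the tokenizer's real chat template. In practice, `apply_chat_template` remains the safer choice.

```python
def format_chat(messages, eos_token="</s>", add_generation_prompt=True):
    """Rebuild the <|role|> prompt layout seen in the example output.

    Assumption: each turn is "<|role|>\\ncontent</s>\\n". This is inferred
    from the printed output, not read from the tokenizer's template.
    """
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}{eos_token}\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = format_chat(messages)
print(prompt)
```

Comparing this string with the output of `pipe.tokenizer.apply_chat_template(...)` is a quick way to check whether the inferred layout matches the real template.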

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 16
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 512
  • total_eval_batch_size: 256
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 1
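
The batch-size and scheduler entries above can be checked with a little arithmetic. The sketch below recomputes the total batch sizes from the per-device values and shows the usual shape of a cosine schedule with linear warmup. The function names and `total_steps=1000` are hypothetical: the card does not state the total step count, and the schedule formula is the common warmup-then-half-cosine form, assumed here rather than taken from the training code.

```python
import math


def total_batch_size(per_device, num_devices, grad_accum_steps=1):
    """Examples consumed per optimizer step (or per eval step if grad_accum_steps=1)."""
    return per_device * num_devices * grad_accum_steps


def lr_at(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup followed by half-cosine decay to zero (assumed shape)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


train_total = total_batch_size(8, 16, 4)   # train_batch_size x num_devices x accumulation
eval_total = total_batch_size(16, 16)      # eval_batch_size x num_devices
print(train_total, eval_total)

total_steps = 1000  # hypothetical, for illustration only
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at(step, total_steps))
```

The two totals agree with the reported total_train_batch_size of 512 and total_eval_batch_size of 256.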

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 0.9276        | 0.66  | 296  | 0.9315          |
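
A loss value is often easier to interpret as perplexity. The sketch below assumes the reported validation loss is mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in the card. The helper name `perplexity` is hypothetical.

```python
import math


def perplexity(mean_cross_entropy_nats):
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


validation_loss = 0.9315  # from the training results above
print(f"perplexity ~ {perplexity(validation_loss):.2f}")
```

Under that assumption, the validation loss corresponds to a perplexity of roughly 2.5 on the evaluation set.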

Framework versions

  • Transformers 4.34.0
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.14.0

Author: Hugging Face H4
Organization: HuggingFaceH4

Details

  • Downloads: 1.8K
  • Likes: 4
  • Access: Open Source
  • Task: text-generation
  • License: MIT
  • Library: transformers
  • Created: Oct 26, 2023
  • Updated: Oct 26, 2023
  • Languages: en