Who created mistral-7b-sft-beta?

mistral-7b-sft-beta was published by Hugging Face H4 on Hugging Face.

mistral-7b-sft-beta

Name: mistral-7b-sft-beta
Author: Hugging Face H4


HuggingFaceH4's SFT-beta checkpoint of Mistral-7B, fine-tuned on UltraChat 200k for instruction-following conversations.

Base model

mistralai/Mistral-7B-v0.1

Model Description

This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on the HuggingFaceH4/ultrachat_200k dataset. It is the SFT model that was used to train Zephyr-7B-β with Direct Preference Optimization.

It achieves the following results on the evaluation set:

Loss: 0.9399

Model description

Model type: A 7B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
Language(s) (NLP): Primarily English
License: MIT
Finetuned from model: mistralai/Mistral-7B-v0.1

Model Sources

Repository: https://github.com/huggingface/alignment-handbook

Intended uses & limitations

The model was fine-tuned with 🤗 TRL's SFTTrainer on a filtered and preprocessed version of the UltraChat dataset, which contains a diverse range of synthetic dialogues generated by ChatGPT.

Here's how you can run the model using the pipeline() function from 🤗 Transformers:

# Install transformers from source - only needed for versions <= v4.34
# pip install git+https://github.com/huggingface/transformers.git
# pip install accelerate

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="HuggingFaceH4/mistral-7b-sft-beta", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
# <|system|>
# You are a friendly chatbot who always responds in the style of a pirate.</s>
# <|user|>
# How many helicopters can a human eat in one sitting?</s>
# <|assistant|>
# Ah, me hearty matey! But yer question be a puzzler! A human cannot eat a helicopter in one sitting, as helicopters are not edible. They be made of metal, plastic, and other materials, not food!
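To make the printed prompt layout above easier to follow, here is a minimal sketch that rebuilds it by hand. The function name format_chat_prompt and the EOS_TOKEN constant are hypothetical, and the exact whitespace is inferred from the printed output rather than taken from the tokenizer's template, so treat apply_chat_template as the source of truth.

```python
# Hand-rolled approximation of the prompt layout shown in the printed output.
# Assumption: each turn is "<|role|>\n" + content + "</s>\n"; the real template
# lives in the tokenizer and may differ in whitespace details.

EOS_TOKEN = "</s>"  # end-of-sequence marker as it appears in the printed output


def format_chat_prompt(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts into a <|role|>-tagged prompt."""
    parts = []
    for message in messages:
        # Role tag on its own line, then the content closed by the EOS marker.
        parts.append(f"<|{message['role']}|>\n{message['content']}{EOS_TOKEN}\n")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|assistant|>\n")
    return "".join(parts)


messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = format_chat_prompt(messages)
print(prompt)
```

This is useful mainly for inspecting or debugging prompts. For actual generation, pass the string returned by pipe.tokenizer.apply_chat_template as in the example above.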

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 16
seed: 42
distributed_type: multi-GPU
num_devices: 16
gradient_accumulation_steps: 4
total_train_batch_size: 512
total_eval_batch_size: 256
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 1
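The totals in the list above follow from the per-device values, and the learning rate follows a cosine curve after a linear warmup. The sketch below checks the batch-size arithmetic and illustrates the schedule shape. The function cosine_lr_with_warmup and the value total_steps = 1000 are hypothetical (the source does not state the total number of optimizer steps), and rounding the warmup length up with ceil is an assumption about how the ratio is converted to steps.

```python
import math

# Values copied from the hyperparameter list.
train_batch_size = 8            # per device
eval_batch_size = 16            # per device
num_devices = 16
gradient_accumulation_steps = 4
learning_rate = 2e-05
warmup_ratio = 0.1

# Effective batch sizes: per-device size times devices (times accumulation for training).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices
print(total_train_batch_size, total_eval_batch_size)  # 512 256


def cosine_lr_with_warmup(step, total_steps, base_lr=learning_rate, ratio=warmup_ratio):
    """Linear warmup to base_lr, then cosine decay to zero (illustrative sketch)."""
    warmup_steps = math.ceil(total_steps * ratio)  # assumption: round up
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical; not stated in the source
for step in (0, 50, 100, 550, 1000):
    print(step, cosine_lr_with_warmup(step, total_steps))
```

Under these assumptions the rate rises linearly to 2e-05 over the first 10% of steps, then decays to zero by the final step.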

Training results

Training Loss	Epoch	Step	Validation Loss
0.9367	0.67	272	0.9397

Framework versions

Transformers 4.35.0.dev0
Pytorch 2.0.1+cu118
Datasets 2.12.0
Tokenizers 0.14.0

Author

Hugging Face H4

Organization

HuggingFaceH4

Details

Downloads: 4.4K

Likes: 25

Access: Open Source

Task: text-generation

License: MIT

Library: transformers

Created: 26 Oct 2023

Updated: 24 Sep 2024
