---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
tags:
- bitsandbytes
- quantized
- 8bit
- Mistral
- Mistral-7B
- bnb
---

# Model Card for alokabhishek/Mistral-7B-Instruct-v0.2-bnb-8bit

This repo contains the 8-bit quantized (using bitsandbytes) version of Mistral AI's Mistral-7B-Instruct-v0.2.

## Model Details

- Model creator: [Mistral AI_](https://huggingface.co/mistralai)
- Original model: [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)

### About 8-bit quantization using bitsandbytes

- QLoRA: Efficient Finetuning of Quantized LLMs: [arXiv - QLoRA: Efficient Finetuning of Quantized LLMs](https://arxiv.org/abs/2305.14314)
- Hugging Face blog post on 8-bit quantization using bitsandbytes: [A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using Hugging Face Transformers, Accelerate and bitsandbytes](https://huggingface.co/blog/hf-bitsandbytes-integration)
- bitsandbytes GitHub repo: [bitsandbytes github repo](https://github.com/TimDettmers/bitsandbytes)

A sketch of how such an 8-bit checkpoint can be produced with bitsandbytes is included at the end of this card.

# How to Get Started with the Model

Use the code below to get started with the model.

## How to run from Python code

#### First install the packages

```shell
pip install --quiet bitsandbytes
pip install --quiet --upgrade transformers  # install the latest version of transformers
pip install --quiet --upgrade accelerate
pip install --quiet sentencepiece
pip install flash-attn --no-build-isolation
```

#### Import the libraries

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
```

#### Use a pipeline as a high-level helper

```python
model_id_mistral = "alokabhishek/Mistral-7B-Instruct-v0.2-bnb-8bit"

# Load the tokenizer and the pre-quantized 8-bit model
tokenizer_mistral = AutoTokenizer.from_pretrained(model_id_mistral, use_fast=True)
model_mistral = AutoModelForCausalLM.from_pretrained(
    model_id_mistral, device_map="auto"
)

# Wrap the model and tokenizer in a text-generation pipeline
pipe_mistral = pipeline(model=model_mistral, tokenizer=tokenizer_mistral, task="text-generation")

prompt_mistral = "Tell me a funny joke about Large Language Models meeting a Blackhole in an intergalactic Bar."

output_mistral = pipe_mistral(prompt_mistral, max_new_tokens=512)
print(output_mistral[0]["generated_text"])
```

## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

## Evaluation

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]
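
## Reproducing an 8-bit quantization with bitsandbytes (sketch)

The snippet below is a minimal, hedged sketch of how a checkpoint like this one can be produced: load the original `mistralai/Mistral-7B-Instruct-v0.2` with a `BitsAndBytesConfig(load_in_8bit=True)` and push the quantized weights to the Hub. It is an illustration of the general bitsandbytes 8-bit workflow in recent transformers versions, not necessarily the exact commands used to create this repo; the target repo name `your-username/Mistral-7B-Instruct-v0.2-bnb-8bit` is a placeholder.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_model_id = "mistralai/Mistral-7B-Instruct-v0.2"

# 8-bit (LLM.int8()) quantization config from bitsandbytes
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(base_model_id, use_fast=True)

# Loading with quantization_config quantizes the linear layers to 8-bit on the fly
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Upload the quantized checkpoint (repo name is a placeholder)
model.push_to_hub("your-username/Mistral-7B-Instruct-v0.2-bnb-8bit")
tokenizer.push_to_hub("your-username/Mistral-7B-Instruct-v0.2-bnb-8bit")
```

Because the quantization config is saved with the checkpoint, users can then load the resulting repo directly with `AutoModelForCausalLM.from_pretrained(..., device_map="auto")`, as shown in the usage example above.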