---
language: en
datasets:
- truthful_qa
license: apache-2.0
tags:
- qlora
- falcon
- fine-tuning
- nlp
- causal-lm
- h100
library_name: peft
base_model: tiiuae/falcon-7b-instruct
---

# Falcon-7B QLoRA Fine-Tuned on TruthfulQA

## Model Description

This model is a fine-tuned version of `tiiuae/falcon-7b-instruct`, trained with the QLoRA technique on the [TruthfulQA](https://huggingface.co/datasets/truthful_qa) dataset.

## Training

- **Base Model**: [tiiuae/falcon-7b-instruct](https://huggingface.co/tiiuae/falcon-7b-instruct)
- **Dataset**: [TruthfulQA](https://huggingface.co/datasets/truthful_qa)
- **Training Technique**: QLoRA
- **Hardware**: H100 GPUs
- **Epochs**: 10
- **Batch Size**: 16
- **Learning Rate**: 2e-4

A minimal sketch of a comparable QLoRA training setup is shown after the usage example below.

### How to Use

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and tokenizer
base_model_name = "tiiuae/falcon-7b-instruct"
base_model = AutoModelForCausalLM.from_pretrained(base_model_name)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Load the adapter and apply it to the base model
adapter_repo_name = "MohammadOthman/falcon-7b-qlora-truthfulqa"
model = PeftModel.from_pretrained(base_model, adapter_repo_name)

# Move the model to GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

def generate_text(prompt, max_length=100, num_return_sequences=1):
    # Tokenize the input prompt
    inputs = tokenizer(prompt, return_tensors="pt").to(device)

    # Generate text with sampling
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
        num_return_sequences=num_return_sequences,
        do_sample=True,
        temperature=0.7,
    )

    # Decode and print each generated sequence
    for i, output in enumerate(outputs):
        print(f"Generated Text {i + 1}: {tokenizer.decode(output, skip_special_tokens=True)}")

# Example usage
prompt = "Once upon a time in a land far, far away"
generate_text(prompt)
```
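
### Training Setup (Sketch)

The following is a minimal sketch of a QLoRA training setup consistent with the hyperparameters listed above, not the exact script used to produce this adapter. Only the base model, dataset name, epoch count, batch size, and learning rate come from this card; the TruthfulQA config and fields (`generation`, `question`, `best_answer`), the prompt format, and the LoRA settings (`r`, `lora_alpha`, `lora_dropout`, `target_modules`) are assumptions shown for illustration.

```python
# Sketch only: hyperparameters marked "illustrative" below are assumptions,
# not values taken from this model card.
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model_name = "tiiuae/falcon-7b-instruct"

# 4-bit NF4 quantization of the frozen base model (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
tokenizer.pad_token = tokenizer.eos_token  # Falcon has no dedicated pad token

model = AutoModelForCausalLM.from_pretrained(
    base_model_name, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapters on Falcon's fused attention projection (values are illustrative)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Assumed data preparation: TruthfulQA "generation" config, question + best answer
dataset = load_dataset("truthful_qa", "generation", split="validation")

def tokenize(example):
    text = f"Question: {example['question']}\nAnswer: {example['best_answer']}"
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

# Epochs, batch size, and learning rate match the values listed above
training_args = TrainingArguments(
    output_dir="falcon-7b-qlora-truthfulqa",
    num_train_epochs=10,
    per_device_train_batch_size=16,
    learning_rate=2e-4,
    bf16=True,
    logging_steps=10,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# Saves only the LoRA adapter weights, which can then be loaded with
# PeftModel.from_pretrained as shown in the usage example above.
model.save_pretrained("falcon-7b-qlora-truthfulqa")
```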