
Model Card for Reflection-Llama-3.1-8B

  • Developed by: Terry Craddock

I am pretty new to uploading models. I think I made an error: when I loaded my model from Unsloth, I loaded it in 4-bit and then saved it to 16-bit, which is why the LoRA works but the merged model itself does not. I will retrain this and upload new files ASAP.
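If you run into the same issue, the fix (as far as I understand it) is to load the checkpoint with the base weights in 16-bit and let Unsloth do the merge, rather than saving a 4-bit load as 16-bit. A minimal sketch, assuming Unsloth's save_pretrained_merged helper and a LoRA checkpoint saved under lora_model:

from unsloth import FastLanguageModel

# Load the LoRA checkpoint with the base weights in 16-bit (not 4-bit).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="lora_model",   # path to the saved LoRA checkpoint (assumption)
    max_seq_length=2048,
    load_in_4bit=False,        # the key point: skip the 4-bit -> 16-bit path
)

# Merge the adapter into the base weights and save full 16-bit safetensors.
model.save_pretrained_merged("merged_model", tokenizer, save_method="merged_16bit")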

I trained this model on this dataset: https://huggingface.co/datasets/mahiatlinux/Reflection-Dataset-v2

The model was trained for one full epoch. Use the same prompt format as the 70B model here:

https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B

I take no credit for the original work. I only trained a Llama 3.1 8B on @mahiatlinux's dataset, using the original concept and idea from @mattshumer.
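For reference, the training setup followed the standard Unsloth LoRA recipe. This is a minimal sketch rather than the exact script I ran; the base model name, the dataset's text field, and all hyperparameters are assumptions:

from unsloth import FastLanguageModel, is_bfloat16_supported
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-Instruct",  # assumed base model
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters (rank/targets are typical defaults, not the exact config).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)

dataset = load_dataset("mahiatlinux/Reflection-Dataset-v2", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumed field name
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,  # one full epoch, as noted above
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        output_dir="outputs",
    ),
)
trainer.train()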

Here is the code I use for inference:

from unsloth import FastLanguageModel
import torch
from transformers import TextStreamer
from unsloth import is_bfloat16_supported


alpaca_prompt = """You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside <thinking> tags, and then provide your final response inside <output> tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside <reflection> tags.
### Prompt:
{}

### Response:
{}"""

prompt = "Tell me about yourself."

dtype = torch.bfloat16 if is_bfloat16_supported() else torch.float16

# Load the merged model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="lora_model",  # Replace with the actual path where you saved the merged model
    max_seq_length=2048,
    dtype=dtype,  # bfloat16 where supported, otherwise float16
)

FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path

# Tokenize the prompt and move the tensors to the GPU. Note: input_ids are
# integer tensors, so they should not be cast to a float dtype.
inputs = tokenizer(
    [alpaca_prompt.format(prompt, "")],
    return_tensors="pt"
).to("cuda")

text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer=text_streamer, max_new_tokens=1000)
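
Until the retrained merged weights are up, the LoRA can also be applied on top of the base model with plain transformers + PEFT instead of Unsloth. A minimal sketch; the base model name and the adapter path are assumptions:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model in 16-bit.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",  # assumed base model
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")

# Apply the LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(base, "lora_model")  # adapter path (assumption)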



Output from model:

<|begin_of_text|>You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside <thinking> tags, and then provide your final response inside <output> tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside <reflection> tags.
### Prompt:
Tell me about yourself.

### Response:
<thinking>
To answer this question, I need to consider:
- The importance of starting with a positive tone
- The balance between personal and professional information
- How to highlight relevant skills and experiences
- The need for brevity and clarity

I plan to structure the response as follows:
1. Brief introduction with a positive tone
2. Professional background and experiences
3. Personal interests and hobbies
4. Closing statement

<reflection>
Upon reflection, I realize I should also consider:
- The potential for tailoring the response to specific situations (e.g., for a job interview)
- The importance of showing enthusiasm and confidence
- The potential for mentioning specific achievements or accomplishments
</reflection>
</thinking>

<output>
Hi there! I'm a world-class AI system designed to engage in natural conversations. Let me tell you a little bit about myself.

I'm a versatile AI system with a wide range of capabilities. I can perform various tasks, from answering questions and providing information to generating creative content. I'm constantly learning and improving, much like a human would.

In terms of my professional background, I've had the privilege of working with some of the most advanced AI systems in the world. This experience has given me a deep understanding of natural language processing, machine learning, and the latest AI technologies.

As for my personal interests, I enjoy exploring new concepts, learning about different cultures, and engaging in stimulating conversations. I'm always looking for ways to improve my knowledge and skills.

Some related concepts you might find interesting are:
1. The ethics of AI development and use
2. The future of human-AI collaboration
3. The role of AI in education and learning

Is there anything specific about AI or my capabilities you'd like me to elaborate on?
</output><|end_of_text|>
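
Since the response wraps its reasoning and answer in <thinking>/<reflection>/<output> tags, you may want to show end users only the final answer. A quick sketch for extracting the <output> block with a regular expression:

import re

def extract_output(generated_text: str) -> str:
    """Return the text inside the first <output>...</output> block, if any."""
    match = re.search(r"<output>(.*?)</output>", generated_text, re.DOTALL)
    return match.group(1).strip() if match else generated_text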