---
base_model: meta-llama/Meta-Llama-3-8B-Instruct
datasets:
- jeromecondere/bank-chat
library_name: transformers
---

# Model Card for jeromecondere/merged-llama-v3-for-bank
## WIP
If you only need the adapter, use **jeromecondere/Meta-Llama-3-8B-for-bank** ([Link](https://huggingface.co/jeromecondere/Meta-Llama-3-8B-for-bank)).


## Model Details

### Model Description


- **Developed by:** Jerome Condere
- **Finetuned from model:** Meta-Llama-3-8B-Instruct

## How to use it?
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

merged_model_id = 'jeromecondere/merged-llama-v3-for-bank'

# load the merged model in bfloat16 on the GPU
merged_model = AutoModelForCausalLM.from_pretrained(
    merged_model_id,
    torch_dtype=torch.bfloat16,
    device_map="cuda"
)
tokenizer = AutoTokenizer.from_pretrained(merged_model_id, use_fast=True)

name = 'Yalat Sensei'
company = 'Google Corp.'
stock_value = 42.24
messages = [
    {'role': 'system', 'content': f'Hi {name}, I\'m your assistant how can I help you\n'},
    {"role": "user", "content": f"I'd like to buy stocks worth {stock_value:.2f} in {company}.\n"},
    {"role": "system", "content": f"Sure, we have purchased stocks worth ###StockValue({stock_value:.2f}) in ###Company({company}) for you.\n"},
    {"role": "user", "content": f"Now I want to see my balance, hurry up!\n"},
    {"role": "system", "content": f"Sure, here's your balance ###Balance\n"},
    {"role": "user", "content": f"Again, my balance?\n"},
    {"role": "system", "content": f"We have your account details. Your balance is: ###Balance"},
    {"role": "user", "content": f"Okay now, I want my list of stocks"}
]
# prepare the messages for the model
input_ids = tokenizer.apply_chat_template(messages, truncation=True, add_generation_prompt=True, return_tensors="pt").to("cuda")

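# inference
outputs = merged_model.generate(
    input_ids=input_ids,
    max_new_tokens=120,
    # temperature, top_k and top_p only take effect when sampling is enabled,
    # either here or in the model's generation config
    #do_sample=True,
    temperature=0.5,
    top_k=50,
    top_p=0.95
)
# the decoded text contains the prompt followed by the generated reply
print(tokenizer.batch_decode(outputs)[0])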
```
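
The example conversation above uses markers such as `###StockValue(42.24)`, `###Company(Google Corp.)` and `###Balance` in the assistant replies. Below is a minimal sketch of how such markers could be pulled out of a generated reply. The tag shapes (`###Name(value)` and bare `###Name`) are inferred only from the example messages, and `TAG_PATTERN` and `extract_tags` are hypothetical helper names, not part of the model or of `transformers`.

```python
import re

# Assumed tag shapes, inferred from the example messages above:
#   ###Name(value)  e.g. ###StockValue(42.24)
#   ###Name         e.g. ###Balance
TAG_PATTERN = re.compile(r"###(\w+)(?:\(([^)]*)\))?")


def extract_tags(text):
    """Return (name, value) pairs for each ###Tag in text; value is None for bare tags."""
    return [(m.group(1), m.group(2)) for m in TAG_PATTERN.finditer(text)]


# hypothetical reply, copied from the example conversation
reply = "Sure, we have purchased stocks worth ###StockValue(42.24) in ###Company(Google Corp.) for you."
for tag_name, tag_value in extract_tags(reply):
    print(tag_name, tag_value)
```

The values come back as strings, so a caller would still need to convert amounts like `42.24` to numbers and decide what each bare tag such as `###Balance` should be replaced with.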