---
library_name: transformers
tags: []
---
# Model Card for Meta-Llama-3-8B-for-bank
**Meta-Llama-3-8B-for-bank** is a fine-tuned version of `meta-llama/Meta-Llama-3-8B-Instruct`. This repository contains only the LoRA adapters, not the full model weights.
This is a **naive version**.
## Model Details
### Model Description
- **Model Name**: Meta-Llama-3-8B-for-bank
- **Base Model**: `meta-llama/Meta-Llama-3-8B-Instruct`
- **Fine-tuning Data**: Custom bank chat examples
- **Dataset**: jeromecondere/bank-chat
- **Version**: 1.0
- **License**: Free
- **Language**: English
### Model Type
- **Architecture**: LLaMA-3
- **Type**: Instruction-based language model
### Model Usage
This model is designed for financial service tasks such as:
- **Balance Inquiry**:
  - *Example*: "Can you provide the current balance for my account?"
- **Stock List Retrieval**:
  - *Example*: "Can you provide me with a list of my stocks?"
- **Stock Purchase**:
  - *Example*: "I'd like to buy stocks worth 1,000.00 in Tesla."
- **Deposit Transactions**:
  - *Example*: "I'd like to deposit 500.00 into my account."
- **Withdrawal Transactions**:
  - *Example*: "I'd like to withdraw 200.00 from my account."
- **Transaction History**:
  - *Example*: "I would like to view my transactions. Can you provide it?"
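
The example queries above can be wrapped in the chat-message format that the How to Use section passes to the tokenizer. The following is a minimal sketch: `EXAMPLE_QUERIES` and `build_messages` are hypothetical names introduced here for illustration, and the system greeting simply mirrors the one in the usage example. The exact format the adapters expect should be checked against the `jeromecondere/bank-chat` dataset.

```python
# Hypothetical helper for illustration; not part of the model repository.
EXAMPLE_QUERIES = {
    "balance_inquiry": "Can you provide the current balance for my account?",
    "stock_list": "Can you provide me with a list of my stocks?",
    "stock_purchase": "I'd like to buy stocks worth 1,000.00 in Tesla.",
    "deposit": "I'd like to deposit 500.00 into my account.",
    "withdrawal": "I'd like to withdraw 200.00 from my account.",
    "transaction_history": "I would like to view my transactions. Can you provide it?",
}


def build_messages(name: str, query: str) -> list[dict[str, str]]:
    """Return a two-turn chat: a system greeting followed by the user query."""
    return [
        # Greeting copied from the usage example; assumed to match training data.
        {"role": "system", "content": f"Hi {name}, I'm your assistant how can I help you"},
        {"role": "user", "content": query},
    ]


messages = build_messages("Walter Sensei", EXAMPLE_QUERIES["deposit"])
print(messages[1]["content"])
```

The resulting `messages` list can be passed to `tokenizer.apply_chat_template` in place of a hand-written one.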
### Inputs and Outputs
- **Inputs**: Natural language queries related to financial services.
- **Outputs**: Textual responses or actions based on the input query.
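
To make the input format concrete, the sketch below renders a message list into the prompt layout of the stock Llama 3 Instruct chat template. It assumes this repository's tokenizer keeps that template unchanged, which the card does not confirm. `render_llama3_prompt` is a hypothetical helper, and `tokenizer.apply_chat_template` remains the authoritative way to build the prompt.

```python
def render_llama3_prompt(messages, add_generation_prompt=True):
    """Approximate the stock Llama 3 Instruct chat template (an assumption)."""
    prompt = "<|begin_of_text|>"
    for message in messages:
        # Each turn is a role header, a blank line, the trimmed content, and an end-of-turn token.
        prompt += (
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt


example = [
    {"role": "system", "content": "Hi Walter Sensei, I'm your assistant how can I help you"},
    {"role": "user", "content": "I'd like to withdraw 200.00 from my account."},
]
print(render_llama3_prompt(example))
```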
### Fine-tuning
This model was fine-tuned on the `jeromecondere/bank-chat` dataset, which was created specifically to implement a bank chatbot.
## Limitations
- **Misinterpretation Risks**: This is the first version of the model, so overly complex queries may produce inconsistent results.
## How to Use
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
base_model = "meta-llama/Meta-Llama-3-8B-Instruct"  # base model listed under Model Description
new_model = "jeromecondere/Meta-Llama-3-8B-for-bank"
# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(new_model, use_fast=False)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"
# 4-bit (NF4) quantization configuration for loading the base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)
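# Load the base model, then apply the LoRA adapters and merge them
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map={"": 0},
    token=True,  # use the token saved by `huggingface-cli login` (the base model is gated)
)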
model = PeftModel.from_pretrained(model, new_model)
model = model.merge_and_unload()
# Example of usage
name = 'Walter Sensei'
company = 'Amazon Inc.'
stock_value = 42.24
messages = [
    {"role": "system", "content": f"Hi {name}, I'm your assistant how can I help you"},
    {"role": "user", "content": "yo, can you just give me the balance of my account?"},
]
# Prepare the message using the chat template
res1 = tokenizer.apply_chat_template(messages, tokenize=False)
print(res1+'\n\n')
# Prepare the messages for the model
input_ids = tokenizer.apply_chat_template(messages, truncation=True, add_generation_prompt=True, return_tensors="pt").to("cuda")
# Inference
outputs = model.generate(
    input_ids=input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    top_p=0.95,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```