---
library_name: transformers
tags: []
---

# Model Card for Meta-Llama-3-8B-for-bank

This model, **Meta-Llama-3-8B-for-bank**, is a LoRA fine-tune of the `meta-llama/Meta-Llama-3-8B-Instruct` model; this repository contains only the LoRA adapters, not the full base-model weights.

## Model Details

### Model Description

- **Model Name**: Meta-Llama-3-8B-for-bank
- **Base Model**: `meta-llama/Meta-Llama-3-8B-Instruct`
- **Fine-tuning Data**: Custom bank chat examples
- **Dataset**: jeromecondere/bank-chat
- **Version**: 1.0
- **License**: Meta Llama 3 Community License (inherited from the base model)
- **Language**: English

### Model Type

- **Architecture**: LLaMA-3
- **Type**: Instruction-tuned language model

### Model Usage

This model is designed for financial-service tasks such as the following (see the prompt-formatting sketch after this list):

- **Balance Inquiry**:
  - *Example*: "Can you provide the current balance for my account?"
- **Stock List Retrieval**:
  - *Example*: "Can you provide me with a list of my stocks?"
- **Stock Purchase**:
  - *Example*: "I'd like to buy stocks worth $1,000.00 in Tesla."
- **Deposit Transactions**:
  - *Example*: "I'd like to deposit $500.00 into my account."
- **Withdrawal Transactions**:
  - *Example*: "I'd like to withdraw $200.00 from my account."
- **Transaction History**:
  - *Example*: "I would like to view my transactions. Can you provide it?"

### Inputs and Outputs

- **Inputs**: Natural language queries related to financial services.
- **Outputs**: Textual responses or actions based on the input query.

### Fine-tuning

The model was fine-tuned on `jeromecondere/bank-chat`, a dataset created specifically for a banking chatbot.
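
To inspect the training data, the dataset can be pulled from the Hub with the `datasets` library (a minimal sketch; the `train` split name is an assumption):

```python
from datasets import load_dataset

# Load the bank-chat fine-tuning dataset from the Hugging Face Hub
dataset = load_dataset("jeromecondere/bank-chat", split="train")  # split name assumed
print(dataset[0])  # inspect one chat example
```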


## Limitations

- **Misinterpretation Risks**: This is the first version of the model; overly complex or ambiguous queries may yield inconsistent results.



## How to Use

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_model = 'meta-llama/Meta-Llama-3-8B-Instruct'  # base checkpoint named in the model description above
new_model = "jeromecondere/Meta-Llama-3-8B-for-bank"
token = "hf_..."  # your Hugging Face access token (Llama 3 checkpoints are gated)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(new_model, use_fast=False)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# 4-bit quantization configuration used when loading the base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load the quantized base model
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map={"": 0},
    token=token
)

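# Attach the fine-tuned LoRA adapters and merge them into the base weights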
model = PeftModel.from_pretrained(model, new_model)
model = model.merge_and_unload()


# Example usage
name = 'Walter Sensei'
company = 'Amazon Inc.'   # example values for other query types (unused in this query)
stock_value = 42.24
messages = [
    {'role': 'system', 'content': f"Hi {name}, I'm your assistant. How can I help you?"},
    {'role': 'user', 'content': "Yo, can you just give me the balance of my account?"}
]

# Render the chat template as plain text (for inspection)
res1 = tokenizer.apply_chat_template(messages, tokenize=False)
print(res1 + '\n\n')

# Prepare the messages for the model
input_ids = tokenizer.apply_chat_template(messages, truncation=True, add_generation_prompt=True, return_tensors="pt").to("cuda")

# Inference
outputs = model.generate(
    input_ids=input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    top_p=0.95,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
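
Because `merge_and_unload()` folds the adapter weights into the base model, the merged model can optionally be saved and later reloaded without `peft` (a sketch; the output directory name is illustrative):

```python
# Save the merged model so it can be reloaded with plain transformers
merged_dir = "Meta-Llama-3-8B-for-bank-merged"  # illustrative path
model.save_pretrained(merged_dir)
tokenizer.save_pretrained(merged_dir)
```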