---
library_name: transformers
tags: []
---

# Model Card for Meta-Llama-3-8B-for-bank

This model, **Meta-Llama-3-8B-for-bank**, is a fine-tuned version of `meta-llama/Meta-Llama-3-8B-Instruct`. The repository contains only the LoRA adapter weights, so the base model has to be loaded separately. This is a **naive version**.

## Model Details

### Model Description

- **Model Name**: Meta-Llama-3-8B-for-bank
- **Base Model**: `meta-llama/Meta-Llama-3-8B-Instruct`
- **Fine-tuning Data**: Custom bank chat examples
- **Dataset**: jeromecondere/bank-chat
- **Version**: 1.0
- **License**: Free
- **Language**: English

### Model Type

- **Architecture**: LLaMA-3
- **Type**: Instruction-based language model

### Model Usage

This model is designed for financial service tasks such as:

- **Balance Inquiry**:
  - *Example*: "Can you provide the current balance for my account?"
- **Stock List Retrieval**:
  - *Example*: "Can you provide me with a list of my stocks?"
- **Stock Purchase**:
  - *Example*: "I'd like to buy stocks worth 1,000.00 in Tesla."
- **Deposit Transactions**:
  - *Example*: "I'd like to deposit 500.00 into my account."
- **Withdrawal Transactions**:
  - *Example*: "I'd like to withdraw 200.00 from my account."
- **Transaction History**:
  - *Example*: "I would like to view my transactions. Can you provide it?"

### Inputs and Outputs

- **Inputs**: Natural language queries related to financial services.
- **Outputs**: Textual responses or actions based on the input query.

### Fine-tuning

This model was fine-tuned on a dataset created specifically for building a bank chatbot.

## Limitations

- **Misinterpretation Risks**: This is the first version of the model, so overly complex queries may produce inconsistent results.

## How to Use

```python
import os

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Base model listed in Model Details; the adapter repo holds only the LoRA weights
base_model = "meta-llama/Meta-Llama-3-8B-Instruct"
new_model = "jeromecondere/Meta-Llama-3-8B-for-bank"

# Access token for the gated base model, read from the HF_TOKEN environment
# variable (None falls back to the credentials cached by `huggingface-cli login`)
token = os.environ.get("HF_TOKEN")

# Load the tokenizer shipped with the adapter
tokenizer = AutoTokenizer.from_pretrained(new_model, use_fast=False)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# 4-bit NF4 quantization configuration for the base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load the quantized base model on GPU 0
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map={"": 0},
    token=token,
)

# Attach the LoRA adapter, then merge it into the base weights
model = PeftModel.from_pretrained(model, new_model)
model = model.merge_and_unload()

# Example of usage (company and stock_value are not used by this query)
name = "Walter Sensei"
company = "Amazon Inc."
stock_value = 42.24
messages = [
    {"role": "system", "content": f"Hi {name}, I'm your assistant how can I help you"},
    {"role": "user", "content": "yo, can you just give me the balance of my account?"},
]

# Render the chat template as text to inspect the prompt
res1 = tokenizer.apply_chat_template(messages, tokenize=False)
print(res1 + "\n\n")

# Tokenize the messages and append the assistant header for generation
input_ids = tokenizer.apply_chat_template(
    messages,
    truncation=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Inference
outputs = model.generate(
    input_ids=input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    top_p=0.95,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
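
The example queries listed under Model Usage can be fed through the same pipeline by wrapping each one in the chat-message format used above. The sketch below is a minimal illustration, not part of the model repository: `EXAMPLE_QUERIES`, `build_messages`, and `strip_prompt` are hypothetical names, and the system greeting is assumed to be the one from the usage example. `strip_prompt` relies on the fact that `generate` on a decoder-only model returns the prompt tokens followed by the new tokens, so slicing off the prompt length leaves only the reply.

```python
# Hypothetical helpers for illustration; not part of the model repository.

# Example queries copied from the Model Usage section.
EXAMPLE_QUERIES = {
    "balance_inquiry": "Can you provide the current balance for my account?",
    "stock_list": "Can you provide me with a list of my stocks?",
    "stock_purchase": "I'd like to buy stocks worth 1,000.00 in Tesla.",
    "deposit": "I'd like to deposit 500.00 into my account.",
    "withdrawal": "I'd like to withdraw 200.00 from my account.",
    "transaction_history": "I would like to view my transactions. Can you provide it?",
}


def build_messages(name, query):
    """Return a chat in the role/content format accepted by apply_chat_template."""
    return [
        # System greeting assumed to match the one in the usage example.
        {"role": "system", "content": f"Hi {name}, I'm your assistant how can I help you"},
        {"role": "user", "content": query},
    ]


def strip_prompt(output_ids, prompt_length):
    """Drop the first prompt_length token ids, keeping only the generated ones.

    Works on a plain list or a 1-D tensor, since both support slicing.
    """
    return output_ids[prompt_length:]


if __name__ == "__main__":
    # Print the user turn that would be sent for each task.
    for task, query in EXAMPLE_QUERIES.items():
        messages = build_messages("Walter Sensei", query)
        print(task, "->", messages[1]["content"])
```

With the objects from the usage example, `tokenizer.decode(strip_prompt(outputs[0], input_ids.shape[-1]), skip_special_tokens=True)` would print only the assistant's reply instead of the full prompt plus reply.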