Commit 9ceba2d by EryriLabs (1 parent: 2e9a516)

Update README.md

Files changed (1)
  1. README.md +107 -3
README.md CHANGED
@@ -1,3 +1,107 @@
- ---
- license: cc-by-nc-4.0
- ---
+ ---
+ license: cc-by-nc-4.0
+ datasets:
+ - EryriLabs/uk_legislation_alpaca_style_cleaned
+ base_model:
+ - EryriLabs/llama-3.2-uk-legislation-3b
+ ---
+ license: cc-by-4.0
+ datasets:
+ - santoshtyss/uk_legislation
+ language:
+ - en
+ base_model:
+ - unsloth/Llama-3.2-3B
+ tags:
+ - legal
+ ---
+
+ # Llama 3.2 UK Legislation Instruct 3B
+
+
+ <figure>
+ <img src="UKlegislation.png" alt="Llama 3.2 UK Legislation 3B" width="300">
+ </figure>
+
+ This model is a fine-tuned version of the Llama 3.2 UK Legislation 3B base model, instruction-tuned for question answering (Q&A) on UK legislation.
+ It was trained as part of a blog series; see [the first article](https://www.gpt-labs.ai/post/making-a-domain-specific-uk-legislation-llm-part-1-pretraining).
+
+ ## Model Details
+
+ ### Model Description
+ - **Developed by:** GPT-LABS.AI
+ - **Model type:** Transformer-based language model
+ - **Language:** English
+ - **License:** [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
+ - **Base model:** [llama-3.2-uk-legislation-3b](https://huggingface.co/EryriLabs/llama-3.2-uk-legislation-3b)
+
+ ### Model Sources
+ - **Repository:** [EryriLabs/llama-3.2-uk-legislation-instruct-3b](https://huggingface.co/EryriLabs/llama-3.2-uk-legislation-instruct-3b)
+ - **Blog Post:** [Making a Domain-Specific UK Legislation LLM: Part 1 - Pretraining](https://www.gpt-labs.ai/post/making-a-domain-specific-uk-legislation-llm-part-1-pretraining)
+
+ ## Uses
+
+ ### Intended Use
+ This model is designed for Q&A on UK legislation and as a starting point for further development on tasks such as:
+ - Domain-specific applications in law or other fields
+ - Research and experimentation in natural language processing
+ - General-purpose natural language understanding and generation
+
+ ### Out-of-Scope Use
+ This model is **not suitable** for:
+ - Providing domain-specific expertise
+ - Applications requiring high accuracy or nuanced understanding of UK legislation
+ - Tasks involving sensitive or critical real-world applications without rigorous evaluation
+
+ ## Bias, Risks, and Limitations
+
+ - **Bias:** The model may reflect biases inherent in its training data. Outputs should be critically evaluated for accuracy and fairness.
+ - **Risks:** The model may generate responses that are overly general or contextually inappropriate for specific tasks.
+ - **Limitations:** The model is fine-tuned only for Q&A on UK legislation, not for other specialised domains, and does not include the most recent developments in any field.
+
+ ## How to Get Started with the Model
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load model and tokenizer; device_map="auto" places the weights on the available device(s)
+ model = AutoModelForCausalLM.from_pretrained("EryriLabs/llama-3.2-uk-legislation-instruct-3b", device_map="auto")
+ tokenizer = AutoTokenizer.from_pretrained("EryriLabs/llama-3.2-uk-legislation-instruct-3b")
+
+ # Sample question
+ input_text = "What are the main principles of UK legislation?"
+
+ # Tokenize and move the inputs to the model's device (avoids hard-coding "cuda")
+ inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
+
+ # Generate up to 50 new tokens, passing the attention mask along with the input IDs
+ outputs = model.generate(**inputs, max_new_tokens=50)
+ response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+ print(response)
+ ```
+
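The dataset name in the front matter (`uk_legislation_alpaca_style_cleaned`) suggests the instruction data follows the Alpaca format. The sketch below wraps a question in the standard Alpaca prompt before tokenizing and trims the echoed prompt from the output. It assumes that template matches what was used in training, which the model card does not confirm; `ALPACA_TEMPLATE`, `build_prompt`, and `extract_response` are hypothetical names introduced here for illustration.

```python
# Assumption: the standard Alpaca prompt template. The model card does not
# state which template (if any) was used during instruction tuning.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Wrap a plain question in the assumed Alpaca-style template."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


def extract_response(generated_text: str) -> str:
    """Return the text after the last '### Response:' marker, if present."""
    marker = "### Response:"
    # rpartition yields an empty separator when the marker is absent
    _, found, tail = generated_text.rpartition(marker)
    return tail.strip() if found else generated_text.strip()


prompt = build_prompt("What are the main principles of UK legislation?")
print(prompt)
```

The resulting `prompt` would be passed to the tokenizer in place of `input_text` in the snippet above, and `extract_response` applied to the decoded output. If the model answers better with the bare question, the template assumption likely does not hold.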
+ ## Technical Specifications
+
+ - **Model Architecture:** Llama 3.2 3B, a transformer-based model designed for natural language processing tasks.
+ - **Training Data:** The underlying Llama 3.2 3B model was pretrained on a diverse dataset of general text; this model was instruction-tuned on the `EryriLabs/uk_legislation_alpaca_style_cleaned` dataset listed in the front matter.
+ - **Compute Infrastructure:** Training conducted on high-performance GPUs (e.g., NVIDIA A100).
+
+ ## Citation
+
+ If you use this model, please cite:
+
+ ```
+ @misc{llama3.2-uk-legislation-instruct-3b,
+ author = {GPT-LABS.AI},
+ title = {Llama 3.2 UK Legislation Instruct 3B},
+ year = {2024},
+ publisher = {Hugging Face},
+ url = {https://huggingface.co/EryriLabs/llama-3.2-uk-legislation-instruct-3b}
+ }
+ ```
+
+ ## Model Card Authors
+
+ - GPT-LABS.AI
+
+ ## Contact
+
+ For questions or feedback, please visit [gpt-labs.ai](https://www.gpt-labs.ai).