AIGym committed on
Commit 94077b0 • 1 Parent(s): 82e4600

Update README.md

Files changed (1):
README.md (+41 -37)
README.md CHANGED
@@ -1,37 +1,41 @@
- ---
- license: apache-2.0
- datasets:
- - cerebras/SlimPajama-627B
- - bigcode/starcoderdata
- language:
- - en
- ---
- <div align="center">
-
- # TinyLlama-1.1B
- </div>
-
- https://github.com/jzhang38/TinyLlama
-
- The TinyLlama project aims to **pretrain** a **1.1B Llama model on 3 trillion tokens**. With some proper optimization, we can achieve this within a span of "just" 90 days using 16 A100-40G GPUs 🚀🚀. The training has started on 2023-09-01.
-
- <div align="center">
- <img src="./TinyLlama_logo.png" width="300"/>
- </div>
-
- We adopted exactly the same architecture and tokenizer as Llama 2. This means TinyLlama can be plugged and played in many open-source projects built upon Llama. Besides, TinyLlama is compact with only 1.1B parameters. This compactness allows it to cater to a multitude of applications demanding a restricted computation and memory footprint.
-
- #### This Collection
- This collection contains all checkpoints after the 1T fix. Branch name indicates the step and number of tokens seen.
-
- #### Eval
-
- | Model | Pretrain Tokens | HellaSwag | Obqa | WinoGrande | ARC_c | ARC_e | boolq | piqa | avg |
- |-------------------------------------------|-----------------|-----------|------|------------|-------|-------|-------|------|-----|
- | Pythia-1.0B | 300B | 47.16 | 31.40 | 53.43 | 27.05 | 48.99 | 60.83 | 69.21 | 48.30 |
- | TinyLlama-1.1B-intermediate-step-50K-104b | 103B | 43.50 | 29.80 | 53.28 | 24.32 | 44.91 | 59.66 | 67.30 | 46.11 |
- | TinyLlama-1.1B-intermediate-step-240k-503b | 503B | 49.56 | 31.40 | 55.80 | 26.54 | 48.32 | 56.91 | 69.42 | 48.28 |
- | TinyLlama-1.1B-intermediate-step-480k-1007B | 1007B | 52.54 | 33.40 | 55.96 | 27.82 | 52.36 | 59.54 | 69.91 | 50.22 |
- | TinyLlama-1.1B-intermediate-step-715k-1.5T | 1.5T | 53.68 | 35.20 | 58.33 | 29.18 | 51.89 | 59.08 | 71.65 | 51.29 |
- | TinyLlama-1.1B-intermediate-step-955k-2T | 2T | 54.63 | 33.40 | 56.83 | 28.07 | 54.67 | 63.21 | 70.67 | 51.64 |
- | **TinyLlama-1.1B-intermediate-step-1195k-token-2.5T** | **2.5T** | **58.96** | **34.40** | **58.72** | **31.91** | **56.78** | **63.21** | **73.07** | **53.86** |
 
+ # TinyLlama-1.1B-2.5T-chat
+ It was created by starting from the TinyLlama-1.1B 2.5T checkpoint (TinyLlama-1.1B-intermediate-step-1195k-token-2.5T) and fine-tuning it on the OpenAssistant dataset. We have attached the wandb report as a PDF so you can view the training run at a glance.
+
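+ For context, the OpenAssistant data can be inspected directly (a minimal sketch, assuming the OpenAssistant/oasst1 release of the dataset; this is an illustration, not the actual training code):
+ ```python
+ from datasets import load_dataset
+
+ # Assumption: the oasst1 release of the OpenAssistant dataset.
+ ds = load_dataset("OpenAssistant/oasst1", split="train")
+
+ # Each row is one message in a conversation tree; "prompter" and
+ # "assistant" turns pair up via message_id/parent_id and can be rendered
+ # in the chat format this card uses: "### Human: ...### Assistant: ...".
+ first = ds[0]
+ print(first["role"], "->", first["text"][:80])
+ ```
+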
+ # Reason
+ This model was fine-tuned to follow directions. It is a stepping stone to further training, but it is already useful for asking questions about code.
+
+ # How to use
+ You will need transformers>=4.31.
+ ```python
+ from transformers import AutoTokenizer
+ import transformers
+ import torch
+
+ model = "AIGym/TinyLlama-1.1B-2.5T-chat"
+ tokenizer = AutoTokenizer.from_pretrained(model)
+
+ # Build a text-generation pipeline, loading the model in fp16 and
+ # placing it on the available devices automatically.
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model,
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
+
+ # The model expects the "### Human: ...### Assistant:" chat format.
+ prompt = "What are the values in open source projects?"
+ formatted_prompt = f"### Human: {prompt}### Assistant:"
+
+ # Sample a single response of up to 500 new tokens.
+ sequences = pipeline(
+     formatted_prompt,
+     do_sample=True,
+     top_k=50,
+     top_p=0.7,
+     num_return_sequences=1,
+     repetition_penalty=1.1,
+     max_new_tokens=500,
+ )
+ for seq in sequences:
+     print(f"Result: {seq['generated_text']}")
+ ```
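+
+ The pipeline output includes the prompt as well as the completion. To keep only the model's reply, you can split on the assistant tag (a small hypothetical helper, not part of the original card):
+ ```python
+ def extract_reply(generated_text: str) -> str:
+     # Keep only the text generated after the "### Assistant:" tag.
+     return generated_text.split("### Assistant:", 1)[-1].strip()
+
+ print(extract_reply(sequences[0]["generated_text"]))
+ ```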
+
+ # Referrals
+ RunPod - This is who I use to train the models on Hugging Face. If you use this link, we both get free credits. - <a href="https://runpod.io?ref=kilq83n1" target="_blank" style="color: #3498db; text-decoration: none; font-weight: bold;">Visit RunPod's Website!</a>
+
+ PayPal - If you want to leave a tip, it is appreciated. - <a href="https://paypal.me/OpenSourceTraining" target="_blank" style="color: #3498db; text-decoration: none; font-weight: bold;">Visit My PayPal!</a>