Muennighoff committed on
Commit
9a00d10
1 Parent(s): c961b6b

Create README.md

Files changed (1)
  1. README.md +51 -0
README.md ADDED
@@ -0,0 +1,51 @@
+ ---
+ pipeline_tag: text-generation
+ inference: true
+ license: apache-2.0
+ ---
+
+ # Table of Contents
+
+ 1. [Model Summary](#model-summary)
+ 2. [Use](#use)
+ 3. [Training](#training)
+ 4. [Citation](#citation)
+
+ # Model Summary
+
+ > GritLM is a generative-representational instruction-tuned language model. It performs well at both text representation and text generation.
+
+ - **Repository:** [ContextualAI/gritlm](https://github.com/ContextualAI/gritlm)
+ - **Paper:** [TODO](https://arxiv.org/abs/2308.07124)
+
+ # Use
+
+ The model's usage is documented [here](TODO). It supports GritLM, Transformers, and Sentence Transformers.
+
+ # Training
+
+ ## Model
+
+ - **Architecture:** Mixtral-8x7B
+ - **Steps:** 250k pretraining & 30 instruction tuning
+ - **Tokens:** ? pretraining & 2M instruction tuning
+ - **Precision:** bfloat16
+
+ ## Hardware
+
+ - **Pretraining:**
+   - **GPUs:** 512 Tesla A100
+   - **Training time:** 1 day
+ - **Instruction tuning:**
+   - **GPUs:** 8 Tesla A100
+   - **Training time:** 4 hours
+
+ ## Software
+
+ https://github.com/ContextualAI/gritlm
+
+ # Citation
+
+ ```bibtex
+ TODO
+ ```
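
The Hardware section of the README added above gives GPU counts and training times, so a rough compute budget can be derived from it. The sketch below is only an illustration: the helper name `gpu_hours` is hypothetical, "1 day" is assumed to mean 24 hours of wall-clock time, and every GPU is assumed to be in use for the full duration.

```python
# Figures are copied from the README's Hardware section.
# The helper name and the "1 day = 24 hours" reading are assumptions.
def gpu_hours(num_gpus: int, wall_clock_hours: float) -> float:
    """Return total GPU-hours as GPU count multiplied by wall-clock hours."""
    return num_gpus * wall_clock_hours


pretraining = gpu_hours(num_gpus=512, wall_clock_hours=24)
instruction_tuning = gpu_hours(num_gpus=8, wall_clock_hours=4)

print(f"Pretraining: {pretraining:.0f} GPU-hours")
print(f"Instruction tuning: {instruction_tuning:.0f} GPU-hours")
```

Under these assumptions, the instruction-tuning stage accounts for only a small fraction of the total compute compared with pretraining.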