GritLM/GritLM-8x7B

Muennighoff committed on Feb 13

Commit 83b89f8 • 1 Parent(s): 9a00d10

Update README.md

Files changed (1)

README.md +10 -25


@@ -2,6 +2,8 @@
 pipeline_tag: text-generation
 inference: true
 license: apache-2.0
+datasets:
+- GritLM/tulu2
 ---
 # Table of Contents
@@ -13,39 +15,22 @@ license: apache-2.0
 # Model Summary
-> GritLM is a generative-representational instruction-tuned language model. It performs well at both text representation and text generation.
+> GritLM is a generative representational instruction tuned language model. It unifies text representation (embedding) and text generation into a single model achieving state-of-the-art performance on both types of tasks.
 - **Repository:** [ContextualAI/gritlm](https://github.com/ContextualAI/gritlm)
 - **Paper:** [TODO](https://arxiv.org/abs/2308.07124)
-# Use
-The models usage is documented [here](TODO). It supports GritLM, Transformers, Sentence Transformers.
-# Training
-## Model
-- **Architecture:** Mistral-8x7B
-- **Steps:** 250k pretraining & 30 instruction tuning
-- **Pretraining tokens:** ? pretraining & 2M instruction tuning
-- **Precision:** bfloat16
-## Hardware
-- **Pretraining:**
-  - **GPUs:** 512 Tesla A100
-  - **Training time:** 1 day
-- **Instruction tuning:**
-  - **GPUs:** 8 Tesla A100
-  - **Training time:** 4 hours
-## Software
-https://github.com/ContextualAI/gritlm
+| Model | Description |
+|-------|-------------|
+| [GritLM 7B](https://hf.co/GritLM/GritLM-7B) | Mistral 7B finetuned using GRIT |
+| [GritLM 8x7B](https://hf.co/GritLM/GritLM-8x7B) | Mixtral 8x7B finetuned using GRIT |
+# Use
+The model usage is documented [here](TODO). It supports GritLM, Transformers, Sentence Transformers.
 # Citation
 ```bibtex
 TODO
-```
+```