GritLM/GritLM-8x7B

Muennighoff committed on Feb 11

Commit 9a00d10 • 1 Parent(s): c961b6b

Create README.md

Files changed (1)

README.md +51 -0

README.md ADDED

	@@ -0,0 +1,51 @@

+---
+pipeline_tag: text-generation
+inference: true
+license: apache-2.0
+---
+# Table of Contents
+1. [Model Summary](#model-summary)
+2. [Use](#use)
+3. [Training](#training)
+4. [Citation](#citation)
+# Model Summary
+> GritLM is a generative-representational instruction-tuned language model. It performs well at both text representation and text generation.
+- **Repository:** [ContextualAI/gritlm](https://github.com/ContextualAI/gritlm)
+- **Paper:** [TODO](https://arxiv.org/abs/2308.07124)
+# Use
+The model's usage is documented [here](TODO). It supports GritLM, Transformers, and Sentence Transformers.
+# Training
+## Model
+- **Architecture:** Mixtral-8x7B
+- **Steps:** 250k pretraining & 30 instruction tuning
+- **Pretraining tokens:** ? pretraining & 2M instruction tuning
+- **Precision:** bfloat16
+## Hardware
+- **Pretraining:**
+  - **GPUs:** 512 Tesla A100
+  - **Training time:** 1 day
+- **Instruction tuning:**
+  - **GPUs:** 8 Tesla A100
+  - **Training time:** 4 hours
+## Software
+https://github.com/ContextualAI/gritlm
+# Citation
+```bibtex
+TODO
+```
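
To put the Hardware figures in the README above on a common scale, the short sketch below converts them to GPU-hours. The GPU counts and wall-clock times are copied from the diff as written; the GPU-hours metric is an illustrative addition that the README does not report, and the result is only as reliable as those figures, several of which sit next to `TODO` and `?` placeholders.

```python
# GPU counts and wall-clock times are copied from the README's Hardware section.
# GPU-hours is an illustrative metric added here; the README does not report it.
HOURS_PER_DAY = 24

stages = {
    "pretraining": {"gpus": 512, "hours": 1 * HOURS_PER_DAY},  # 512 A100s for 1 day
    "instruction tuning": {"gpus": 8, "hours": 4},             # 8 A100s for 4 hours
}


def gpu_hours(gpus: int, hours: float) -> float:
    """Return total accelerator time: number of GPUs times wall-clock hours."""
    return gpus * hours


totals = {name: gpu_hours(**spec) for name, spec in stages.items()}

for name, total in totals.items():
    print(f"{name}: {total} GPU-hours")
```

Under these assumptions, pretraining accounts for 12,288 GPU-hours and instruction tuning for 32, so the listed instruction-tuning run would be a small fraction of the total compute.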