---
pipeline_tag: text-generation
inference: true
license: apache-2.0
---

# Table of Contents

1. [Model Summary](#model-summary)
2. [Use](#use)
3. [Training](#training)
4. [Citation](#citation)

# Model Summary

> GritLM is a generative-representational instruction-tuned language model. It performs well at both text representation (embedding) and text generation.

- **Repository:** [ContextualAI/gritlm](https://github.com/ContextualAI/gritlm)
- **Paper:** [TODO](https://arxiv.org/abs/2308.07124)

# Use

The model's usage is documented [here](TODO). It supports inference via the GritLM package, Transformers, and Sentence Transformers. A minimal, hedged usage sketch follows the citation at the end of this card.

# Training

## Model

- **Architecture:** Mixtral-8x7B
- **Steps:** 250k pretraining & 30 instruction tuning
- **Tokens:** ? pretraining & 2M instruction tuning
- **Precision:** bfloat16

## Hardware

- **Pretraining:**
  - **GPUs:** 512 Tesla A100
  - **Training time:** 1 day
- **Instruction tuning:**
  - **GPUs:** 8 Tesla A100
  - **Training time:** 4 hours

## Software

https://github.com/ContextualAI/gritlm

# Citation

```bibtex
TODO
```
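The snippet below is a minimal sketch of the model's two modes, embedding and generation, using the `gritlm` package from the repository linked above. The model ID `GritLM/GritLM-8x7B`, the instruction strings, and the example texts are illustrative assumptions, not values confirmed by this card; adjust them to this repository.

```python
# Minimal sketch using the gritlm package (pip install gritlm).
# ASSUMPTIONS: the model ID "GritLM/GritLM-8x7B" and the instruction/prompt
# strings are placeholders; substitute this repository's actual values.
from scipy.spatial.distance import cosine
from gritlm import GritLM

model = GritLM("GritLM/GritLM-8x7B", torch_dtype="auto")

def gritlm_instruction(instruction):
    # Embedding prompts wrap an optional natural-language instruction
    # before the <|embed|> tag; an empty instruction uses the bare tag.
    return "<|user|>\n" + instruction + "\n<|embed|>\n" if instruction else "<|embed|>\n"

### Representation: encode queries and documents into vectors. ###
queries = ["What is generative representational instruction tuning?"]
documents = ["GRIT trains one model to handle both embedding and generation."]
q_rep = model.encode(queries, instruction=gritlm_instruction("Retrieve relevant passages"))
d_rep = model.encode(documents, instruction=gritlm_instruction(""))
# Score query-document relevance via cosine similarity of the embeddings.
similarity = 1 - cosine(q_rep[0], d_rep[0])
print(f"Cosine similarity: {similarity:.3f}")

### Generation: standard chat-style text generation. ###
messages = [{"role": "user", "content": "Explain instruction tuning in one sentence."}]
encoded = model.tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
gen = model.generate(encoded, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(model.tokenizer.decode(gen[0][encoded.shape[1]:], skip_special_tokens=True))
```

Because the same weights serve both modes, no separate embedding model needs to be loaded; the `instruction` argument only changes how the input is formatted before encoding.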