File size: 2,610 Bytes

93a4fd5

---
license: other
language:
- en
pipeline_tag: text-generation
inference: false
tags:
- transformers
- gguf
- imatrix
- c4ai-command-r-08-2024
---
Quantizations of https://huggingface.co/CohereForAI/c4ai-command-r-08-2024


### Inference Clients/UIs
* [llama.cpp](https://github.com/ggerganov/llama.cpp)
* [JanAI](https://github.com/janhq/jan)
* [KoboldCPP](https://github.com/LostRuins/koboldcpp)
* [text-generation-webui](https://github.com/oobabooga/text-generation-webui)
* [ollama](https://github.com/ollama/ollama)
* [GPT4All](https://github.com/nomic-ai/gpt4all)

---

# From original readme

## Model Summary
<!-- Provide a quick summary of what the model is/does. -->
C4AI Command R 08-2024 is a research release of a 35 billion parameter highly performant generative model. Command R 08-2024 is a large language model with open weights optimized for a variety of use cases including reasoning, summarization, and question answering. Command R 08-2024 has the capability for multilingual generation, trained on 23 languages and evaluated in 10 languages and highly performant RAG capabilities.

Developed by: Cohere and [Cohere For AI](https://cohere.for.ai)

- Point of Contact: Cohere For AI: [cohere.for.ai](https://cohere.for.ai/)
- License: [CC-BY-NC](https://cohere.com/c4ai-cc-by-nc-license), requires also adhering to [C4AI's Acceptable Use Policy](https://docs.cohere.com/docs/c4ai-acceptable-use-policy)
- Model: c4ai-command-r-08-2024
- Model Size: 35 billion parameters
- Context length: 128K

**Try C4AI Command R**

If you want to try Command R before downloading the weights, the model is hosted in a hugging face space [here](https://huggingface.co/spaces/CohereForAI/c4ai-command?model=command-r-08-2024).


**Usage**

Please use `transformers` version 4.39.1 or higher
```python
# pip install 'transformers>=4.39.1'
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "CohereForAI/c4ai-command-r-08-2024"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format message with the command-r-08-2024 chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
## <BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Hello, how are you?<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>

gen_tokens = model.generate(
    input_ids, 
    max_new_tokens=100, 
    do_sample=True, 
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```