Text Generation
Transformers
PyTorch
Safetensors
English
hf_olmo
custom_code
shanearora committed
Commit 5105e9c
1 Parent(s): 3349ef2

Update README.md

Files changed (1)
README.md +8 -17
README.md CHANGED
@@ -14,7 +14,7 @@ language:
  
  <!-- Provide a quick summary of what the model is/does. -->
  
- **For transformers versions v4.40.0 or newer, please use [OLMo 7B HF](https://huggingface.co/allenai/OLMo-7B-hf) instead.**
+ **For transformers versions v4.40.0 or newer, we suggest using [OLMo 7B HF](https://huggingface.co/allenai/OLMo-7B-hf) instead.**
  
  OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
  The OLMo models are trained on the [Dolma](https://huggingface.co/datasets/allenai/dolma) dataset.
@@ -44,8 +44,9 @@ In particular, we focus on four revisions of the 7B models:
  
  To load a specific model revision with HuggingFace, simply add the argument `revision`:
  ```bash
- import hf_olmo # pip install ai2-olmo
- olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B", revision="step1000-tokens4B")
+ from hf_olmo import OLMoForCausalLM # pip install ai2-olmo
+ 
+ olmo = OLMoForCausalLM.from_pretrained("allenai/OLMo-7B", revision="step1000-tokens4B")
  ```
  
  All revisions/branches are listed in the file `revisions.txt`.
@@ -95,11 +96,10 @@ pip install ai2-olmo
  ```
  Now, proceed as usual with HuggingFace:
  ```python
- import hf_olmo
+ from hf_olmo import OLMoForCausalLM, OLMoTokenizerFast
  
- from transformers import AutoModelForCausalLM, AutoTokenizer
- olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B")
- tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-7B")
+ olmo = OLMoForCausalLM.from_pretrained("allenai/OLMo-7B")
+ tokenizer = OLMoTokenizerFast.from_pretrained("allenai/OLMo-7B")
  message = ["Language modeling is "]
  inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
  # optional verifying cuda
@@ -109,17 +109,8 @@ response = olmo.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50,
  print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
  >> 'Language modeling is the first step to build natural language generation...'
  ```
- Alternatively, with the pipeline abstraction:
- ```python
- import hf_olmo
- 
- from transformers import pipeline
- olmo_pipe = pipeline("text-generation", model="allenai/OLMo-7B")
- print(olmo_pipe("Language modeling is "))
- >> 'Language modeling is a branch of natural language processing that aims to...'
- ```
  
- Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
+ You can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
  The quantized model is more sensitive to typing / cuda, so it is recommended to pass the inputs as `inputs.input_ids.to('cuda')` to avoid potential issues.
  
  Note, you may see the following error if `ai2-olmo` is not installed correctly, which is caused by internal Python check naming. We'll update the code soon to make this error clearer.
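As an aside to the `revision` argument above: besides reading `revisions.txt`, the revision branches can be enumerated programmatically. A minimal sketch, assuming `huggingface_hub` is installed; the `list_repo_refs` helper is a standard `huggingface_hub` call, though its use here is not part of this commit:

```python
# Sketch: list the available revision branches of allenai/OLMo-7B
# (assumes `pip install huggingface_hub`).
from huggingface_hub import list_repo_refs

refs = list_repo_refs("allenai/OLMo-7B")
for branch in refs.branches:
    print(branch.name)  # e.g. "step1000-tokens4B"
```

Any of the printed names can then be passed as `revision` to `from_pretrained`, as in the hunk above.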
 
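Putting the README's quantization note together with its usage example, an end-to-end run might look like the following sketch. It assumes a CUDA GPU and `pip install ai2-olmo bitsandbytes`; the loading call and the `inputs.input_ids.to('cuda')` pattern come straight from the README, and the sampling settings mirror its generation example:

```python
# Minimal 8-bit inference sketch following the README's quantization note
# (assumes a CUDA GPU and `pip install ai2-olmo bitsandbytes`).
import torch
import hf_olmo  # registers the OLMo architecture with transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-7B", torch_dtype=torch.float16, load_in_8bit=True
)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-7B")

inputs = tokenizer(["Language modeling is "], return_tensors="pt",
                   return_token_type_ids=False)
# Pass input_ids to CUDA directly; per the README, the quantized model
# is more sensitive to input typing / device placement.
response = olmo.generate(inputs.input_ids.to("cuda"), max_new_tokens=100,
                         do_sample=True, top_k=50)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```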