File size: 3,769 Bytes
c7d6e2f 38a624d c7d6e2f 0abe88e c7d6e2f 38a624d c7d6e2f 0abe88e c7d6e2f e7ee55e 38a624d c7d6e2f 0abe88e c7d6e2f e7ee55e 38a624d c7d6e2f 0abe88e c7d6e2f e7ee55e 38a624d e7ee55e 0abe88e e7ee55e 0abe88e e7ee55e 38a624d c7d6e2f 0abe88e 38a624d c7d6e2f e7ee55e 38a624d c7d6e2f 0abe88e c7d6e2f e7ee55e 38a624d c7d6e2f 0abe88e c7d6e2f e7ee55e 38a624d c7d6e2f 0abe88e c7d6e2f e7ee55e 38a624d c7d6e2f 0abe88e c7d6e2f e7ee55e 38a624d c7d6e2f 34aa0bb 5095e68 c7d6e2f 5095e68 c7d6e2f e32422a c7d6e2f 3a0e6e0 c7d6e2f 3a0e6e0 c7d6e2f 0abe88e |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 |
---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
model-index:
- name: ibm/PowerLM-3b
results:
- task:
type: text-generation
dataset:
type: lm-eval-harness
name: ARC
metrics:
- name: accuracy-norm
type: accuracy-norm
value: 60.5
verified: false
- task:
type: text-generation
dataset:
type: lm-eval-harness
name: BoolQ
metrics:
- name: accuracy
type: accuracy
value: 72.0
verified: false
- task:
type: text-generation
dataset:
type: lm-eval-harness
name: Hellaswag
metrics:
- name: accuracy-norm
type: accuracy-norm
value: 74.6
verified: false
- task:
type: text-generation
dataset:
type: lm-eval-harness
name: OpenBookQA
metrics:
- name: accuracy-norm
type: accuracy-norm
value: 43.6
verified: false
- task:
type: text-generation
dataset:
type: lm-eval-harness
name: PIQA
metrics:
- name: accuracy-norm
type: accuracy-norm
value: 79.9
verified: false
- task:
type: text-generation
dataset:
type: lm-eval-harness
name: Winogrande
metrics:
- name: accuracy-norm
type: accuracy-norm
value: 70.0
verified: false
- task:
type: text-generation
dataset:
type: lm-eval-harness
name: MMLU (5 shot)
metrics:
- name: accuracy
type: accuracy
value: 49.2
verified: false
- task:
type: text-generation
dataset:
type: lm-eval-harness
name: GSM8k (5 shot)
metrics:
- name: accuracy
type: accuracy
value: 34.9
verified: false
- task:
type: text-generation
dataset:
type: lm-eval-harness
name: math (4 shot)
metrics:
- name: accuracy
type: accuracy
value: 15.2
verified: false
- task:
type: text-generation
dataset:
type: bigcode-eval
name: humaneval
metrics:
- name: pass@1
type: pass@1
value: 26.8
verified: false
- task:
type: text-generation
dataset:
type: bigcode-eval
name: MBPP
metrics:
- name: pass@1
type: pass@1
value: 33.6
verified: false
---
## Model Summary
PowerLM-3B is a 3B state-of-the-art small language model trained with the Power learning rate scheduler. It is trained on a mix of open-source and proprietary datasets. PowerLM-3B has shown promising results compared to other models in the size categories across various benchmarks, including natural language multi-choices, code generation, and math reasoning.
Paper: https://arxiv.org/abs/2408.13359
## Usage
Note: Requires installing HF transformers from source.
### Generation
This is a simple example of how to use **PowerLM-3b** model.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
device = "cuda" # or "cpu"
model_path = "ibm/PowerLM-3b"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()
# change input text as desired
prompt = "Write a code to find the maximum value in a list of numbers."
# tokenize the text
input_tokens = tokenizer(prompt, return_tensors="pt")
# transfer tokenized inputs to the device
for i in input_tokens:
input_tokens[i] = input_tokens[i].to(device)
# generate output tokens
output = model.generate(**input_tokens, max_new_tokens=100)
# decode output tokens into text
output = tokenizer.batch_decode(output)
# loop over the batch to print, in this example the batch size is 1
for i in output:
print(i)
``` |