abacaj/llama-161M-100B

metadata

library_name: transformers
license: apache-2.0

llama-161M

Trained on 100B tokens.

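Learning rate: 1e-3
Weight decay: 0.1
WSD (warmup-stable-decay) learning-rate scheduler with 10% decay
Data mix: 80% code, 10% natural language (NL), 10% instruction data
Dataset decontaminated against popular benchmarks, following the BigCode approach
Hardware: 8x RTX 3090 GPUs, ~110 hours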
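
The recipe above can be sketched in a few lines of Python. This is an illustration only, not the training code used for this model. The learning rate, weight decay, decay fraction, token count, data mix and wall-clock hours come from the list above. The warmup fraction, the linear shape of the warmup and decay, the decay floor of zero, and all function and field names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipe:
    peak_lr: float = 1e-3                 # from the card
    weight_decay: float = 0.1             # from the card
    decay_frac: float = 0.10              # from the card: "10% decay"
    total_tokens: int = 100_000_000_000   # from the card: 100B tokens
    warmup_frac: float = 0.01             # ASSUMPTION: not stated in the card
    min_lr: float = 0.0                   # ASSUMPTION: decay floor not stated


def wsd_lr(step: int, total_steps: int, r: Recipe = Recipe()) -> float:
    """Warmup-stable-decay: linear warmup, flat plateau, linear decay (assumed shapes)."""
    warmup_steps = int(total_steps * r.warmup_frac)
    decay_steps = int(total_steps * r.decay_frac)
    decay_start = total_steps - decay_steps
    if step < warmup_steps:
        # Ramp up to the peak over the warmup window.
        return r.peak_lr * (step + 1) / warmup_steps
    if step < decay_start:
        # Stable phase: hold the peak learning rate.
        return r.peak_lr
    # Decay phase: interpolate linearly from peak_lr towards min_lr.
    progress = (step - decay_start) / max(1, decay_steps)
    return r.peak_lr + (r.min_lr - r.peak_lr) * progress


# Data mix in whole percent, taken from the card.
MIX_PERCENT = {"code": 80, "natural_language": 10, "instruction": 10}


def token_budget(total_tokens: int, mix_percent: dict) -> dict:
    """Split the token budget across sources using integer arithmetic."""
    return {name: total_tokens * pct // 100 for name, pct in mix_percent.items()}


def tokens_per_second(total_tokens: int, hours: float) -> float:
    """Average aggregate throughput implied by a token count and wall-clock time."""
    return total_tokens / (hours * 3600)


if __name__ == "__main__":
    recipe = Recipe()
    for step in (0, 500, 900, 999):
        print(step, wsd_lr(step, 1000, recipe))
    print(token_budget(recipe.total_tokens, MIX_PERCENT))
    # "~110 hours" is approximate, so this is a rough estimate across all 8 GPUs.
    print(round(tokens_per_second(recipe.total_tokens, 110)))
```

Under these assumptions the mix works out to 80B code tokens and 10B each of natural-language and instruction tokens, and the implied average throughput is roughly 250K tokens per second across the 8 GPUs.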

This is a base pretrained model and requires further fine-tuning to be useful.
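
For a quick look at raw completions before fine-tuning, a minimal loading sketch follows. It assumes that the repository `abacaj/llama-161M-100B` ships tokenizer files and loads through the standard `transformers` Auto classes, which the card's `library_name: transformers` suggests but does not guarantee. The helper name `complete` and its defaults are hypothetical.

```python
MODEL_ID = "abacaj/llama-161M-100B"  # repository id of this model card


def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Greedily continue `prompt` with the base model (downloads weights on first call)."""
    # Imported lazily so that defining this helper does not require torch/transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # ASSUMPTION: the repo contains a tokenizer and a causal LM checkpoint.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    # do_sample=False selects greedy decoding.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (not executed here because it needs network access):
# print(complete("def fibonacci(n):"))
```

Because this is a base model, the output is a plain continuation of the prompt rather than a response to an instruction.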

Model Details

| openai/openai_humaneval (greedy) | mbpp (greedy) |
| --- | --- |
| 9.2% | 9.8% |