BLOOM-CLP German (6.4B parameters)

This is a monolingual German language model. It is based on BLOOM-7b1 and was trained with the CLP-Transfer method.

You can try out the model on the European Language Grid.

UPDATE: We recently released an instruction-tuned version of this model: malteos/bloom-6b4-clp-german-oasst-v0.1.

How to use

You can use this model directly with a pipeline for text generation. Because generation involves random sampling, we set a seed for reproducibility:

>>> from transformers import pipeline, set_seed
>>> generator = pipeline('text-generation', model='malteos/bloom-6b4-clp-german')
>>> set_seed(42)
>>> generator("Hello, I'm a language model,", max_length=30, num_return_sequences=3)

[{'generated_text': "Hello, I'm a language model, a language for thinking, a language for expressing thoughts."},
 {'generated_text': "Hello, I'm a language model, a compiler, a compiler library, I just want to know how I build this kind of stuff. I don"},
 {'generated_text': "Hello, I'm a language model, and also have more than a few of your own, but I understand that they're going to need some help"},]
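As an alternative to the pipeline call above, the model can also be loaded explicitly. The following is a minimal sketch, not the authors' reference code: the helper names `generation_kwargs` and `generate`, the sampling settings, and the choice of half precision on GPU are assumptions made for illustration.

```python
from typing import Any, Dict, List

MODEL_NAME = "malteos/bloom-6b4-clp-german"  # model ID from this card


def generation_kwargs(max_new_tokens: int = 30, num_return_sequences: int = 3) -> Dict[str, Any]:
    """Sampling settings (illustrative values, not recommendations from the authors)."""
    return {
        "do_sample": True,  # sample instead of greedy decoding, so the seed matters
        "max_new_tokens": max_new_tokens,  # counts generated tokens only, unlike max_length
        "num_return_sequences": num_return_sequences,
    }


def generate(prompt: str, seed: int = 42, **overrides: Any) -> List[str]:
    """Load the model and return sampled continuations of `prompt`."""
    # Heavy dependencies are imported here so that defining the helpers stays cheap.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed

    use_cuda = torch.cuda.is_available()
    # Assumption: half precision on GPU, full precision on CPU.
    dtype = torch.float16 if use_cuda else torch.float32
    device = "cuda" if use_cuda else "cpu"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=dtype)
    model.to(device)

    set_seed(seed)  # fixes the random state used during sampling
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        outputs = model.generate(**inputs, **{**generation_kwargs(), **overrides})
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `generate("Die Hauptstadt von Deutschland ist")` (a hypothetical German prompt) would download the weights on first use and return three sampled continuations; a model of this size needs a correspondingly large amount of memory.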

Training dataset

Code

Hardware

Evaluation

Validation perplexity (PPL) compared to from-scratch training (lower is better):

[Figure: Tokens vs PPL]

Additional evaluations can be found in our paper.
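
For readers unfamiliar with the metric, perplexity is commonly defined as the exponential of the mean negative log-likelihood per token. The sketch below illustrates that standard definition with toy numbers; it is not the evaluation code used for this model, and the function name `perplexity` is chosen here for illustration.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity from the natural-log probabilities a model assigns to each reference token."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)  # mean negative log-likelihood
    return math.exp(mean_nll)


# Toy values, not results from the paper.
uniform = [math.log(0.25)] * 8  # every token gets probability 0.25
better = [math.log(0.5)] * 8    # every token gets probability 0.5

print(perplexity(uniform))  # approximately 4.0
print(perplexity(better))   # approximately 2.0, i.e. lower and therefore better
```

A perplexity of 4 can be read as the model being, on average, as uncertain as if it were choosing uniformly among four tokens at each step.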

How to cite

If you use our code or models, please cite our paper:

@misc{Ostendorff2023clp,
  doi = {10.48550/ARXIV.2301.09626},
  author = {Ostendorff, Malte and Rehm, Georg},
  title = {Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning},
  publisher = {arXiv},
  year = {2023}
}

License

BigScience BLOOM RAIL 1.0
