tiny-lm-chat
This repository provides a tiny 16M-parameter language model for debugging and testing purposes. It was created by fine-tuning sbintuitions/tiny-lm on the Japanese and English oasst1 datasets.
How to use
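from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
# Load the chat-tuned checkpoint and its tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("sbintuitions/tiny-lm-chat", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("sbintuitions/tiny-lm-chat", use_fast=False)
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
# Render a single user turn with the model's chat template, returned as a plain string.
prompt = tokenizer.apply_chat_template([{"role": "user", "content": "Hello!"}], add_generation_prompt=True, tokenize=False)
# Sample a reply; note that max_length counts the prompt tokens as well as the generated ones.
print(generator(prompt, max_length=30, do_sample=True, top_k=100))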
Model architecture
A 4-layer, 512-hidden-size transformer-based language model.
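As a rough sanity check on the size, the sketch below estimates how many parameters the transformer blocks contribute for a given depth and hidden size. It is only an approximation under stated assumptions: the function `estimate_transformer_params` is a hypothetical helper, not part of this repository, and it assumes a standard decoder block with four attention projections and a two-matrix MLP with 4x expansion, while ignoring biases, normalization weights, and positional embeddings. The vocabulary size is not stated in this card, so the embedding term is left at zero unless a value is supplied.

```python
def estimate_transformer_params(num_layers, hidden_size, vocab_size=0, mlp_ratio=4):
    """Rough parameter estimate for a decoder-only transformer.

    Assumptions (not confirmed by the model card): four attention
    projections per layer, a two-matrix MLP with `mlp_ratio` expansion,
    tied input/output embeddings, and no biases or normalization weights.
    """
    attention = 4 * hidden_size * hidden_size        # Q, K, V and output projections
    mlp = 2 * mlp_ratio * hidden_size * hidden_size  # up and down projections
    blocks = num_layers * (attention + mlp)
    embeddings = vocab_size * hidden_size            # zero when vocab_size is unknown
    return {"blocks": blocks, "embeddings": embeddings, "total": blocks + embeddings}


# The 4-layer, 512-hidden-size configuration described above.
estimate = estimate_transformer_params(num_layers=4, hidden_size=512)
print(f"{estimate['blocks']:,}")  # 12,582,912
```

Under these assumptions the blocks account for about 12.6M parameters, which is in line with the stated 16M total once embeddings and the omitted terms are added. The exact count is best read from the loaded model itself, for example with `sum(p.numel() for p in model.parameters())`.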
Training
The model was first pre-trained for 25B tokens on English Wikipedia and Japanese Wikipedia with a traditional language modelling objective. It was then fine-tuned on the Japanese and English oasst1 datasets for 15 epochs.
License