Model Card for Model ID

A tokenizer for LLMs with added Hindi vocabulary, trained from the TinyLlama tokenizer. Updated vocabulary size: 35K.
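To illustrate why added Hindi vocabulary can matter, here is a minimal toy sketch, not this tokenizer's actual algorithm. It assumes a greedy longest-match tokenizer that falls back to one token per UTF-8 byte for text missing from its vocabulary. The `tokenize` helper, both toy vocabularies, and the two Hindi entries are hypothetical and chosen only for illustration.

```python
def tokenize(text, vocab):
    """Toy greedy longest-match tokenizer with UTF-8 byte fallback (illustrative only)."""
    tokens = []
    i = 0
    max_len = max((len(t) for t in vocab), default=1)
    while i < len(text):
        # Try the longest vocabulary entry that matches at position i.
        for j in range(min(len(text), i + max_len), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # No entry matched: emit one token per UTF-8 byte of this character.
            tokens.extend(f"<0x{b:02X}>" for b in text[i].encode("utf-8"))
            i += 1
    return tokens


# Hypothetical toy vocabularies, not the real TinyLlama or extended vocabulary.
base_vocab = {"hello", " ", "world"}
extended_vocab = base_vocab | {"नमस्ते", "दुनिया"}

text = "नमस्ते दुनिया"
base_tokens = tokenize(text, base_vocab)
extended_tokens = tokenize(text, extended_vocab)

print("base vocabulary:    ", len(base_tokens), "tokens")
print("extended vocabulary:", len(extended_tokens), "tokens")
```

Under these assumptions, each Devanagari character costs three byte tokens with the base vocabulary, while the extended vocabulary covers each word with a single token. Real subword tokenizers split text differently, so actual savings depend on the learned vocabulary.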

Model Details

Model Description

  • Developed by: Atharva Nighot, Shreyas Joshi, Mahek Bagde, and team
  • Model type: Tokenizer
  • Language(s) (NLP): Primarily Hindi and English
  • Finetuned from model: TinyLlama/TinyLlama-1.1B-Chat-v1.0

Training Data

Trained on the harshitkaran/Hindi dataset.
