Model Card for Model ID
Model Details
Model Description
This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card was automatically generated.
- Developed by: Two Platforms
- Model type: Tokenizer for SUTRA models, which are dual-transformer-based multilingual LLMs
- Language(s) (NLP): 50+ languages, including English, Hindi, Gujarati, Bengali, Tamil, Korean, Arabic, Japanese, French, and German
- License: Proprietary
- Paper: SUTRA: Scalable Multilingual Language Model Architecture
- Demo: SUTRA tokenizer comparison
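A tokenizer comparison of the kind the demo refers to usually comes down to counting how many tokens each tokenizer produces for the same text in different languages. The following is a minimal sketch of that idea, not the demo's actual implementation. The function name `compare_token_counts`, the stand-in tokenizers, and the sample sentences are all illustrative assumptions. The commented-out lines show how a Hub tokenizer could be plugged in through `AutoTokenizer`; the repository ID there is a placeholder, not a real ID.

```python
from typing import Callable, Dict, List


def compare_token_counts(
    tokenizers: Dict[str, Callable[[str], List[str]]],
    samples: Dict[str, str],
) -> Dict[str, Dict[str, int]]:
    """Return {language: {tokenizer_name: token_count}} for each sample text."""
    return {
        language: {name: len(tokenize(text)) for name, tokenize in tokenizers.items()}
        for language, text in samples.items()
    }


# Stand-in tokenizers (assumptions) so the sketch runs without downloading anything.
# To compare a Hub tokenizer instead (requires `transformers` and repo access):
#   from transformers import AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("<hub-repo-id>")  # placeholder, not a real ID
#   tokenizers["hub"] = tok.tokenize
tokenizers = {
    "whitespace": str.split,  # one token per whitespace-separated chunk
    "characters": list,       # one token per Unicode code point
}

# Illustrative sample sentences (assumptions, not taken from the demo).
samples = {
    "English": "Hello, how are you?",
    "Hindi": "नमस्ते, आप कैसे हैं?",
}

counts = compare_token_counts(tokenizers, samples)
for language, row in counts.items():
    print(language, row)
```

A lower token count for the same sentence generally means cheaper and faster inference in that language, which is why per-language counts are a common way to compare multilingual tokenizers.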
Citation
BibTeX:
@misc{bendale2023sutra,
author = {Abhijit Bendale and Michael Sapienza and Steven Ripplinger and Simon Gibbs and Jaewon Lee and Pranav Mistry},
title = {SUTRA: Scalable Multilingual Language Model Architecture},
howpublished = {arXiv preprint arXiv:2405.06694},
year = {2024}
}