Sentiment Classification for hinglish text: gk-hinglish-sentiment
Model description
Trained small amount of reviews dataset
Intended uses & limitations
I wanted something to work well with hinglish data as it is being used in India mostly. The training data was not much as expected
How to use
#sample code
from transformers import BertTokenizer, BertForSequenceClassification
tokenizerg = BertTokenizer.from_pretrained("/content/model")
modelg = BertForSequenceClassification.from_pretrained("/content/model")
text = "kuch bhi type karo hinglish mai"
encoded_input = tokenizerg(text, return_tensors='pt')
output = modelg(**encoded_input)
print(output)
#output contains 3 lables LABEL_0 = Negative ,LABEL_1 = Nuetral ,LABEL_2 = Positive
Limitations and bias
The data contains only hinglish codemixed text it and was very much limited may be I will Update this model if I can get good amount of data
Training data
Training data contains labeled data for 3 labels
link to the pre-trained model card with description of the pre-training data. I have Tuned below model
https://huggingface.co/rohanrajpal/bert-base-multilingual-codemixed-cased-sentiment
BibTeX entry and citation info
title = "{GLUEC}o{S}: An Evaluation Benchmark for Code-Switched {NLP}",
author = "Khanuja, Simran and
Dandapat, Sandipan and
Srinivasan, Anirudh and
Sitaram, Sunayana and
Choudhury, Monojit",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.acl-main.329",
pages = "3575--3585"
}
- Downloads last month
- 264
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.