Model Card for mylonasc/bart-base-twitter-sent-ft-001
This is a fine-tune of bart-base on a sentiment classification dataset.
import torch
from transformers import BartTokenizer, AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained('mylonasc/bart-base-twitter-sent-ft-001')
# Assumption: the original snippet never defines `tokenizer`;
# the stock bart-base tokenizer is loaded here to make it runnable.
tokenizer = BartTokenizer.from_pretrained('facebook/bart-base')

_phrases = [
    'this is a great model! I really like it!',
    'Do you call this a model? This is not even 1B parameters! Get outta here!',
    'Fine tuning transformers is very easy if you use all the right tools!',
    "John couldn't write two correct lines of code without ChatGPT if his life depended on it..."
]

# Tokenize the batch, padding every phrase to the longest one
toks = tokenizer(_phrases, return_tensors='pt', padding='longest')
with torch.no_grad():
    res = model(**toks)[0]  # logits, one row per phrase
# Probability of class index 1 (positive sentiment)
is_positive = res.softmax(1)[:, 1]
is_positive
>> tensor([0.9994, 0.1362, 0.9995, 0.3840])
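The final step of the snippet, `res.softmax(1)[:, 1]`, converts each row of logits into a probability for class index 1, which the snippet treats as the positive class. The following is a minimal pure-Python sketch of that step plus a label decision. The helper names and the 0.5 threshold are illustrative assumptions and are not specified anywhere in this card.

```python
import math

def positive_probability(logits):
    """Softmax over one row of logits; return the probability of index 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)

def to_label(prob, threshold=0.5):
    # threshold=0.5 is a hypothetical default, not taken from the model card
    return "positive" if prob >= threshold else "negative"

# Probabilities reported in the example output above
reported = [0.9994, 0.1362, 0.9995, 0.3840]
labels = [to_label(p) for p in reported]
print(labels)  # ['positive', 'negative', 'positive', 'negative']
```

A different threshold may suit applications where false positives and false negatives carry different costs.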
Model Details
Model Description
This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
- Model type: BART (full encoder-decoder transformer) with a sequence classification head
- Language(s) (NLP): English
Model Sources [optional]
- Repository: [More Information Needed]
Uses
Sentiment classification for English sentences.
Recommendations
This is the output of a short technical implementation project to demonstrate fine-tuning models using the transformers library.
- The model was trained on very short sentences (max 65 tokens).
- Preliminary benchmarking showed that it strongly outperforms a zero-shot BART-large model on sentiment classification.
- It is uncertain how the model will behave in other contexts.
It is not recommended to use this model - use at your own risk!
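Since training used sequences of at most 65 tokens, it may be worth flagging longer inputs before scoring them. The sketch below shows one way to do that. `flag_long_inputs` is a hypothetical helper, and the whitespace tokenizer (`str.split`) is only a stand-in so the example runs without any downloads; a real subword tokenizer will produce different counts.

```python
MAX_TRAIN_TOKENS = 65  # maximum sequence length reported for training

def flag_long_inputs(texts, tokenize, max_tokens=MAX_TRAIN_TOKENS):
    """Return (index, token_count) for every text longer than max_tokens.

    `tokenize` is any callable mapping a string to a list of tokens.
    """
    flagged = []
    for i, text in enumerate(texts):
        n_tokens = len(tokenize(text))
        if n_tokens > max_tokens:
            flagged.append((i, n_tokens))
    return flagged

texts = ["short tweet", "word " * 100]
# str.split is a placeholder tokenizer used for illustration only
print(flag_long_inputs(texts, str.split))  # [(1, 100)]
```

Flagged inputs could be truncated, split, or simply treated as lower-confidence predictions, depending on the application.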
How to Get Started with the Model
[More Information Needed]
Training Details
Training Data
A Twitter sentiment classification dataset, not sourced from Hugging Face.
Training Procedure
Trained for 5 epochs with a learning rate of 3e-5 and a batch size of 256. A 10% validation split was held out for early stopping and was not added back to the training set afterwards.
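The procedure above can be sketched in plain Python as a hold-out split plus an early-stopping check. This is an illustration under stated assumptions, not the training script used for this model: the `patience` value, the random seed, the helper names, and the validation losses in the demo loop are all made up purely to exercise the logic.

```python
import random

# Hyperparameters stated in the card
EPOCHS, LEARNING_RATE, BATCH_SIZE = 5, 3e-5, 256

def train_val_split(rows, val_fraction=0.1, seed=0):
    """Shuffle a copy of rows and hold out val_fraction for validation."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # seed is an arbitrary assumption
    n_val = int(len(rows) * val_fraction)
    return rows[n_val:], rows[:n_val]

class EarlyStopper:
    """Signal a stop once validation loss fails to improve `patience` times in a row."""
    def __init__(self, patience=1):  # patience is hypothetical
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

train_rows, val_rows = train_val_split(range(1000))

# Dummy validation losses, NOT real results from this model
dummy_val_losses = [0.60, 0.45, 0.40, 0.42, 0.41]
stopper = EarlyStopper(patience=1)
stopped_at = None
for epoch, loss in zip(range(EPOCHS), dummy_val_losses):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
print(len(train_rows), len(val_rows), stopped_at)  # 900 100 3
```

In a real run the loss for each epoch would come from evaluating the model on `val_rows`, and the checkpoint with the best validation loss would be the one kept.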
Preprocessing
- Rows containing a lot of special characters were removed.
- De-duplication was not necessary (no tweet appeared more than 4 times, and the repeated ones were single-word tweets).
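The two preprocessing checks above could look roughly like the following sketch. The card does not say how "a lot of special characters" was defined, so the 0.3 ratio threshold, the definition of a special character, the case-insensitive duplicate matching, and the function names are all assumptions made for illustration.

```python
from collections import Counter

def special_char_ratio(text):
    """Fraction of characters that are neither alphanumeric nor whitespace."""
    if not text:
        return 0.0
    special = sum(1 for ch in text if not (ch.isalnum() or ch.isspace()))
    return special / len(text)

def filter_rows(texts, max_ratio=0.3):
    # max_ratio=0.3 is a hypothetical cutoff, not the one used for this model
    return [t for t in texts if special_char_ratio(t) <= max_ratio]

def max_duplicate_count(texts):
    """Largest number of times any (case-insensitive, stripped) text repeats."""
    counts = Counter(t.strip().lower() for t in texts)
    return max(counts.values(), default=0)

tweets = ["great day!", "@@@###$$$ !!!", "ok", "ok", "OK"]
kept = filter_rows(tweets)
print(kept)                       # ['great day!', 'ok', 'ok', 'OK']
print(max_duplicate_count(kept))  # 3
```

A low maximum duplicate count, as reported above, is one reasonable signal that a separate de-duplication pass can be skipped.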
Hardware
1x RTX4090