---
license: apache-2.0
datasets:
- agentlans/twitter-sentiment-meta-analysis
language:
- en
base_model:
- microsoft/deberta-v3-base
pipeline_tag: text-classification
---
# DeBERTa-v3 Twitter Sentiment Models

This page contains one of two DeBERTa-v3 models (xsmall and base) fine-tuned for Twitter sentiment regression.

## Model Details

- **Model Architecture**: DeBERTa-v3
- **Variants**:
  - xsmall
  - base
- **Task**: Sentiment regression
- **Language**: English
- **License**: Apache 2.0

## Intended Use

These models are designed for fine-grained sentiment analysis of English tweets. They output a **continuous sentiment score** rather than discrete categories.

- A negative score indicates negative sentiment.
- A score of zero indicates neutral sentiment.
- A positive score indicates positive sentiment.
- The absolute value of the score indicates how strong the sentiment is.

## Training Data

The models were fine-tuned on a dataset of English tweets collected between September 2009 and January 2010. The sentiment scores were derived from a meta-analysis of 10 different sentiment classifiers using principal component analysis.

Find the dataset at [agentlans/twitter-sentiment-meta-analysis](https://huggingface.co/datasets/agentlans/twitter-sentiment-meta-analysis).

## How to use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "agentlans/deberta-v3-base-tweet-sentiment"

# Load the tokenizer and the fine-tuned regression model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Put the model on the GPU if one is available, otherwise on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

def sentiment(text):
    """Processes the text using the model and returns its logits.
    In this case, it's interpreted as the sentiment score for that text.
    Returns a list with one score per input text."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True).to(device)
    with torch.no_grad():
        # Drop only the trailing label dimension so a batch of one still yields a list
        logits = model(**inputs).logits.squeeze(-1).cpu()
    return logits.tolist()

# Example usage
text = [x.strip() for x in """
I absolutely despise this product and regret ever purchasing it.
The service at that restaurant was terrible and ruined our entire evening.
I'm feeling a bit under the weather today, but it's not too bad.
The weather is quite average today, neither good nor bad.
The movie was okay, I didn't love it but I didn't hate it either.
I'm looking forward to the weekend, it should be nice to relax.
This new coffee shop has a really pleasant atmosphere and friendly staff.
I'm thrilled with my new job and the opportunities it presents!
The concert last night was absolutely incredible, easily the best I've ever seen.
I'm overjoyed and grateful for all the love and support from my friends and family.
""".strip().split("\n")]

for x, s in zip(text, sentiment(text)):
    print(f"Text: {x}\nSentiment: {round(s, 2)}\n")
```

Output:

```text
Text: I absolutely despise this product and regret ever purchasing it.
Sentiment: -2.03

Text: The service at that restaurant was terrible and ruined our entire evening.
Sentiment: -2.14

Text: I'm feeling a bit under the weather today, but it's not too bad.
Sentiment: 0.16

Text: The weather is quite average today, neither good nor bad.
Sentiment: 0.09

Text: The movie was okay, I didn't love it but I didn't hate it either.
Sentiment: -0.0

Text: I'm looking forward to the weekend, it should be nice to relax.
Sentiment: 1.85

Text: This new coffee shop has a really pleasant atmosphere and friendly staff.
Sentiment: 2.08

Text: I'm thrilled with my new job and the opportunities it presents!
Sentiment: 2.46

Text: The concert last night was absolutely incredible, easily the best I've ever seen.
Sentiment: 2.56

Text: I'm overjoyed and grateful for all the love and support from my friends and family.
Sentiment: 2.38
```

## Performance

Evaluation set RMSE:

- xsmall: 0.2560
- base: 0.1938

## Limitations

- English language only
- Trained specifically on tweets, so it may not generalize well to other text types
- Lacks broader context beyond individual tweets
- May struggle to detect sarcasm or nuanced sentiment

## Ethical Considerations

- Potential biases in the training data related to the time period and Twitter user demographics
- Risk of misuse for large-scale sentiment monitoring without consent
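
## Example: interpreting scores and computing RMSE

The score interpretation under Intended Use and the RMSE metric under Performance can be illustrated with a minimal sketch. This is not the code used to train or evaluate the models. The names `label_from_score`, `rmse`, and `NEUTRAL_BAND` are hypothetical, and the ±0.25 neutral band is an arbitrary assumption, since this model card does not define a cutoff between neutral and non-neutral scores.

```python
import math

# Hypothetical cutoff (not from the model card): scores whose absolute
# value is at most NEUTRAL_BAND are treated as neutral.
NEUTRAL_BAND = 0.25

def label_from_score(score, neutral_band=NEUTRAL_BAND):
    """Map a continuous sentiment score to a coarse label."""
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"

def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences of scores."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if len(predictions) == 0:
        raise ValueError("at least one prediction is required")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Scores taken from the example output above
print([label_from_score(s) for s in (-2.03, 0.09, 2.46)])
# ['negative', 'neutral', 'positive']

# Toy numbers chosen only to show the arithmetic
print(rmse([0.5, -0.5], [0.0, 0.0]))
# 0.5
```

Thresholding discards the strength information carried by the magnitude of the score, so it is only worth doing when a downstream system needs discrete labels. Any cutoff should be tuned on data from the target domain.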