Real-World Sentiment Analysis: Flipkart Product Reviews
Project Overview
This project trains a sentiment classifier on user-generated product reviews from Flipkart. It fine-tunes DistilBERT, a compact transformer model, via transfer learning, and shows how businesses can use NLP to extract insights from customer feedback.
Real-World Applicability
- Business Insight: Analyzes customer reviews to provide businesses with critical insights into customer sentiments.
- User Reviews Training: Trained on real user reviews, so the model learns from the language customers actually use.
Installation & Setup
Install the necessary libraries:
pip install datasets transformers evaluate torch
Data Collection and Preprocessing
The model is trained on real user reviews from Flipkart. The steps below prepare the dataset for model training.
Steps:
- Data Collection: The dataset was sourced from Flipkart's customer reviews.
- Preprocessing Focus: We focused on the 'Summary' and 'Sentiment' columns of the dataset.
- Data Cleaning: The data was cleaned thoroughly to ensure quality and consistency.
- Label Conversion: Sentiment labels were converted to a numerical format, suitable for model training.
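The steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual pipeline: the label names in `LABEL2ID`, the duplicate-removal rule, and the sample rows are all assumptions, since the source names only the 'Summary' and 'Sentiment' columns.

```python
# Assumed label names; the source does not list the actual sentiment classes.
LABEL2ID = {"negative": 0, "neutral": 1, "positive": 2}


def preprocess(rows):
    """Keep Summary/Sentiment, drop unusable rows, and encode labels as ints."""
    cleaned = []
    seen = set()
    for row in rows:
        text = (row.get("Summary") or "").strip()
        label = (row.get("Sentiment") or "").strip().lower()
        if not text or label not in LABEL2ID:
            continue  # skip empty reviews and unknown labels
        if text in seen:
            continue  # skip exact duplicate reviews (an assumed cleaning rule)
        seen.add(text)
        cleaned.append({"text": text, "label": LABEL2ID[label]})
    return cleaned


# Made-up rows for illustration only; they are not taken from the dataset.
sample_rows = [
    {"Summary": "Great phone, fast delivery", "Sentiment": "positive"},
    {"Summary": "  ", "Sentiment": "negative"},
    {"Summary": "Stopped working in a week", "Sentiment": "Negative"},
    {"Summary": "Great phone, fast delivery", "Sentiment": "positive"},
]
print(preprocess(sample_rows))
```

Each surviving row is a `text`/`label` pair, which is the shape a tokenizer and classifier head typically expect.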
Model Training
For the sentiment analysis task, we fine-tuned DistilBERT, a smaller and faster distilled version of BERT, using transfer learning.
Training Parameters:
- Model: DistilBERT (Distilled Version of BERT)
- Learning Rate: 2e-5
- Batch Size: 16
- Epochs: 3
- Evaluation Strategy: Once at the end of each epoch
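To make these settings concrete, the sketch below collects them in a small config object and works out how many optimizer steps they imply. The `TrainingConfig` class, the helper function, and the training-set size are hypothetical; the calculation assumes a single device and no gradient accumulation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters listed above, gathered in one place."""
    base_model: str = "distilbert/distilbert-base-uncased"
    learning_rate: float = 2e-5
    batch_size: int = 16
    epochs: int = 3


def total_optimizer_steps(num_examples, config):
    """Number of weight updates over the whole run."""
    steps_per_epoch = math.ceil(num_examples / config.batch_size)
    return steps_per_epoch * config.epochs


config = TrainingConfig()
# 10_000 is a made-up training-set size; the source does not state the real one.
print(total_optimizer_steps(10_000, config))
```

With the Hugging Face `Trainer`, these values would typically be passed to `TrainingArguments` as `learning_rate`, `per_device_train_batch_size`, and `num_train_epochs`. The name of the per-epoch evaluation argument differs between library versions, so check the installed release.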
Training Outcome:
The model reached approximately 95% accuracy on the validation set.
Results
- Training Loss: 0.1373
- Validation Accuracy: Approximately 95%
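Accuracy here means the fraction of validation reviews whose highest-scoring class matches the true label. A minimal sketch of that calculation in plain Python follows; the logits and labels are made-up values, not the project's results.

```python
def accuracy(logits_batch, labels):
    """Fraction of examples whose highest-scoring class equals the true label."""
    if not labels or len(logits_batch) != len(labels):
        raise ValueError("need one non-empty list of logits per label")
    predictions = [
        max(range(len(logits)), key=logits.__getitem__) for logits in logits_batch
    ]
    correct = sum(pred == gold for pred, gold in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical model outputs for four reviews and three classes.
example_logits = [
    [2.0, 0.1, -1.0],
    [0.2, 0.1, 1.5],
    [0.0, 1.0, 0.3],
    [1.2, 0.4, 0.1],
]
example_labels = [0, 2, 1, 2]
print(accuracy(example_logits, example_labels))
```

The `evaluate` library installed earlier offers the same metric through `evaluate.load("accuracy")`; whether the project used it is not stated in the source.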
Deployment and Usage
The trained model is hosted on the Hugging Face Hub as Arsalan8/my_multiclass_model, so it can be loaded directly with the transformers library.
Model URL:
Sample Code for Model Usage:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("Arsalan8/my_multiclass_model")
model = AutoModelForSequenceClassification.from_pretrained("Arsalan8/my_multiclass_model")
model.eval()  # inference mode: disables dropout

# Example text
text = "This product is great!"
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Perform the prediction without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class and map it to its label name
predicted_class_id = logits.argmax(dim=-1).item()
predicted_label = model.config.id2label[predicted_class_id]
print(f"Predicted sentiment: {predicted_label}")
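The prediction above reports only the top class. If a confidence score is also wanted, the logits can be passed through a softmax. The sketch below shows the idea in plain Python; the logit values and label names are hypothetical and do not come from the model.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits and label mapping, for illustration only.
example_logits = [-1.2, 0.3, 2.5]
example_id2label = {0: "negative", 1: "neutral", 2: "positive"}

probs = softmax(example_logits)
best = max(range(len(probs)), key=probs.__getitem__)
print(f"{example_id2label[best]} ({probs[best]:.2f})")
```

On the real model output, `torch.softmax(logits, dim=-1)` performs the same conversion on the tensor directly.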
The project shows how a pretrained transformer, fine-tuned on real user reviews, can yield an accurate sentiment classifier that businesses can use to understand and act on customer feedback.
Base model: distilbert/distilbert-base-uncased