Real-World Sentiment Analysis: Flipkart Product Reviews

Project Overview

This project fine-tunes DistilBERT, a compact transformer model, on user-generated product reviews from Flipkart to classify review sentiment. It uses transfer learning to show how businesses can apply NLP to extract insights from customer feedback.

Real-World Applicability

Business Insight: Classifies the sentiment of customer reviews so businesses can see how customers feel about their products.
User Reviews Training: Trained on real user reviews, so the model learns from the language customers actually use.

Installation & Setup

Install the necessary libraries:

pip install datasets transformers evaluate

Data Collection and Preprocessing

The dataset consists of real user reviews from Flipkart. It was collected and preprocessed as follows before model training.

Steps:

Data Collection: The dataset was sourced from Flipkart's customer reviews.
Preprocessing Focus: We focused on the 'Summary' and 'Sentiment' columns of the dataset.
Data Cleaning: The data was cleaned thoroughly to ensure quality and consistency.
Label Conversion: Sentiment labels were converted to a numerical format, suitable for model training.
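
The steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual pipeline: the label names in LABEL2ID, the specific cleaning rules (trimming whitespace, dropping empty or unlabeled rows, removing exact duplicates), and the sample rows are all assumptions made for the example.

```python
# Hypothetical label names; the actual values in the 'Sentiment' column may differ.
LABEL2ID = {"negative": 0, "neutral": 1, "positive": 2}


def preprocess(rows):
    """Keep 'Summary' and 'Sentiment', drop unusable rows, and encode labels."""
    cleaned = []
    seen = set()
    for row in rows:
        text = (row.get("Summary") or "").strip()
        sentiment = (row.get("Sentiment") or "").strip().lower()
        # Skip rows with an empty summary or an unrecognized label.
        if not text or sentiment not in LABEL2ID:
            continue
        # Skip exact duplicates of the same (text, label) pair.
        key = (text, sentiment)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"text": text, "label": LABEL2ID[sentiment]})
    return cleaned


# Tiny in-memory sample standing in for the real review data.
sample_rows = [
    {"Summary": "  Great phone, fast delivery ", "Sentiment": "Positive", "Price": "9999"},
    {"Summary": "", "Sentiment": "negative", "Price": "499"},
    {"Summary": "Stopped working in a week", "Sentiment": "negative", "Price": "1299"},
    {"Summary": "Stopped working in a week", "Sentiment": "negative", "Price": "1299"},
    {"Summary": "It is okay", "Sentiment": None, "Price": "299"},
]

examples = preprocess(sample_rows)
for example in examples:
    print(example)
```

Of the five sample rows, only two survive: the empty summary, the duplicate, and the unlabeled row are dropped, and the remaining rows carry integer labels ready for tokenization.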

Model Training

For the sentiment analysis task, we fine-tuned DistilBERT, a smaller and faster distilled version of BERT, using transfer learning.

Training Parameters:

Model: DistilBERT (Distilled Version of BERT)
Learning Rate: 2e-5
Batch Size: 16
Epochs: 3
Evaluation Strategy: Conducted at the end of each epoch
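
As a rough sketch, the parameters above could map onto the Hugging Face TrainingArguments class as shown here. Several details are assumptions: that the batch size of 16 is per device and applies to both training and evaluation, and that the output directory is named my_multiclass_model. Recent transformers releases use eval_strategy; older ones call the same option evaluation_strategy.

```python
# Hyperparameters taken from the list above.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumed to be per device
    "per_device_eval_batch_size": 16,   # assumed to match the training batch size
    "num_train_epochs": 3,
    "eval_strategy": "epoch",  # `evaluation_strategy` in older transformers releases
}


def build_training_arguments(output_dir="my_multiclass_model"):
    """Create TrainingArguments from HYPERPARAMS (output_dir is an assumed name)."""
    # Imported lazily so the dict above can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)


print(HYPERPARAMS)
```

The returned object would then be passed to a Trainer together with the model, tokenizer, and tokenized datasets.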

Training Outcome:

The model achieved a validation accuracy of approximately 95%.

Results

Training Loss: 0.1373
Validation Accuracy: Approximately 95%
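
For reference, accuracy here means the fraction of validation examples whose highest-scoring class matches the true label. The sketch below shows that calculation in plain Python on made-up logits; the function name and the toy values are illustrative only and are not the project's evaluation code.

```python
def accuracy_from_logits(logits, labels):
    """Fraction of examples whose highest-scoring class matches the true label."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = 0
    for scores, label in zip(logits, labels):
        # Index of the largest score is the predicted class id.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Toy values: four examples, three classes.
toy_logits = [[2.1, 0.3, -1.0], [0.2, 0.1, 1.5], [0.0, 1.2, 0.4], [1.0, 0.9, 0.8]]
toy_labels = [0, 2, 1, 1]
toy_accuracy = accuracy_from_logits(toy_logits, toy_labels)
print(toy_accuracy)  # 3 of 4 predictions match, so 0.75
```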

Deployment and Usage

The trained model is hosted on Hugging Face Hub for easy accessibility and deployment.

Model ID on the Hub:

Arsalan8/my_multiclass_model

Sample Code for Model Usage:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Arsalan8/my_multiclass_model")
model = AutoModelForSequenceClassification.from_pretrained("Arsalan8/my_multiclass_model")

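# Example text; truncation guards against reviews longer than the model's maximum input length
text = "This product is great!"
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Perform the prediction without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class along the label dimension and map it to its name
predicted_class_id = logits.argmax(dim=-1).item()
predicted_label = model.config.id2label[predicted_class_id]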

print(f"Predicted sentiment: {predicted_label}")
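
If a confidence score is wanted alongside the label, the logits can be converted to probabilities with a softmax. The following is a minimal sketch in plain Python; the helper names and the label map are hypothetical, and the example logits are made up. With the real model, logits[0].tolist() and model.config.id2label could be passed in instead.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, id2label):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Hypothetical label map; the hosted model's id2label may use different names.
example_id2label = {0: "negative", 1: "neutral", 2: "positive"}
label, confidence = top_prediction([-1.2, 0.3, 2.5], example_id2label)
print(f"{label} ({confidence:.2f})")
```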

The project shows that fine-tuning a pretrained transformer on real user reviews can produce an accurate sentiment classifier suitable for business use. The same approach can be adapted to other sources of customer feedback.
