AG-News BERT Classification
Model Details
Model Name: AG-News BERT Classification
Model Type: Text Classification
Developer: Mansoor Hamidzadeh
Repository: mansoorhamidzadeh/ag-news-bert-classification
Language(s): English
License: MIT
Model Description
Overview
AG-News BERT Classification is a BERT (Bidirectional Encoder Representations from Transformers) model fine-tuned on the AG-News dataset to classify news articles into four categories: World, Sports, Business, and Sci/Tech. It starts from the pre-trained BERT weights and adapts them to this single task.
Intended Use
Primary Use Case
The primary use case for this model is to automatically classify news articles into one of the four predefined categories:
- World
- Sports
- Business
- Sci/Tech
This can be useful for news aggregation services, content recommendation systems, and any application that requires automated content categorization.
Applications
- News aggregators and curators
- Content recommendation engines
- Media monitoring tools
- Topic-based trend detection in news
Training Data
Dataset
- Name: AG-News Dataset
- Source: AG News Corpus
- Description: The AG-News dataset is a widely used benchmark dataset for text classification. It contains 120,000 training samples and 7,600 test samples of news articles categorized into four classes: World, Sports, Business, and Sci/Tech.
Data Preprocessing
The text was tokenized with the BERT tokenizer, the tokens were converted to their corresponding IDs, and attention masks were created to separate real tokens from padding.
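The relationship between token IDs, padding, and attention masks can be sketched with a toy example. This is only an illustration: the whitespace tokenizer, the tiny `VOCAB`, the special-token IDs, and the `encode` helper are all hypothetical and are not BERT's WordPiece tokenizer or its real vocabulary.

```python
# Toy illustration only: a whitespace tokenizer with a tiny hypothetical
# vocabulary. This is NOT BERT's WordPiece tokenizer or its real token IDs.
PAD_ID, UNK_ID, CLS_ID, SEP_ID = 0, 1, 2, 3
VOCAB = {"stocks": 4, "rise": 5, "after": 6, "earnings": 7,
         "team": 8, "wins": 9, "final": 10}


def encode(text, max_length):
    """Return (input_ids, attention_mask), both padded to max_length."""
    tokens = text.lower().split()
    # Reserve two positions for the [CLS]-like and [SEP]-like markers.
    body = [VOCAB.get(tok, UNK_ID) for tok in tokens][: max_length - 2]
    ids = [CLS_ID] + body + [SEP_ID]
    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [1] * len(ids)
    padding = max_length - len(ids)
    return ids + [PAD_ID] * padding, attention_mask + [0] * padding


input_ids, attention_mask = encode("Stocks rise after earnings", max_length=8)
print(input_ids)       # [2, 4, 5, 6, 7, 3, 0, 0]
print(attention_mask)  # [1, 1, 1, 1, 1, 1, 0, 0]
```

With the real BERT tokenizer the IDs and subword splits differ, but the mask plays the same role: positions holding padding are marked 0.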
Training Procedure
Training Configuration:
- Number of Epochs: 4
- Batch Size: 8
- Learning Rate: 1e-5
- Optimizer: AdamW
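For a sense of scale, the configuration above implies a rough number of optimizer steps. The arithmetic below is a sketch that assumes the full 120,000-sample AG-News training set was used each epoch and that no gradient accumulation was applied; the card does not confirm either point.

```python
import math

TRAIN_SAMPLES = 120_000  # AG-News training set size, as listed in the card
BATCH_SIZE = 8
EPOCHS = 4

# Assumption: every training sample is seen once per epoch and each batch
# triggers one optimizer step (no gradient accumulation).
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

print(steps_per_epoch)  # 15000
print(total_steps)      # 60000
```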
Training and Test Losses:
- Epoch 1:
  - Average training loss: 0.1330
  - Average test loss: 0.1762
- Epoch 2:
  - Average training loss: 0.0918
  - Average test loss: 0.1733
- Epoch 3:
  - Average training loss: 0.0622
  - Average test loss: 0.1922
- Epoch 4:
  - Average training loss: 0.0416
  - Average test loss: 0.2305
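One way to read the per-epoch losses is to look for the epoch with the lowest test loss and to track the gap between test and training loss. The sketch below does that with the reported numbers. The card does not say which epoch's weights were published, so this is not a description of the author's checkpoint selection, and choosing a checkpoint on the test set makes the reported test metrics optimistic compared with using a separate validation split.

```python
# Per-epoch (average training loss, average test loss) as reported in the card.
losses = {
    1: (0.1330, 0.1762),
    2: (0.0918, 0.1733),
    3: (0.0622, 0.1922),
    4: (0.0416, 0.2305),
}

# Epoch with the lowest test loss.
best_epoch = min(losses, key=lambda epoch: losses[epoch][1])

# Gap between test and training loss; a steadily widening gap is a common
# sign that the model is starting to overfit the training data.
gaps = {epoch: round(test - train, 4) for epoch, (train, test) in losses.items()}

print(best_epoch)  # 2
print(gaps)
```

Here the test loss is lowest after epoch 2 and the gap widens every epoch, which is consistent with overfitting in epochs 3 and 4.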
Hardware:
- Training Environment: NVIDIA P100 GPU
- Training Time: Approximately 3 hours
Performance
Evaluation Metrics
The model was evaluated using standard text classification metrics:
- Accuracy
- Precision
- Recall
- F1 Score
Results
On the AG-News test set, the model achieved the following performance:
- Accuracy: 93.8%
- Precision: 93.8%
- Recall: 93.8%
- F1 Score: 93.8%
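The card does not state how precision, recall, and F1 were averaged across the four classes. As a minimal sketch, the hypothetical helper `classification_metrics` below computes accuracy alongside macro- and micro-averaged scores on made-up labels. In single-label classification the micro-averaged precision and recall always equal accuracy, which would be one possible explanation for four identical figures; that is an inference, not something the card confirms.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, labels):
    """Accuracy plus macro/micro precision, recall, and F1 (single-label)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(num, den):
        return num / den if den else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1s = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]
    total_tp = sum(tp.values())
    n = len(labels)
    return {
        "accuracy": safe_div(total_tp, len(y_true)),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
        "micro_precision": safe_div(total_tp, total_tp + sum(fp.values())),
        "micro_recall": safe_div(total_tp, total_tp + sum(fn.values())),
    }


# Made-up labels for illustration (0=World, 1=Sports, 2=Business, 3=Sci/Tech).
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 2, 3, 0]
metrics = classification_metrics(y_true, y_pred, labels=[0, 1, 2, 3])
print(metrics["accuracy"])  # 0.75
print(metrics["micro_precision"] == metrics["accuracy"])  # True
```

Macro-averaged scores weight each class equally and can differ from accuracy, as the macro precision does in this toy example.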
Limitations and Biases
Limitations
- The model may not generalize well to other text types or news sources outside the AG-News dataset.
- The model was trained on English text and may perform poorly on text in other languages.
Biases
- The training data may contain biases that reflect biases in news reporting.
- Predictions may show category-specific biases that stem from the distribution of articles in the dataset.
Ethical Considerations
- Ensure the model is used in compliance with user privacy and data security standards.
- Be aware of potential biases and take steps to mitigate negative impacts, especially in sensitive applications.
How to Use
Inference
To use the model for inference, load it using the Hugging Face Transformers library:
from transformers import BertTokenizer, BertForSequenceClassification
from transformers import TextClassificationPipeline

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
model_id = "mansoorhamidzadeh/ag-news-bert-classification"
tokenizer = BertTokenizer.from_pretrained(model_id)
model = BertForSequenceClassification.from_pretrained(model_id)

# Wrap both in a pipeline that returns a label and a score for each input.
classifier = TextClassificationPipeline(model=model, tokenizer=tokenizer)

text = "Sample news article text here."
prediction = classifier(text)
print(prediction)
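The pipeline returns a list of dictionaries with `label` and `score` keys. If the model configuration does not define class names, Transformers falls back to generic labels such as `LABEL_0`. The sketch below maps those to category names. The index order is an assumption based on the commonly used AG-News ordering, the `readable_label` helper is hypothetical, and the sample output is invented; check the model's `config.json` before relying on this mapping.

```python
# Assumption: class indices follow the commonly used AG-News ordering.
# Verify against the model's config before relying on this mapping.
AG_NEWS_LABELS = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}


def readable_label(raw_label):
    """Map 'LABEL_2' to 'Business'; return already-readable names unchanged."""
    if raw_label.startswith("LABEL_"):
        return AG_NEWS_LABELS[int(raw_label.split("_", 1)[1])]
    return raw_label


# Invented output, shaped like a TextClassificationPipeline result.
sample_prediction = [{"label": "LABEL_2", "score": 0.98}]
readable = [{**p, "label": readable_label(p["label"])} for p in sample_prediction]
print(readable)  # [{'label': 'Business', 'score': 0.98}]
```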
Citation
@misc{mansoorhamidzadeh,
author = {Mansoor Hamidzadeh},
title = {AG-News BERT Classification},
year = {2024},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/mansoorhamidzadeh/ag-news-bert-classification}},
}