News2Topic-T5-base
Model Details
- Model type: Text-to-Text Generation
- Language(s) (NLP): English
- License: MIT License
- Finetuned from model: T5 Base Model (Google AI)
Uses
The News2Topic T5-base model generates topic names from news articles or news-like text. It can be integrated into news aggregation platforms and content management systems, or used to improve news browsing and search by labeling articles with concise topics.
How to Get Started with the Model
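from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
pipe = pipeline("text2text-generation", model="textgain/News2Topic-T5-base")
news_text = "Your news text here."
# The pipeline returns a list of dicts; the topic is under "generated_text".
print(pipe(news_text))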
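For more than one article, a small wrapper can help. The following is a minimal sketch, not part of the model's official API: the helper name `topics_for_articles` and the `max_new_tokens=16` default are assumptions made for illustration. It assumes the pipeline accepts a list of strings and returns one dict with a `generated_text` key per input, and it passes `truncation=True` so that long articles are cut to the tokenizer's maximum length.

```python
from typing import Callable, Iterable, List


def topics_for_articles(
    pipe: Callable,
    articles: Iterable[str],
    max_new_tokens: int = 16,  # assumed default; topics are short
) -> List[str]:
    """Return one generated topic per non-empty article (hypothetical helper)."""
    # Skip empty or whitespace-only inputs so the model is not called on them.
    texts = [a.strip() for a in articles if a and a.strip()]
    if not texts:
        return []
    # truncation=True asks the tokenizer to cut over-long inputs.
    outputs = pipe(texts, truncation=True, max_new_tokens=max_new_tokens)
    return [out["generated_text"].strip() for out in outputs]


# Usage with the pipeline created above (downloads the model on first run):
#   topics = topics_for_articles(pipe, ["First article ...", "Second article ..."])
```

Passing the pipeline in as an argument keeps the helper easy to test with a stand-in callable.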
Training Details
The News2Topic T5-base model was fine-tuned on a 21K sample of the "Newsroom" dataset (https://lil.nlp.cornell.edu/newsroom/index.html), annotated with synthetic labels generated by GPT-3.5-turbo.
The model was trained for 3 epochs, with a learning rate of 0.00001, a maximum sequence length of 512, and a training batch size of 12.
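To get a feel for the size of this training run, the number of optimizer steps can be estimated from the figures above. This is only a rough sketch: it assumes exactly 21,000 training examples (the card says "21K"), a single device, and no gradient accumulation, none of which the card states. The function name `optimizer_steps` is hypothetical.

```python
import math


def optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Estimate total optimizer steps, one step per batch (assumed setup)."""
    # The last batch of an epoch may be smaller, hence the ceiling.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


# Assumed: 21,000 examples; batch size 12 and 3 epochs come from the card.
total = optimizer_steps(num_examples=21_000, batch_size=12, epochs=3)
print(total)  # 1750 steps per epoch * 3 epochs = 5250
```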
Citation
BibTeX:
@article{Kosar_DePauw_Daelemans_2024,
  title={Comparative Evaluation of Topic Detection: Humans vs. LLMs},
  volume={13},
  url={https://www.clinjournal.org/clinj/article/view/173},
  journal={Computational Linguistics in the Netherlands Journal},
  author={Kosar, Andriy and De Pauw, Guy and Daelemans, Walter},
  year={2024},
  month={Mar.},
  pages={91--120}
}