cnn_dailymail_6789_3000_1500_test

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

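from bertopic import BERTopic

# Download the model from the Hugging Face Hub and load it
topic_model = BERTopic.load("KingKazma/cnn_dailymail_6789_3000_1500_test")

# Return a DataFrame with one row per topic
topic_model.get_topic_info()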
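
Beyond the overview table, individual topics can be inspected with `get_topic`, which returns (word, score) pairs for a topic ID, or `False` for an unknown ID. The following is a minimal sketch; the helper names `top_keywords` and `load_and_inspect` are illustrative and not part of BERTopic.

```python
def top_keywords(topic_model, topic_id, n=5):
    """Return the first n keywords of one topic, or [] if the ID is unknown."""
    # get_topic yields a list of (word, score) pairs, or False for unknown IDs.
    words = topic_model.get_topic(topic_id)
    if not words:
        return []
    return [word for word, _score in words[:n]]


def load_and_inspect(repo_id="KingKazma/cnn_dailymail_6789_3000_1500_test"):
    """Load the model from the Hub and list the keywords of topic 3."""
    # Imported lazily so top_keywords can be used without bertopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic.load(repo_id)
    return top_keywords(topic_model, 3)
```

Calling `load_and_inspect()` requires network access and an installed `bertopic`; `top_keywords` works with any already loaded model.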

Topic overview

  • Number of topics: 15 (including the outlier topic -1)
  • Number of training documents: 1500
| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | season - league - liverpool - player - club | 12 | -1_season_league_liverpool_player |
| 0 | said - one - police - year - people | 151 | 0_said_one_police_year |
| 1 | madrid - league - champions - real - barcelona | 1070 | 1_madrid_league_champions_real |
| 2 | chelsea - united - manchester - van - league | 55 | 2_chelsea_united_manchester_van |
| 3 | fight - pacquiao - ticket - mayweather - boxing | 43 | 3_fight_pacquiao_ticket_mayweather |
| 4 | race - hamilton - rosberg - marathon - vettel | 28 | 4_race_hamilton_rosberg_marathon |
| 5 | england - cook - pietersen - cricket - test | 25 | 5_england_cook_pietersen_cricket |
| 6 | villa - sherwood - benteke - aston - game | 19 | 6_villa_sherwood_benteke_aston |
| 7 | try - minute - huddersfield - bristol - league | 17 | 7_try_minute_huddersfield_bristol |
| 8 | celtic - scottish - rangers - game - inverness | 15 | 8_celtic_scottish_rangers_game |
| 9 | mcilroy - masters - woods - augusta - golf | 14 | 9_mcilroy_masters_woods_augusta |
| 10 | arsenal - wenger - arsenals - reading - coquelin | 14 | 10_arsenal_wenger_arsenals_reading |
| 11 | newcastle - sunderland - advocaat - game - rangers | 13 | 11_newcastle_sunderland_advocaat_game |
| 12 | cup - toulon - saracens - clermont - bath | 12 | 12_cup_toulon_saracens_clermont |
| 13 | stadium - stand - fan - fa - final | 12 | 13_stadium_stand_fan_fa |
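
As a quick sanity check on the table above, the topic frequencies can be summed and compared with the number of training documents. The sketch below copies the counts by hand from the table; the variable names are illustrative.

```python
# Topic frequencies copied from the table above (topic ID -> document count).
topic_counts = {
    -1: 12, 0: 151, 1: 1070, 2: 55, 3: 43, 4: 28, 5: 25, 6: 19,
    7: 17, 8: 15, 9: 14, 10: 14, 11: 13, 12: 12, 13: 12,
}

# The counts should add up to the number of training documents.
total = sum(topic_counts.values())

# Share of documents in the largest topic and in the outlier topic (-1).
largest_topic = max(topic_counts, key=topic_counts.get)
largest_share = topic_counts[largest_topic] / total
outlier_share = topic_counts[-1] / total

print(f"total documents: {total}")
print(f"largest topic: {largest_topic} ({largest_share:.1%})")
print(f"outliers: {outlier_share:.1%}")
```

The distribution is heavily skewed: a single topic holds most of the documents, which is worth keeping in mind when interpreting the smaller topics.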

Training hyperparameters

  • calculate_probabilities: True
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
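
The hyperparameters above map directly onto keyword arguments of the `BERTopic` constructor. The following sketch shows how a model with the same settings might be configured. It is not a reproduction of the original training run: the card does not list the embedding model, the UMAP or HDBSCAN settings, or a random seed, so those fall back to BERTopic's defaults here, and the helper name `build_topic_model` is illustrative.

```python
# Hyperparameters as listed in this card.
hyperparameters = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def build_topic_model(params=hyperparameters):
    """Create an untrained BERTopic model with the listed settings."""
    # Imported lazily so the settings can be inspected without bertopic installed.
    from bertopic import BERTopic

    # Sub-models (embeddings, UMAP, HDBSCAN) are left at library defaults,
    # which is an assumption; the card does not document them.
    return BERTopic(**params)
```

A model built this way would still need `fit_transform` on the training documents, and the resulting topics may differ from those in the table.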

Framework versions

  • Numpy: 1.22.4
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.3
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.31.0
  • Numba: 0.56.4
  • Plotly: 5.13.1
  • Python: 3.10.6
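
Loading a saved model can be sensitive to library versions, so it may help to compare the local environment against the versions listed above. The sketch below uses only the standard library. The mapping from the names in this card to PyPI distribution names (for example `umap-learn` for UMAP) is an assumption, and `version_mismatches` is an illustrative helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by assumed PyPI distribution name.
expected_versions = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.31.0",
    "numba": "0.56.4",
    "plotly": "5.13.1",
}


def version_mismatches(expected):
    """Map each differing package to (listed version, installed version or None)."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in version_mismatches(expected_versions).items():
    print(f"{name}: card lists {wanted}, installed {installed}")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the environment differs from the one recorded here.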