bertopic_WGnews_Oct31

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("tyrealqian/bertopic_WGnews_Oct31")

topic_model.get_topic_info()
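Beyond listing topics, a loaded model can also assign topics to unseen text. The sketch below wraps BERTopic's `transform` and `get_topic` calls in a small helper. The helper name `label_documents` and the sample headline are illustrative and not part of this model, and the sketch assumes the embedding model was stored with the checkpoint so that `transform` can embed new documents.

```python
def label_documents(topic_model, docs, n_words=5):
    """Return (document, topic_id, top keywords) for each input document.

    `topic_model` is expected to be a loaded BERTopic model; this helper and
    its signature are illustrative, not part of the BERTopic API.
    """
    # transform() returns one topic ID per document plus probabilities.
    topics, _ = topic_model.transform(docs)
    results = []
    for doc, topic_id in zip(docs, topics):
        # get_topic() yields (word, score) pairs, or False for an unknown ID.
        words = topic_model.get_topic(int(topic_id)) or []
        results.append((doc, int(topic_id), [word for word, _ in words[:n_words]]))
    return results


# Usage with the model from this card (requires bertopic and network access):
# from bertopic import BERTopic
# topic_model = BERTopic.load("tyrealqian/bertopic_WGnews_Oct31")
# for doc, topic_id, keywords in label_documents(topic_model, ["Torch relay begins in Beijing"]):
#     print(topic_id, keywords)
```

A topic ID of -1 in the result means the document was treated as an outlier and not matched to any of the learned topics.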

Topic overview

  • Number of topics: 28
  • Number of training documents: 6196
| Topic ID | Topic Keywords | Topic Frequency | Label |
|---|---|---|---|
| -1 | beijing - winter - olympics - winter olympics - olympic | 18 | -1_beijing_winter_olympics_winter olympics |
| 0 | gold - medal - olympics - beijing - womens | 2054 | 0_gold_medal_olympics_beijing |
| 1 | covid - olympics - beijing - cases - winter | 633 | 1_covid_olympics_beijing_cases |
| 2 | gold - gu - womens - chinas - mens | 524 | 2_gold_gu_womens_chinas |
| 3 | president - xi - xi jinping - jinping - president xi | 388 | 3_president_xi_xi jinping_jinping |
| 4 | boycott - diplomatic - diplomatic boycott - boycott beijing - rights | 372 | 4_boycott_diplomatic_diplomatic boycott_boycott beijing |
| 5 | dwen - mascot - bing - bing dwen - dwen dwen | 328 | 5_dwen_mascot_bing_bing dwen |
| 6 | ceremony - opening - opening ceremony - beijing - ceremony beijing | 305 | 6_ceremony_opening_opening ceremony_beijing |
| 7 | kamila - valieva - kamila valieva - russian - figure | 249 | 7_kamila_valieva_kamila valieva_russian |
| 8 | torch - flame - relay - torch relay - olympic | 208 | 8_torch_flame_relay_torch relay |
| 9 | venue - ice - venues - zhangjiakou - beijing | 194 | 9_venue_ice_venues_zhangjiakou |
| 10 | sports - winter sports - winter - globalink - snow | 159 | 10_sports_winter sports_winter_globalink |
| 11 | food - robot - robots - served - serving | 122 | 11_food_robot_robots_served |
| 12 | green - carbon - games - beijing - winter | 120 | 12_green_carbon_games_beijing |
| 13 | coverage - heres - day - olympics - gold | 90 | 13_coverage_heres_day_olympics |
| 14 | bach - thomas bach - thomas - president thomas - ioc | 59 | 14_bach_thomas bach_thomas_president thomas |
| 15 | snow - snowfall - heavy - weather - heavy snowfall | 48 | 15_snow_snowfall_heavy_weather |
| 16 | bank - commemorative - digital - yuan - set | 43 | 16_bank_commemorative_digital_yuan |
| 17 | paralympic - paralympic games - games - paralympic winter - winter paralympic | 37 | 17_paralympic_paralympic games_games_paralympic winter |
| 18 | phones - personal - burner - app - smartphonelike | 34 | 18_phones_personal_burner_app |
| 19 | nbc - nbcuniversal - ads - ratings - nbcs | 31 | 19_nbc_nbcuniversal_ads_ratings |
| 20 | watch beijing - watch - athletes watch - know - names | 27 | 20_watch beijing_watch_athletes watch_know |
| 21 | ukraine - invasion - russian - invasion ukraine - ukraine beijing | 27 | 21_ukraine_invasion_russian_invasion ukraine |
| 22 | city - summer winter - summer - host summer - city host | 27 | 22_city_summer winter_summer_host summer |
| 23 | leduc - nonbinary - timothy leduc - timothy - openly | 26 | 23_leduc_nonbinary_timothy leduc_timothy |
| 24 | ralph lauren - lauren - ralph - uniforms - team | 26 | 24_ralph lauren_lauren_ralph_uniforms |
| 25 | peng - shuai - peng shuai - tennis - chinese tennis | 25 | 25_peng_shuai_peng shuai_tennis |
| 26 | women - female athletes - record - athletes - female | 22 | 26_women_female athletes_record_athletes |
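The frequency column sums to the 6196 training documents, so each count can be read as a share of the corpus. Topic -1 is the bucket BERTopic uses for outlier documents that were not assigned to any cluster. A minimal sketch of that arithmetic, using only the first few rows of the table (`topic_share` is an illustrative helper, not a BERTopic function):

```python
# Frequencies copied from the topic table above (first rows only, for brevity).
TOTAL_DOCS = 6196
topic_counts = {-1: 18, 0: 2054, 1: 633, 2: 524}


def topic_share(count, total=TOTAL_DOCS):
    """Fraction of training documents assigned to a topic."""
    return count / total


for topic_id, count in topic_counts.items():
    kind = "outliers" if topic_id == -1 else "topic"
    print(f"{topic_id:>3} ({kind}): {topic_share(count):.1%}")
```

Read this way, topic 0 alone covers roughly a third of the corpus, while the outlier bucket is well under one percent.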

Training hyperparameters

  • calculate_probabilities: True
  • language: None
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
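The settings above map onto keyword arguments of the `BERTopic` constructor. The sketch below shows one way they might be passed. It is not a reproduction recipe: the card does not record the embedding model or the UMAP, HDBSCAN, and vectorizer sub-models, so `embedding_model` is left as a required argument and `build_model` is a hypothetical helper. A `language` of `None` typically indicates that a custom embedding model was supplied, which is why that key is dropped here.

```python
# Values copied from the hyperparameter list above.
RECORDED_PARAMS = {
    "calculate_probabilities": True,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_model(embedding_model):
    """Construct an untrained BERTopic model with the recorded settings.

    `embedding_model` is an assumption: the card does not name the one used.
    """
    from bertopic import BERTopic  # imported lazily; needs `pip install bertopic`

    # `language` is None in the card, so it is omitted and the explicit
    # embedding model is passed instead.
    kwargs = {k: v for k, v in RECORDED_PARAMS.items() if k != "language"}
    return BERTopic(embedding_model=embedding_model, **kwargs)
```

Calling `build_model(...)` with a sentence-transformers model name or object returns an unfitted model. Fitting it on the same documents would not necessarily yield the 28 topics listed here, because the clustering sub-models and random seeds are not recorded.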

Framework versions

  • Numpy: 1.26.4
  • HDBSCAN: 0.8.39
  • UMAP: 0.5.7
  • Pandas: 2.2.2
  • Scikit-Learn: 1.5.2
  • Sentence-transformers: 3.2.1
  • Transformers: 4.44.2
  • Numba: 0.60.0
  • Plotly: 5.24.1
  • Python: 3.10.12