xsum_6789_3000_1500_test
This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
Usage
To use this model, please install BERTopic:
pip install -U bertopic
You can use the model as follows:
from bertopic import BERTopic
topic_model = BERTopic.load("KingKazma/xsum_6789_3000_1500_test")
topic_model.get_topic_info()
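Once loaded, the model can also assign topics to new text. The following is a minimal sketch rather than an official recipe: `describe_documents` is a hypothetical helper (not part of BERTopic) that wraps the library's `transform` and `get_topic` methods. It assumes the loaded model includes its embedding model, so that `transform` can embed raw text.

```python
from typing import List, Sequence, Tuple


def describe_documents(topic_model, docs: Sequence[str],
                       top_k: int = 5) -> List[Tuple[int, List[str]]]:
    """Return (topic_id, top keywords) for each document.

    `topic_model` is expected to be a loaded BERTopic instance; only its
    `transform` and `get_topic` methods are used.
    """
    # transform() returns the predicted topic per document plus probabilities.
    topics, _probs = topic_model.transform(list(docs))
    results = []
    for topic_id in topics:
        topic_id = int(topic_id)
        # get_topic() yields (word, score) pairs, or False for an unknown topic.
        words = topic_model.get_topic(topic_id) or []
        results.append((topic_id, [word for word, _score in words[:top_k]]))
    return results
```

For example, `describe_documents(topic_model, ["Police said a man appeared in court on Monday."])` returns one `(topic_id, keywords)` pair per input document. A topic ID of -1 means the document was treated as an outlier.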
Topic overview
- Number of topics: 27 (topics 0 to 25, plus the outlier topic -1)
- Number of training documents: 1500
Overview of all topics:
Topic ID | Topic Keywords | Topic Frequency | Label
---|---|---|---
-1 | said - people - would - also - one | 10 | -1_said_people_would_also |
0 | police - said - court - mr - found | 508 | 0_police_said_court_mr |
1 | mr - us - said - president - military | 144 | 1_mr_us_said_president |
2 | sport - team - world - race - champion | 136 | 2_sport_team_world_race |
3 | wales - vote - party - said - labour | 96 | 3_wales_vote_party_said |
4 | foul - win - right - box - half | 84 | 4_foul_win_right_box |
5 | care - nhs - tax - said - health | 62 | 5_care_nhs_tax_said |
6 | league - club - season - appearance - football | 50 | 6_league_club_season_appearance |
7 | wicket - cricket - england - ball - test | 36 | 7_wicket_cricket_england_ball |
8 | rate - share - bank - growth - price | 35 | 8_rate_share_bank_growth |
9 | rugby - england - wales - player - ospreys | 31 | 9_rugby_england_wales_player |
10 | school - teacher - education - child - council | 29 | 10_school_teacher_education_child |
11 | road - crash - police - collision - barrier | 27 | 11_road_crash_police_collision |
12 | fire - said - rescue - plane - injured | 27 | 12_fire_said_rescue_plane |
13 | music - radio - band - singer - show | 27 | 13_music_radio_band_singer |
14 | passenger - airport - railway - said - scotrail | 24 | 14_passenger_airport_railway_said |
15 | museum - painting - said - collection - royal | 23 | 15_museum_painting_said_collection |
16 | road - flooding - weather - beach - rain | 22 | 16_road_flooding_weather_beach |
17 | eu - trade - european - bank - deal | 19 | 17_eu_trade_european_bank |
18 | cell - cancer - ebola - disease - human | 18 | 18_cell_cancer_ebola_disease |
19 | temperature - dr - glacier - heat - researcher | 16 | 19_temperature_dr_glacier_heat |
20 | bitcoin - software - android - superfish - battery | 15 | 20_bitcoin_software_android_superfish |
21 | club - football - league - manager - rodgers | 14 | 21_club_football_league_manager |
22 | zwolle - pec - ajax - zidane - real | 13 | 22_zwolle_pec_ajax_zidane |
23 | film - best - actress - role - gillan | 12 | 23_film_best_actress_role |
24 | women - mexico - denmark - footed - romania | 12 | 24_women_mexico_denmark_footed |
25 | dairy - comedy - uk - export - food | 10 | 25_dairy_comedy_uk_export |
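Each label in the table appears to follow the pattern `<topic id>_<first four keywords>`. As a small sketch under that assumption, the hypothetical helper `parse_label` splits a label back into its parts. The frequency check confirms that the counts in the table add up to the 1500 training documents.

```python
from typing import List, Tuple


def parse_label(label: str) -> Tuple[int, List[str]]:
    """Split a label such as '0_police_said_court_mr' into (0, [keywords])."""
    # Split on the first underscore only, so the ID "-1" is handled too.
    topic_id, _, rest = label.partition("_")
    return int(topic_id), rest.split("_")


# Topic frequencies copied from the table above, in row order (-1, 0, ..., 25).
FREQUENCIES = [10, 508, 144, 136, 96, 84, 62, 50, 36, 35, 31, 29, 27, 27, 27,
               24, 23, 22, 19, 18, 16, 15, 14, 13, 12, 12, 10]

print(parse_label("-1_said_people_would_also"))
# → (-1, ['said', 'people', 'would', 'also'])
print(len(FREQUENCIES), sum(FREQUENCIES))
# → 27 1500
```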
Training hyperparameters
- calculate_probabilities: True
- language: english
- low_memory: False
- min_topic_size: 10
- n_gram_range: (1, 1)
- nr_topics: None
- seed_topic_list: None
- top_n_words: 10
- verbose: False
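The values above correspond to keyword arguments of the `BERTopic` constructor. A comparable, untrained model could be set up roughly as follows. This is a sketch, not the author's training script: the card does not state the embedding, UMAP, or HDBSCAN settings, so library defaults are assumed, and `build_model` is a hypothetical helper name.

```python
# Hyperparameters as listed above.
HYPERPARAMS = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def build_model():
    """Create an untrained BERTopic instance with the listed settings."""
    # Imported lazily so this file can be read without bertopic installed.
    from bertopic import BERTopic
    return BERTopic(**HYPERPARAMS)
```

Calling `build_model().fit_transform(docs)` on a list of strings would then train a new model. The resulting topics will generally differ from the table above unless the same documents, embeddings, and random state are used.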
Framework versions
- Numpy: 1.22.4
- HDBSCAN: 0.8.33
- UMAP: 0.5.3
- Pandas: 1.5.3
- Scikit-Learn: 1.2.2
- Sentence-transformers: 2.2.2
- Transformers: 4.31.0
- Numba: 0.57.1
- Plotly: 5.13.1
- Python: 3.10.12
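Loading a saved model in a different environment can behave differently, so it may help to compare the local installation against the versions listed above. The sketch below uses only the standard library. The PyPI distribution names (for example `umap-learn` for UMAP) are an assumption, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed above, keyed by assumed PyPI distribution name.
EXPECTED = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.31.0",
    "numba": "0.57.1",
    "plotly": "5.13.1",
}


def compare_versions(expected=EXPECTED):
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed)
    return report


for name, (wanted, installed) in compare_versions().items():
    flag = "ok" if installed == wanted else "differs"
    print(f"{name}: expected {wanted}, installed {installed} ({flag})")
```

A mismatch does not necessarily mean the model will fail to load. It is only a hint about where to look if loading or inference misbehaves.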