---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---
# BERTopic
This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
## Usage
To use this model, please install BERTopic:
```
pip install -U bertopic
```
You can use the model as follows:
```python
from bertopic import BERTopic

# Load the model from the Hugging Face Hub, then list its topics as a DataFrame.
topic_model = BERTopic.load("keonju/BERTopic")
topic_model.get_topic_info()
```
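Beyond listing topics, a loaded model can usually assign topics to unseen text. The following is a minimal sketch, assuming that `BERTopic.transform` works for this saved model and that its embedding model can be restored at load time. The helper names `assign_topics` and `count_assignments` are illustrative and not part of BERTopic.

```python
from collections import Counter


def count_assignments(topics):
    """Count how many documents fall into each topic ID (-1 is the outlier topic)."""
    return Counter(int(t) for t in topics)


def assign_topics(docs, model_id="keonju/BERTopic"):
    """Hypothetical helper: load the model and predict one topic ID per document."""
    # Imported lazily so count_assignments can be used without bertopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic.load(model_id)
    # transform returns (topic IDs, probabilities); only the IDs are kept here.
    topics, _probs = topic_model.transform(docs)
    return list(topics)
```

Calling `count_assignments(assign_topics(docs))` on a list of abstracts would then give the number of documents per predicted topic.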
## Topic overview
* Number of topics: 158 (including the outlier topic `-1`)
* Number of training documents: 10158
<details>
<summary>Click here for an overview of all topics.</summary>
| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | and - the - of - in - to | 10 | -1_and_the_of_in |
| 0 | holocene - china - the - monsoon - bp | 3858 | 0_holocene_china_the_monsoon |
| 1 | energy - biofuels - production - biodiesel - bioenergy | 291 | 1_energy_biofuels_production_biodiesel |
| 2 | coal - coals - the - basin - seams | 248 | 2_coal_coals_the_basin |
| 3 | yr - holocene - the - bp - and | 205 | 3_yr_holocene_the_bp |
| 4 | hg - mercury - mehg - of hg - in | 202 | 4_hg_mercury_mehg_of hg |
| 5 | ch4 - methane - emissions - fluxes - flux | 159 | 5_ch4_methane_emissions_fluxes |
| 6 | data - forest - spectral - for - mapping | 118 | 6_data_forest_spectral_for |
| 7 | bp - the - holocene - pollen - lake | 116 | 7_bp_the_holocene_pollen |
| 8 | wetlands - wetland - and - are - of | 104 | 8_wetlands_wetland_and_are |
| 9 | co2 - ecosystem - nee - exchange - net | 103 | 9_co2_ecosystem_nee_exchange |
| 10 | species - of - fen - the - restoration | 100 | 10_species_of_fen_the |
| 11 | peat - tropical - peatlands - palm - peatland | 98 | 11_peat_tropical_peatlands_palm |
| 12 | pb - lead - atmospheric - metal - deposition | 96 | 12_pb_lead_atmospheric_metal |
| 13 | the - lake - of the - of - poland | 93 | 13_the_lake_of the_of |
| 14 | pm2 - haze - burning - air - aerosol | 90 | 14_pm2_haze_burning_air |
| 15 | doc - catchments - carbon - organic carbon - export | 88 | 15_doc_catchments_carbon_organic carbon |
| 16 | the - carbon - of - co2 - of the | 73 | 16_the_carbon_of_co2 |
| 17 | wetland - wetlands - classification - mapping - and | 69 | 17_wetland_wetlands_classification_mapping |
| 18 | uv - ozone - o3 - isoprene - elevated | 67 | 18_uv_ozone_o3_isoprene |
| 19 | mediterranean - the - glacial - iberian - during | 66 | 19_mediterranean_the_glacial_iberian |
| 20 | media - compost - growing media - growing - biochar | 63 | 20_media_compost_growing media_growing |
| 21 | 137cs - of 137cs - sup - ce sup - radiocaesium | 63 | 21_137cs_of 137cs_sup_ce sup |
| 22 | testate - amoebae - testate amoebae - of testate - amoeba | 62 | 22_testate_amoebae_testate amoebae_of testate |
| 23 | peat - pyrolysis - lignin - gc - of | 62 | 23_peat_pyrolysis_lignin_gc |
| 24 | cu - zn - metals - peat - elements | 62 | 24_cu_zn_metals_peat |
| 25 | alkanes - alkane - chain - values - plants | 61 | 25_alkanes_alkane_chain_values |
| 26 | permafrost - active layer - thermal - ground - layer | 60 | 26_permafrost_active layer_thermal_ground |
| 27 | streams - diatom - species - macroinvertebrate - stream | 60 | 27_streams_diatom_species_macroinvertebrate |
| 28 | records - the - of - record - ireland | 60 | 28_records_the_of_record |
| 29 | water - flow - groundwater - recharge - runoff | 59 | 29_water_flow_groundwater_recharge |
| 30 | habitat - species - breeding - bird - nest | 57 | 30_habitat_species_breeding_bird |
| 31 | brgdgts - gdgts - glycerol - brgdgt - branched | 56 | 31_brgdgts_gdgts_glycerol_brgdgt |
| 32 | deposition - nitrogen - nitrogen deposition - sphagnum - of | 55 | 32_deposition_nitrogen_nitrogen deposition_sphagnum |
| 33 | oil sands - sands - fen - oil - reclamation | 54 | 33_oil sands_sands_fen_oil |
| 34 | fire - burned - severity - burning - post fire | 54 | 34_fire_burned_severity_burning |
| 35 | acidification - deposition - acid - ph - catchment | 54 | 35_acidification_deposition_acid_ph |
| 36 | farm - land - agricultural - farmers - policy | 53 | 36_farm_land_agricultural_farmers |
| 37 | cdom - doc - dom - dissolved organic - dissolved | 53 | 37_cdom_doc_dom_dissolved organic |
| 38 | redd - indonesia - deforestation - in indonesia - forest | 50 | 38_redd_indonesia_deforestation_in indonesia |
| 39 | ash - wood ash - wood - growth - of wood | 49 | 39_ash_wood ash_wood_growth |
| 40 | fungal - fungi - mycorrhizal - species - root | 49 | 40_fungal_fungi_mycorrhizal_species |
| 41 | stand - growth - models - tree - stands | 49 | 41_stand_growth_models_tree |
| 42 | smouldering - smoldering - spread - peat - combustion | 49 | 42_smouldering_smoldering_spread_peat |
| 43 | pollen - of pollen - vegetation - of - from | 49 | 43_pollen_of pollen_vegetation_of |
| 44 | arsenic - as - of as - fe - of arsenic | 49 | 44_arsenic_as_of as_fe |
| 45 | ch4 - methane - production - peat - methanogenesis | 47 | 45_ch4_methane_production_peat |
| 46 | africa - the - bp - south - late | 46 | 46_africa_the_bp_south |
| 47 | soc - carbon - soil - stocks - land | 45 | 47_soc_carbon_soil_stocks |
| 48 | soil - organic - carbon - soil organic - soils | 45 | 48_soil_organic_carbon_soil organic |
| 49 | wetlands - constructed - wetland - treatment - phosphorus | 43 | 49_wetlands_constructed_wetland_treatment |
| 50 | microbial - rare - soil - bacterial - diversity | 43 | 50_microbial_rare_soil_bacterial |
| 51 | litter - decomposition - mass loss - litter decomposition - mass | 39 | 51_litter_decomposition_mass loss_litter decomposition |
| 52 | co2 - pco2 - emissions - carbon - ch4 | 39 | 52_co2_pco2_emissions_carbon |
| 53 | soc - carbon - wetland - wetlands - soil | 39 | 53_soc_carbon_wetland_wetlands |
| 54 | countries - emissions - emission - to - climate | 38 | 54_countries_emissions_emission_to |
| 55 | services - ecosystem - ecosystem services - es - pes | 37 | 55_services_ecosystem_ecosystem services_es |
| 56 | catalyst - peat - pyrolysis - char - catalysts | 37 | 56_catalyst_peat_pyrolysis_char |
| 57 | clearfelling - water - phosphorus - buffer - nutrient | 35 | 57_clearfelling_water_phosphorus_buffer |
| 58 | forest - forests - trees - tree - stands | 35 | 58_forest_forests_trees_tree |
| 59 | carbon - climate - atmosphere - earth - carbon cycle | 34 | 59_carbon_climate_atmosphere_earth |
| 60 | tephra - volcanic - cryptotephra - eruptions - tephras | 34 | 60_tephra_volcanic_cryptotephra_eruptions |
| 61 | testate - arcellinida - coi - species - amoebae | 34 | 61_testate_arcellinida_coi_species |
| 62 | methane - methanogenic - community - methanogen - methanogens | 34 | 62_methane_methanogenic_community_methanogen |
| 63 | consolidation - soil - embankment - road - the | 33 | 63_consolidation_soil_embankment_road |
| 64 | species - spider - bogs - spiders - habitat | 33 | 64_species_spider_bogs_spiders |
| 65 | evaporation - energy - model - was - the | 33 | 65_evaporation_energy_model_was |
| 66 | phosphorus - catchment - in - tp - concentrations | 33 | 66_phosphorus_catchment_in_tp |
| 67 | co2 - ch4 - marsh - wetland - emissions | 33 | 67_co2_ch4_marsh_wetland |
| 68 | runoff - peat - channels - flow - catchment | 33 | 68_runoff_peat_channels_flow |
| 69 | nutrient - nitrogen - fertilizer - litter - of | 32 | 69_nutrient_nitrogen_fertilizer_litter |
| 70 | brazil - bp - the - of - in the | 31 | 70_brazil_bp_the_of |
| 71 | tsunami - holocene - the - volcanic - deposits | 30 | 71_tsunami_holocene_the_volcanic |
| 72 | climate change - change - climate - biodiversity - ecosystem | 30 | 72_climate change_change_climate_biodiversity |
| 73 | gpr - resistivity - radar - penetrating - penetrating radar | 29 | 73_gpr_resistivity_radar_penetrating |
| 74 | holocene - the - andes - and - bp | 29 | 74_holocene_the_andes_and |
| 75 | permafrost - soc - soil - soils - arctic | 28 | 75_permafrost_soc_soil_soils |
| 76 | policy - forest - owners - arguments - forest owners | 28 | 76_policy_forest_owners_arguments |
| 77 | bog - poland - peatland - europe - ca | 28 | 77_bog_poland_peatland_europe |
| 78 | ch4 - oxidation - methane - paddy - aom | 28 | 78_ch4_oxidation_methane_paddy |
| 79 | enzyme - enzymes - eea - soil - activities | 28 | 79_enzyme_enzymes_eea_soil |
| 80 | channel - catchment - flow - bends - model | 28 | 80_channel_catchment_flow_bends |
| 81 | soil - soil science - science - of soil - eu | 27 | 81_soil_soil science_science_of soil |
| 82 | pahs - pah - polycyclic aromatic - polycyclic - aromatic | 27 | 82_pahs_pah_polycyclic aromatic_polycyclic |
| 83 | n2o - n2o emissions - emissions - emission - nitrous | 26 | 83_n2o_n2o emissions_emissions_emission |
| 84 | peat water - adsorption - electrocoagulation - brackish peat - brackish peat water | 26 | 84_peat water_adsorption_electrocoagulation_brackish peat |
| 85 | mangrove - mangroves - carbon - coastal - b2 | 26 | 85_mangrove_mangroves_carbon_coastal |
| 86 | species - retention - alien - richness - forests | 25 | 86_species_retention_alien_richness |
| 87 | colloidal - river - elements - fe - colloids | 25 | 87_colloidal_river_elements_fe |
| 88 | sulfate - sulfur - 34s - peat - sulphur | 24 | 88_sulfate_sulfur_34s_peat |
| 89 | caribou - habitat - woodland caribou - populations - wolf | 24 | 89_caribou_habitat_woodland caribou_populations |
| 90 | food - agriculture - food system - change - covid 19 | 24 | 90_food_agriculture_food system_change |
| 91 | microbial - community - microbial community - communities - bacterial | 23 | 91_microbial_community_microbial community_communities |
| 92 | sorption - cu - ions - ii - cu ii | 22 | 92_sorption_cu_ions_ii |
| 93 | fire - fires - algorithm - frp - hotspot | 22 | 93_fire_fires_algorithm_frp |
| 94 | choice - wtp - preferences - valuation - choice experiment | 22 | 94_choice_wtp_preferences_valuation |
| 95 | nematodes - earthworm - soil - food - nematode | 22 | 95_nematodes_earthworm_soil_food |
| 96 | conservation - orangutan - habitat - forest - species | 21 | 96_conservation_orangutan_habitat_forest |
| 97 | cushion - accumulation - peat - amazonian - vegetation | 21 | 97_cushion_accumulation_peat_amazonian |
| 98 | ch4 - oxidation - ch4 oxidation - uptake - ch4 uptake | 20 | 98_ch4_oxidation_ch4 oxidation_uptake |
| 99 | tidal - sediment - coastal - delta - the | 20 | 99_tidal_sediment_coastal_delta |
| 100 | emissions - co2 - ghg - n2o - table | 20 | 100_emissions_co2_ghg_n2o |
| 101 | methane - ph - cytochrome - methanotrophs - acetic acid | 20 | 101_methane_ph_cytochrome_methanotrophs |
| 102 | patterns - model - self organization - evolutionary - self | 20 | 102_patterns_model_self organization_evolutionary |
| 103 | nitrogen - denitrification - n2o - soil - n2 | 20 | 103_nitrogen_denitrification_n2o_soil |
| 104 | birch - rotation - biomass - buds - biomass production | 19 | 104_birch_rotation_biomass_buds |
| 105 | fire - wildfire - fires - wildfires - health | 19 | 105_fire_wildfire_fires_wildfires |
| 106 | grazing - heathland - heather - moorland - england | 19 | 106_grazing_heathland_heather_moorland |
| 107 | emissions - fire - burning - fire emissions - biomass burning | 19 | 107_emissions_fire_burning_fire emissions |
| 108 | peat - landslides - failure - of peat - peat compaction | 18 | 108_peat_landslides_failure_of peat |
| 109 | biochar - straw - soil - fe - bc | 18 | 109_biochar_straw_soil_fe |
| 110 | ecosystem - respiration - carbon - ecosystem respiration - meadow | 17 | 110_ecosystem_respiration_carbon_ecosystem respiration |
| 111 | wetland - wetlands - risk - of wetland - the wetland | 17 | 111_wetland_wetlands_risk_of wetland |
| 112 | dom - thm - groundwater - molecular - organic | 17 | 112_dom_thm_groundwater_molecular |
| 113 | geochemistry - landscape geochemistry - rocks - peat - mafic | 17 | 113_geochemistry_landscape geochemistry_rocks_peat |
| 114 | tundra - ch4 - n2o - fluxes - antarctic | 16 | 114_tundra_ch4_n2o_fluxes |
| 115 | cellulose - sphagnum - isotopic - isotope - δ18ocel | 16 | 115_cellulose_sphagnum_isotopic_isotope |
| 116 | solute - transport - chloride - peat - pore | 16 | 116_solute_transport_chloride_peat |
| 117 | charcoal - fire - fires - holocene - fire history | 15 | 117_charcoal_fire_fires_holocene |
| 118 | ghg - agricultural - dairy - abatement - emissions | 15 | 118_ghg_agricultural_dairy_abatement |
| 119 | palm - oil - palm oil - sustainability - industry | 15 | 119_palm_oil_palm oil_sustainability |
| 120 | humic - humic substances - substances - acids - fluorescence | 15 | 120_humic_humic substances_substances_acids |
| 121 | canopy - ndvi - pri - lue - phenological | 15 | 121_canopy_ndvi_pri_lue |
| 122 | pollen - bog - peat - the - human impact | 15 | 122_pollen_bog_peat_the |
| 123 | marshes - tidal - marshes are - salt - or | 15 | 123_marshes_tidal_marshes are_salt |
| 124 | soil - prediction - mapping - covariates - dsm | 15 | 124_soil_prediction_mapping_covariates |
| 125 | si - of si - silicon - biogenic - protozoic | 14 | 125_si_of si_silicon_biogenic |
| 126 | et - evapotranspiration - le - wetland - rice | 14 | 126_et_evapotranspiration_le_wetland |
| 127 | forest - finland - forests - stock - management | 14 | 127_forest_finland_forests_stock |
| 128 | iodine - 129i - sorption - iodide - the sorption | 14 | 128_iodine_129i_sorption_iodide |
| 129 | palm - oil - palm oil - smallholders - certification | 14 | 129_palm_oil_palm oil_smallholders |
| 130 | dndc - model - models - soil - carbon | 14 | 130_dndc_model_models_soil |
| 131 | snow - thaw - cover - sca - data | 14 | 131_snow_thaw_cover_sca |
| 132 | stx2 - microbiota - gut - gut microbiota - microbial | 13 | 132_stx2_microbiota_gut_gut microbiota |
| 133 | dom - doc - organic - dissolved organic - of dom | 13 | 133_dom_doc_organic_dissolved organic |
| 134 | forest - cbm - ontario - cfs3 - cbm cfs3 | 13 | 134_forest_cbm_ontario_cfs3 |
| 135 | wind - wind farms - farms - onshore - onshore wind | 13 | 135_wind_wind farms_farms_onshore |
| 136 | uranium - of uranium - 232th - th - ar | 13 | 136_uranium_of uranium_232th_th |
| 137 | groundwater - springs - spring - gdes - discharge | 13 | 137_groundwater_springs_spring_gdes |
| 138 | fire - forest - boreal - burned - fires | 13 | 138_fire_forest_boreal_burned |
| 139 | metal - metals - cd - sediments - zn | 13 | 139_metal_metals_cd_sediments |
| 140 | slr - sea level - coastal - sea - sea level rise | 13 | 140_slr_sea level_coastal_sea |
| 141 | damo - methane - anaerobic - oxidation - aom | 12 | 141_damo_methane_anaerobic_oxidation |
| 142 | temperature - microbial - soil - co2 - pd | 12 | 142_temperature_microbial_soil_co2 |
| 143 | soil - respiration - root - soil respiration - enchytraeid | 12 | 143_soil_respiration_root_soil respiration |
| 144 | kerp - fusiformisporites - permian - genus - flora | 11 | 144_kerp_fusiformisporites_permian_genus |
| 145 | dust - dust deposition - dust sources - deposition - atmospheric dust | 11 | 145_dust_dust deposition_dust sources_deposition |
| 146 | methane - sources - ch4 - les - de | 11 | 146_methane_sources_ch4_les |
| 147 | n2o - n2o emissions - emissions - permafrost - n2o fluxes | 11 | 147_n2o_n2o emissions_emissions_permafrost |
| 148 | australia - mis - record - ka - crater | 11 | 148_australia_mis_record_ka |
| 149 | oc - fjords - fjord - lakes - of oc | 10 | 149_oc_fjords_fjord_lakes |
| 150 | fe - reduction - fe iii - sr10 - iron | 10 | 150_fe_reduction_fe iii_sr10 |
| 151 | loading - eutrophication - nitrogen - coastal - phytoplankton | 10 | 151_loading_eutrophication_nitrogen_coastal |
| 152 | model - wetlands - groundwater - water - the wetlands | 10 | 152_model_wetlands_groundwater_water |
| 153 | co2 - soil - co2 efflux - soil co2 efflux - soil co2 | 10 | 153_co2_soil_co2 efflux_soil co2 efflux |
| 154 | transfer - transfer functions - transfer function - testate - functions | 10 | 154_transfer_transfer functions_transfer function_testate |
| 155 | peat - spain - bog - matter - autofluorescent | 10 | 155_peat_spain_bog_matter |
| 156 | isbas - insar - subsidence - motion - deformation | 10 | 156_isbas_insar_subsidence_motion |
</details>
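The labels in the table appear to follow the pattern `<topic ID>_<keyword>_<keyword>_...`, with the first four keywords joined by underscores. A small sketch for splitting such a label and for expressing a topic's frequency as a share of the 10158 training documents is shown below. It assumes that keywords never contain underscores (true for the rows listed here) and that "Topic Frequency" counts training documents. Both function names are hypothetical.

```python
def parse_label(label):
    """Split a label such as '4_hg_mercury_mehg_of hg' into (topic_id, keywords)."""
    topic_id, *keywords = label.split("_")
    return int(topic_id), keywords


def topic_share(frequency, n_documents=10158):
    """Fraction of the training documents assigned to a topic."""
    return frequency / n_documents
```

For example, `topic_share(3858)` for topic 0 comes out at roughly 0.38, so under these assumptions that single topic covers a little over a third of the corpus.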
## Training hyperparameters
* calculate_probabilities: False
* language: None
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 3)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 30
* verbose: False
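As a rough illustration, the settings above map directly onto keyword arguments of the `BERTopic` constructor. The sketch below only builds an unfitted model. It would not reproduce this model, because the card does not record the embedding model, the UMAP and HDBSCAN settings, or the training corpus. `build_untrained_model` is a hypothetical helper name.

```python
# Values copied from the "Training hyperparameters" list.
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 3),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 30,
    "verbose": False,
}


def build_untrained_model(embedding_model=None):
    """Hypothetical helper: create an unfitted BERTopic with the listed settings."""
    # Lazy import; requires `pip install -U bertopic`.
    from bertopic import BERTopic

    # embedding_model is not recorded in the card, so it is left to the caller.
    return BERTopic(embedding_model=embedding_model, **HYPERPARAMETERS)
```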
## Framework versions
* Numpy: 1.22.4
* HDBSCAN: 0.8.29
* UMAP: 0.5.3
* Pandas: 1.5.3
* Scikit-Learn: 1.2.2
* Sentence-transformers: 2.2.2
* Transformers: 4.30.2
* Numba: 0.56.4
* Plotly: 5.13.1
* Python: 3.10.12
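Version mismatches are a plausible source of loading problems, so it can help to compare the local environment with the list above. The sketch below uses only the standard library. The dictionary keys are PyPI distribution names (UMAP is published as `umap-learn`), and `installed_version` and `find_mismatches` are illustrative names. The Python version itself is not checked.

```python
from importlib import metadata

# Versions copied from the "Framework versions" list, keyed by PyPI name.
EXPECTED = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.29",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.30.2",
    "numba": "0.56.4",
    "plotly": "5.13.1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected=EXPECTED, lookup=installed_version):
    """Map each package whose version differs to (expected, installed)."""
    return {
        name: (want, lookup(name))
        for name, want in expected.items()
        if lookup(name) != want
    }
```

Running `find_mismatches()` returns an empty dictionary when every listed package is installed at exactly the version given above.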