
transformers_issues_topics

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic

# Load the trained model from the Hugging Face Hub
topic_model = BERTopic.load("flumboyantApple/transformers_issues_topics")

# Inspect the discovered topics
topic_model.get_topic_info()

Topic overview

  • Number of topics: 30
  • Number of training documents: 9000
An overview of all topics is given in the table below.
| Topic ID | Topic Keywords | Topic Frequency | Label |
|---|---|---|---|
| -1 | bert - pytorch - tensorflow - pretrained - gpu | 12 | -1_bert_pytorch_tensorflow_pretrained |
| 0 | tokenizer - tokenizers - tokenization - tokenize - encoderdecoder | 2279 | 0_tokenizer_tokenizers_tokenization_tokenize |
| 1 | cuda - pytorch - tensorflow - gpu - gpus | 1830 | 1_cuda_pytorch_tensorflow_gpu |
| 2 | modelcard - modelcards - card - model - cards | 887 | 2_modelcard_modelcards_card_model |
| 3 | seq2seq - s2s - seq2seqtrainer - seq2seqdataset - runseq2seq | 451 | 3_seq2seq_s2s_seq2seqtrainer_seq2seqdataset |
| 4 | trainer - trainertrain - trainers - training - evaluateduringtraining | 445 | 4_trainer_trainertrain_trainers_training |
| 5 | albertbasev2 - albertforpretraining - albert - albertformaskedlm - albertmodel | 435 | 5_albertbasev2_albertforpretraining_albert_albertformaskedlm |
| 6 | gpt2 - gpt2tokenizer - gpt2xl - gpt2tokenizerfast - gpt | 347 | 6_gpt2_gpt2tokenizer_gpt2xl_gpt2tokenizerfast |
| 7 | typos - typo - fix - correction - fixed | 278 | 7_typos_typo_fix_correction |
| 8 | readmemd - readmetxt - readme - file - camembertbasereadmemd | 274 | 8_readmemd_readmetxt_readme_file |
| 9 | t5 - t5model - tf - t5base - t5large | 259 | 9_t5_t5model_tf_t5base |
| 10 | transformerscli - transformers - transformer - importerror - import | 228 | 10_transformerscli_transformers_transformer_importerror |
| 11 | ci - testing - tests - testgeneratefp16 - test | 198 | 11_ci_testing_tests_testgeneratefp16 |
| 12 | longformerforquestionanswering - questionansweringpipeline - tfalbertforquestionanswering - distilbertforquestionanswering - questionanswering | 142 | 12_longformerforquestionanswering_questionansweringpipeline_tfalbertforquestionanswering_distilbertforquestionanswering |
| 13 | pipeline - pipelines - ner - fixpipeline - nerpipeline | 140 | 13_pipeline_pipelines_ner_fixpipeline |
| 14 | longformer - longformers - longform - longformerlayer - longformermodel | 136 | 14_longformer_longformers_longform_longformerlayer |
| 15 | benchmark - benchmarks - accuracy - precision - hardcoded | 113 | 15_benchmark_benchmarks_accuracy_precision |
| 16 | onnx - onnxexport - onnxonnxruntime - onnxruntime - 04onnxexport | 77 | 16_onnx_onnxexport_onnxonnxruntime_onnxruntime |
| 17 | generationbeamsearchpy - generatebeamsearch - beamsearch - nonbeamsearch - beam | 76 | 17_generationbeamsearchpy_generatebeamsearch_beamsearch_nonbeamsearch |
| 18 | flax - flaxelectraformaskedlm - flaxelectraforpretraining - flaxjax - flaxelectramodel | 75 | 18_flax_flaxelectraformaskedlm_flaxelectraforpretraining_flaxjax |
| 19 | datacollatorforlanguagemodelingfile - datacollatorforlanguagemodeling - datacollatorforlanguagemodelling - datacollatorforpermutationlanguagemodeling - runlanguagemodelingpy | 49 | 19_datacollatorforlanguagemodelingfile_datacollatorforlanguagemodeling_datacollatorforlanguagemodelling_datacollatorforpermutationlanguagemodeling |
| 20 | huggingfacetokenizers297 - huggingfacetransformers - huggingface - huggingfaces - huggingfacecn | 43 | 20_huggingfacetokenizers297_huggingfacetransformers_huggingface_huggingfaces |
| 21 | cachedir - cache - cachedpath - caching - cached | 43 | 21_cachedir_cache_cachedpath_caching |
| 22 | notebook - notebooks - blenderbot3b - community - blenderbot | 35 | 22_notebook_notebooks_blenderbot3b_community |
| 23 | wandbproject - ga - wandbcallback - wandb - fork | 33 | 23_wandbproject_ga_wandbcallback_wandb |
| 24 | closed - adding - add - bort - added | 32 | 24_closed_adding_add_bort |
| 25 | electra - electrapretrainedmodel - electraformaskedlm - electralarge - electraformultiplechoice | 27 | 25_electra_electrapretrainedmodel_electraformaskedlm_electralarge |
| 26 | layoutlm - layout - layoutlmtokenizer - layoutlmbaseuncased - tf | 23 | 26_layoutlm_layout_layoutlmtokenizer_layoutlmbaseuncased |
| 27 | pplm - pr - deprecated - variable - ppl | 18 | 27_pplm_pr_deprecated_variable |
| 28 | isort - blackisortflake8 - github - repo - version | 15 | 28_isort_blackisortflake8_github_repo |
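As the Label column shows, BERTopic's default topic names join the topic ID with the topic's top keywords. A minimal sketch of that naming scheme in plain Python (the function name `default_topic_label` is illustrative, not part of the BERTopic API):

```python
def default_topic_label(topic_id, keywords, top_n=4):
    """Build a BERTopic-style default label: '<id>_<kw1>_<kw2>_...'."""
    return "_".join([str(topic_id)] + list(keywords[:top_n]))

# Topic 0's keywords from the table above:
label = default_topic_label(
    0, ["tokenizer", "tokenizers", "tokenization", "tokenize", "encoderdecoder"]
)
print(label)  # 0_tokenizer_tokenizers_tokenization_tokenize
```

This is why a topic's label can be derived directly from its keyword list: only the top few keywords appear in the label.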

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 30
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
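The settings above map directly onto BERTopic's constructor arguments. A sketch of re-creating a model with the same configuration (a configuration fragment only; it assumes a recent BERTopic release that supports the zero-shot parameters, and training would additionally require the original documents):

```python
from bertopic import BERTopic

topic_model = BERTopic(
    calculate_probabilities=False,
    language="english",
    low_memory=False,
    min_topic_size=10,
    n_gram_range=(1, 1),
    nr_topics=30,
    seed_topic_list=None,
    top_n_words=10,
    verbose=True,
    zeroshot_min_similarity=0.7,
    zeroshot_topic_list=None,
)
```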

Framework versions

  • Numpy: 1.23.3
  • HDBSCAN: 0.8.38.post1
  • UMAP: 0.5.6
  • Pandas: 1.5.3
  • Scikit-Learn: 1.1.2
  • Sentence-transformers: 3.0.1
  • Transformers: 4.44.1
  • Numba: 0.60.0
  • Plotly: 5.10.0
  • Python: 3.9.18