CCRO2 / README.md
metadata
library_name: setfit
tags:
  - setfit
  - sentence-transformers
  - text-classification
  - generated_from_setfit_trainer
metrics:
  - accuracy
widget:
  - text: >-
      considering the use of so-called “fractional citations” in which one
      divides the number of citations associated with a given paper by the
      number of authors on that paper [33–38];
  - text: >-
      Indeed, this is only one of a number of such practical inconsistencies
      inherent in the traditional h-index; other similar inconsistencies are
      discussed in Refs. [3, 4].
  - text: >-
      One of the referees recommends mentioning Quesada (2008) as another
      characterization of the Hirsch index relying as well on monotonicity.
  - text: >-
      increasing the weighting of very highly-cited papers, either through the
      introduction of intrinsic weighting factors or the development of entirely
      new indices which mix the h-index with other more traditional indices
      (such as total citation count) [3, 4, 7, 8, 26–32];
pipeline_tag: text-classification
inference: true
base_model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2
model-index:
  - name: SetFit with sentence-transformers/paraphrase-multilingual-mpnet-base-v2
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Unknown
          type: unknown
          split: test
        metrics:
          - type: accuracy
            value: 0.6111111111111112
            name: Accuracy

SetFit with sentence-transformers/paraphrase-multilingual-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-multilingual-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
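The first step can be illustrated with a minimal sketch of how a handful of labeled sentences is expanded into sentence pairs for contrastive fine-tuning. This is not SetFit's actual sampler: the function name `make_contrastive_pairs` and the toy sentences are hypothetical, and the sketch simply enumerates every pair, marking it 1.0 when both sentences share a label and 0.0 otherwise.

```python
from itertools import combinations


def make_contrastive_pairs(examples):
    """Build (text_a, text_b, target) triples from (text, label) examples.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Hypothetical toy data, reusing two label names from this card.
toy_examples = [
    ("The result extends the earlier formula.", "ccro:Extend"),
    ("We generalize the index to weighted citations.", "ccro:Extend"),
    ("The authors discuss citation decay.", "ccro:Discuss"),
]

pairs = make_contrastive_pairs(toy_examples)
for text_a, text_b, target in pairs:
    print(target, "|", text_a, "|", text_b)
```

Because the number of pairs grows much faster than the number of sentences, even a small labeled set yields a sizeable training signal for the embedding model, which is what makes the few-shot setting workable.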

Model Details

Model Description

Model Sources

Model Labels

Label Examples
ccro:BasedOn
  • 'The axiomatizations presented in Quesada (2010, 2011) also dispense with strong monotonicity.'
ccro:Basedon
  • 'A formal mathematical description of the h-index introduced by Hirsch (2005)'
  • 'Woeginger (2008a, b) and Quesada (2009, 2010) have already suggested characterizations of the Hirsch index'
ccro:Compare
  • 'Instead, a variety of studies [8, 9] have shown that the h index by and large agrees with other objective and subjective measures of scientific quality in a variety of different disciplines (10–15),'
ccro:Contrast
  • 'Hirsch (2005) argues that two individuals with similar Hirsch-index are comparable in terms of their overall scientific impact, even if their total number of papers or their total number of citations is very different.'
  • 'The three differ from Woeginger’s (2008a) characterization in requiring fewer axioms (three instead of five)'
  • 'Marchant (2009), instead of characterizing the index itself, characterizes the ranking that the Hirsch index induces on outputs.'
ccro:Criticize
  • 'The h-index does not take into account that some papers may have extraordinarily many citations, and the g-index tries to compensate for this; see also Egghe (2006b) and Tol (2008).'
  • 'Woeginger (2008a, p. 227) stresses that his axioms should be interpreted within the context of MON.'
ccro:Discuss
  • 'The relation between N and h will depend on the detailed form of the particular distribution (HI0501-01)'
  • 'As discussed by Redner (HI0501-03), most papers earn their citations over a limited period of popularity and then they are no longer cited.'
  • 'It is also possible that papers "drop out" and then later come back into the h count, as would occur for the kind of papers termed "sleeping beauties" (HI0501-04).'
ccro:Extend
  • 'In [3] the analogous formula for the g-index has been proved'
ccro:Incorporate
  • 'In this paper, we provide an axiomatic characterization of the Hirsch-index, in very much the same spirit as Arrow (1950, 1951), May (1952), and Moulin (1988) did for numerous other problems in mathematical decision making.'
ccro:Negate
  • 'Recently, Lehmann et al. (2, 3) have argued that the mean number of citations per paper (nc = Nc/Np) is a superior indicator.'
  • 'If one chose instead to use as indicator of scientific achievement the mean number of citations per paper [following Lehmann et al. (2, 3)], our results suggest that (as in the stock market) ‘‘past performance is not predictive of future performance.’’'
  • 'It has been argued in the literature that one drawback of the h index is that it does not give enough ‘‘credit’’ to very highly cited papers, and various modifications have been proposed to correct this, in particular, Egghe’s g index (4), Jin et al.’s AR index (5), and Komulski’s H(2) index (6).'

Evaluation

Metrics

| Label | Accuracy |
|:------|:---------|
| all   | 0.6111   |
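Accuracy here is the fraction of test sentences whose predicted label equals the reference label. The sketch below shows the calculation on hypothetical labels. The card does not state the size of the test split, so the toy data (11 correct out of 18) is only one ratio that rounds to the reported 0.6111, not a claim about the real evaluation set.

```python
def accuracy(predicted, reference):
    """Fraction of positions where the predicted label matches the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Hypothetical labels: 18 test items, 11 of them predicted correctly.
reference = ["ccro:Discuss"] * 18
predicted = ["ccro:Discuss"] * 11 + ["ccro:Compare"] * 7

score = accuracy(predicted, reference)
print(round(score, 4))
```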

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Corran/CCRO2")
# Run inference
preds = model("One of the referees recommends mentioning Quesada (2008) as another characterization of the Hirsch index relying as well on monotonicity.")
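When classifying several citation sentences at once, it can be convenient to group the inputs by predicted label. The sketch below assumes only a callable that maps a list of strings to one label per string. Both `group_by_label` and `stub_predict` are hypothetical names, and the stub stands in for the model so that the example runs without downloading anything.

```python
from collections import defaultdict


def group_by_label(predict_fn, sentences):
    """Run predict_fn on all sentences and bucket them by predicted label."""
    labels = predict_fn(sentences)
    grouped = defaultdict(list)
    for sentence, label in zip(sentences, labels):
        grouped[str(label)].append(sentence)
    return dict(grouped)


def stub_predict(sentences):
    """Hypothetical stand-in for the model: a trivial keyword rule."""
    return [
        "ccro:Extend" if "extend" in s.lower() else "ccro:Discuss"
        for s in sentences
    ]


sentences = [
    "We extend the formula proved in [3].",
    "Citation counts are discussed in Section 2.",
]
grouped = group_by_label(stub_predict, sentences)
print(grouped)
```

With the real model loaded as shown earlier, `model.predict` could be passed as `predict_fn`, assuming it accepts a list of strings and returns one label per input.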

Training Details

Training Set Metrics

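| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 6   | 25.7812 | 53  |

| Label            | Training Sample Count |
|:-----------------|:----------------------|
| ccro:BasedOn     | 1                     |
| ccro:Basedon     | 11                    |
| ccro:Compare     | 21                    |
| ccro:Contrast    | 3                     |
| ccro:Criticize   | 4                     |
| ccro:Discuss     | 37                    |
| ccro:Extend      | 1                     |
| ccro:Incorporate | 14                    |
| ccro:Negate      | 4                     |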
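The sample counts are strongly imbalanced, which is worth keeping in mind when reading the overall accuracy. The short script below, using only the counts listed in this card, totals the training set and prints each label's share. Note that ccro:BasedOn and ccro:Basedon differ only in capitalization but are listed as separate labels.

```python
# Training sample counts as listed in this card.
label_counts = {
    "ccro:BasedOn": 1,
    "ccro:Basedon": 11,
    "ccro:Compare": 21,
    "ccro:Contrast": 3,
    "ccro:Criticize": 4,
    "ccro:Discuss": 37,
    "ccro:Extend": 1,
    "ccro:Incorporate": 14,
    "ccro:Negate": 4,
}

total = sum(label_counts.values())
shares = {label: count / total for label, count in label_counts.items()}

print("total training samples:", total)
# Largest classes first.
for label, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{label:<18}{share:6.1%}")
```

The total comes to 96 samples, of which ccro:Discuss alone accounts for 37, while two labels have a single training example each.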

Training Hyperparameters

  • batch_size: (60, 60)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 100
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
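The `loss: CosineSimilarityLoss` setting means that each training pair is scored by the squared difference between the cosine similarity of its two sentence embeddings and a target value (1.0 for a same-label pair, 0.0 otherwise). The sketch below reproduces that per-pair calculation in plain Python. The three-dimensional vectors and function names are hypothetical, and the real loss is computed on batches of model embeddings rather than hand-written lists.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(u, v, target):
    """Squared error between the pair's cosine similarity and its target."""
    return (cosine_similarity(u, v) - target) ** 2


# Hypothetical toy embeddings.
emb_a = [1.0, 0.0, 0.0]
emb_b = [1.0, 0.0, 0.0]
emb_c = [0.0, 1.0, 0.0]

loss_same = cosine_similarity_loss(emb_a, emb_b, 1.0)   # identical vectors, same label
loss_clash = cosine_similarity_loss(emb_a, emb_b, 0.0)  # identical vectors, different labels
loss_apart = cosine_similarity_loss(emb_a, emb_c, 0.0)  # orthogonal vectors, different labels

print(loss_same, loss_clash, loss_apart)
```

The loss is zero when same-label sentences already point in the same direction or different-label sentences are orthogonal, and it is largest when sentences with different labels have identical embeddings.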

Training Results

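| Epoch  | Step | Training Loss | Validation Loss |
|:-------|:-----|:--------------|:----------------|
| 0.0042 | 1    | 0.2507        | -               |
| 0.2083 | 50   | 0.0639        | -               |
| 0.4167 | 100  | 0.0017        | -               |
| 0.625  | 150  | 0.0016        | -               |
| 0.8333 | 200  | 0.0059        | -               |
| 0.0031 | 1    | 0.0051        | -               |
| 0.1562 | 50   | 0.0005        | -               |
| 0.3125 | 100  | 0.001         | -               |
| 0.4688 | 150  | 0.0001        | -               |
| 0.625  | 200  | 0.0           | -               |
| 0.7812 | 250  | 0.0           | -               |
| 0.9375 | 300  | 0.0001        | -               |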
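The epoch column resets at the sixth row, which suggests that the table combines two separate training logs. Dividing step by epoch gives an estimate of the steps per epoch for each block. The cross-check in `expected_steps` rests on an assumption that is not stated in this card: that every iteration produces two sentence pairs per training sample, so that the number of pairs is 2 × num_iterations × number of samples.

```python
import math

# (epoch, step) rows copied from the training results table.
first_block = [(0.2083, 50), (0.4167, 100), (0.625, 150), (0.8333, 200)]
second_block = [(0.1562, 50), (0.3125, 100), (0.4688, 150), (0.625, 200),
                (0.7812, 250), (0.9375, 300)]


def steps_per_epoch(rows):
    """Estimate steps per epoch from the last logged (epoch, step) row."""
    epoch, step = rows[-1]
    return round(step / epoch)


def expected_steps(n_samples, num_iterations=100, batch_size=60):
    """Steps per epoch if each iteration yields 2 pairs per sample (assumption)."""
    n_pairs = 2 * num_iterations * n_samples
    return math.ceil(n_pairs / batch_size)


first_estimate = steps_per_epoch(first_block)
second_estimate = steps_per_epoch(second_block)
print(first_estimate, second_estimate, expected_steps(96))
```

Under that assumption, the second block's estimate matches the 96 training samples listed in this card at batch size 60, while the first block's smaller estimate would correspond to a smaller training set. The card itself does not say why two logs are present.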

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 2.2.2
  • Transformers: 4.35.2
  • PyTorch: 2.1.0+cu121
  • Datasets: 2.16.1
  • Tokenizers: 0.15.0

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}