Behnamm committed on
Commit
ba78a38
1 Parent(s): 04eb2c0

Update src/about.py

  1. src/about.py +2 -2
src/about.py CHANGED
@@ -72,8 +72,8 @@ We use the given *test* subset (for those benchmarks that also have *train* and
  These benchmarks are picked for now, but several other benchmarks are going to be added later to help us perform a more thorough examination of models.
 
  The last two benchmarks, ParsiNLU NLI and ParsiNLU QQP are evaluated in different few-shot settings and then the maximum score is returned as the final evaluation.
- We argue that this is indeed a fair evaluation scheme since many light-weight models (around ~7B and less) can have a poor in-context learning and thus perform better
- in small shots (or have a small knowledge capacity and perform poorly in zero-shot). We wish to not hold this against the model by trying to measure performances in different settings and take the maximum score achieved .
+ We argue that this is indeed a fair evaluation scheme since many light-weight models (around ~7B and less) can have a poor in-context learning in long-context prompts and thus perform better
+ in smaller shots (or have a small knowledge capacity and perform poorly in zero-shot). We wish to not hold this against the model by trying to measure performances in different settings and take the maximum score achieved .
 
  ## REPRODUCIBILITY
  The parameters used for evaluation along with instructions and prompts will be available once the framework is released. (TO BE COMPLETED)
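
The changed paragraph describes scoring ParsiNLU NLI and ParsiNLU QQP under several few-shot settings and reporting the maximum. Since the evaluation framework has not been released, the following is only a minimal sketch of that selection step, not the leaderboard's actual code: the function names `final_score` and `best_setting`, the shot counts, and the scores are all hypothetical and chosen purely for illustration.

```python
from typing import Dict


def final_score(scores_by_shots: Dict[int, float]) -> float:
    """Return the highest score observed across the few-shot settings."""
    if not scores_by_shots:
        raise ValueError("no few-shot settings were evaluated")
    return max(scores_by_shots.values())


def best_setting(scores_by_shots: Dict[int, float]) -> int:
    """Return the number of shots that produced the highest score."""
    if not scores_by_shots:
        raise ValueError("no few-shot settings were evaluated")
    return max(scores_by_shots, key=lambda shots: scores_by_shots[shots])


# Invented numbers: keys are shot counts, values are accuracies.
# A small model might peak at a moderate shot count and drop again
# once the prompt gets long.
example_scores = {0: 0.41, 1: 0.47, 3: 0.52, 5: 0.49}

print(final_score(example_scores))   # reported score for the benchmark
print(best_setting(example_scores))  # setting that achieved it
```

Reporting the setting alongside the score, as `best_setting` does here, would also make the result easier to reproduce, although the description above does not say whether the leaderboard records it.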