rasyosef committed on
Commit 0344e91
1 Parent(s): 0fb8040

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -33,7 +33,7 @@ The model has underwent a post-training process that incorporates both **supervi
 ### Chat Format
 
 Given the nature of the training data, the Phi-1.5 Instruct model is best suited for prompts using the chat format as follows.
-You can provide the prompt as a question with a generic template as follow:
+You can provide the prompt as a question with a generic template as follows:
 ```markdown
 <|im_start|>system
 You are a helpful assistant.<|im_end|>
@@ -99,7 +99,7 @@ Note: If you want to use flash attention, call _AutoModelForCausalLM.from_pretra
 
 ## Benchmarks
 
-This model outperforms HuggingFace's SmolLM-1.7B-Instruct and the TinyLlama-1.1B-Chat-v1.0 models on IFEval and GSM8K benchmarks. These benchmarks were run using EleutherAI's [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
+This model outperforms HuggingFace's SmolLM-1.7B-Instruct and the TinyLlama-1.1B-Chat-v1.0 models on **all 5** of the following benchmarks. These benchmarks were run using EleutherAI's [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
 
 - **IFEval (Instruction Following Evaluation)**: IFEval is a fairly interesting dataset that tests the capability of models to clearly follow explicit instructions, such as “include keyword x” or “use format y”. The models are tested on their ability to strictly follow formatting instructions rather than the actual contents generated, allowing strict and rigorous metrics to be used.
 - **GSM8k (5-shot)**: diverse grade school math word problems to measure a model's ability to solve multi-step mathematical reasoning problems.
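For readers landing on this commit, the chat format touched by the first hunk can be exercised with the transformers library. The sketch below is illustrative only: the repository id is a placeholder (this page does not name the repo path), and it assumes the checkpoint ships a chat template matching the <|im_start|>/<|im_end|> format above. The commented-out flash-attention kwarg mirrors the note quoted in the second hunk header.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rasyosef/phi-1_5-instruct"  # placeholder; substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    # attn_implementation="flash_attention_2",  # per the note quoted in the
    # second hunk header; requires flash-attn and a supported GPU
)

# Build a prompt in the <|im_start|> chat format shown in the diff.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

With `add_generation_prompt=True`, `apply_chat_template` appends the assistant turn opener, so generation starts in the assistant role rather than continuing the user message.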
 
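The benchmark sentence edited by the second hunk cites EleutherAI's lm-evaluation-harness. As a hedged sketch of how such numbers are typically produced with the harness's Python API (the exact harness version, the remaining tasks, and the generation settings used for the README are not stated in this commit; the model id is again a placeholder):

```python
from lm_eval import evaluator

# GSM8K at 5 shots, as described above. IFEval would be run separately:
# it is a zero-shot instruction-following benchmark, so num_fewshot=5
# should not be applied to it.
results = evaluator.simple_evaluate(
    model="hf",
    model_args="pretrained=rasyosef/phi-1_5-instruct",  # placeholder repo id
    tasks=["gsm8k"],
    num_fewshot=5,
    batch_size=8,
)
print(results["results"]["gsm8k"])
```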