--- license: llama2 model-index: - name: llama2-13b-ft-openllm-leaderboard-v1 results: - task: type: text-generation name: Text Generation dataset: name: AI2 Reasoning Challenge (25-Shot) type: ai2_arc config: ARC-Challenge split: test args: num_few_shot: 25 metrics: - type: acc_norm value: 59.64 name: normalized accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=zyh3826/llama2-13b-ft-openllm-leaderboard-v1 name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: HellaSwag (10-Shot) type: hellaswag split: validation args: num_few_shot: 10 metrics: - type: acc_norm value: 83.14 name: normalized accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=zyh3826/llama2-13b-ft-openllm-leaderboard-v1 name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MMLU (5-Shot) type: cais/mmlu config: all split: test args: num_few_shot: 5 metrics: - type: acc value: 60.93 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=zyh3826/llama2-13b-ft-openllm-leaderboard-v1 name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: TruthfulQA (0-shot) type: truthful_qa config: multiple_choice split: validation args: num_few_shot: 0 metrics: - type: mc2 value: 40.72 source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=zyh3826/llama2-13b-ft-openllm-leaderboard-v1 name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: Winogrande (5-shot) type: winogrande config: winogrande_xl split: validation args: num_few_shot: 5 metrics: - type: acc value: 77.35 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=zyh3826/llama2-13b-ft-openllm-leaderboard-v1 name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: GSM8k (5-shot) type: gsm8k config: main split: test args: num_few_shot: 5 metrics: - type: acc value: 1.36 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=zyh3826/llama2-13b-ft-openllm-leaderboard-v1 name: Open LLM Leaderboard --- finetue LLAMA2-13B myself 20231024-172733 # Model Details + Developed by: zyh3826 + Backbone Model: llama-2-13B + Library: HuggingFace Transformers # Limitations & Biases: Llama2 and fine-tuned variants are a new technology that carries risks with use. Testing conducted to date has been in English, and has not covered, nor could it cover all scenarios. For these reasons, as with all LLMs, Llama 2 and any fine-tuned varient's potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 2 variants, developers should perform safety testing and tuning tailored to their specific applications of the model. Please see the Responsible Use Guide available at https://ai.meta.com/llama/responsible-use-guide/ # License Disclaimer: This model is bound by the license & usage restrictions of the original Llama-2 model. And comes with no warranty or gurantees of any kind. # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_zyh3826__llama2-13b-ft-openllm-leaderboard-v1) | Metric |Value| |---------------------------------|----:| |Avg. |53.86| |AI2 Reasoning Challenge (25-Shot)|59.64| |HellaSwag (10-Shot) |83.14| |MMLU (5-Shot) |60.93| |TruthfulQA (0-shot) |40.72| |Winogrande (5-shot) |77.35| |GSM8k (5-shot) | 1.36|