Model details
Motivation
This model contains the fine-tuned weights from liuhaotian/llava-v1.5-7b so that LLM benchmarking can be done.
Model type: LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model, based on the transformer architecture.
License
Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved.
Training dataset
- 558K filtered image-text pairs from LAION/CC/SBU, captioned by BLIP.
- 158K GPT-generated multimodal instruction-following data.
- 450K academic-task-oriented VQA data mixture.
- 40K ShareGPT data.
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 52.28 |
| AI2 Reasoning Challenge (25-shot) | 52.65 |
| HellaSwag (10-shot) | 76.09 |
| MMLU (5-shot) | 51.68 |
| TruthfulQA (0-shot) | 45.86 |
| Winogrande (5-shot) | 72.06 |
| GSM8k (5-shot) | 15.31 |
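
As a quick sanity check on the table above, the "Avg." row can be reproduced as the unweighted mean of the six benchmark scores. The sketch below assumes that this is how the average is defined; the dictionary keys are shortened labels chosen for illustration, and the values are copied from the table.

```python
# Scores copied from the results table (percent).
# Keys are illustrative short labels, not official benchmark identifiers.
scores = {
    "ARC (25-shot)": 52.65,
    "HellaSwag (10-shot)": 76.09,
    "MMLU (5-shot)": 51.68,
    "TruthfulQA (0-shot)": 45.86,
    "Winogrande (5-shot)": 72.06,
    "GSM8k (5-shot)": 15.31,
}

# Assumption: "Avg." is the plain arithmetic mean over the six tasks.
average = sum(scores.values()) / len(scores)

print(f"Mean over {len(scores)} tasks: {average:.3f}")
```

The mean of the rounded per-task scores is 52.275, which agrees with the reported 52.28 to within rounding; a small difference is expected if the reported figure was averaged from unrounded scores.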
Evaluation results
- AI2 Reasoning Challenge (25-shot), test set: normalized accuracy 52.65 (Open LLM Leaderboard)
- HellaSwag (10-shot), validation set: normalized accuracy 76.09 (Open LLM Leaderboard)
- MMLU (5-shot), test set: accuracy 51.68 (Open LLM Leaderboard)
- TruthfulQA (0-shot), validation set: mc2 45.86 (Open LLM Leaderboard)
- Winogrande (5-shot), validation set: accuracy 72.06 (Open LLM Leaderboard)
- GSM8k (5-shot), test set: accuracy 15.31 (Open LLM Leaderboard)