metadata
title: MultiModal Phi2
emoji: π
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 3.35.2
app_file: app.py
pinned: false
license: mit
Phi2: Multimodal Finetuning
Details
- LLM Backbone: Phi2
- Vision Tower: clip-vit-large-patch14-336
- Audio Model: Whisper
- Pretraining Dataset: LAION-CC-SBU dataset with BLIP captions (200k samples)
- Finetuning Dataset: Instruct 150k dataset based on COCO
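
To make the component list concrete, here is a minimal sketch of the tensor shapes involved when image features are passed to the LLM. It assumes a LLaVA-style setup in which a projection layer maps vision features into the LLM embedding space; this README does not confirm that design. The image size (336) and patch size (14) are read from the vision tower's name. The hidden sizes (1024 for CLIP ViT-L, 2560 for Phi2) come from the published model configs, not from this README. `VisionTowerSpec`, `projector_weight_shape`, and `projected_image_tokens` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisionTowerSpec:
    """Shape facts for clip-vit-large-patch14-336 (hypothetical helper)."""
    image_size: int = 336    # from the "-336" suffix of the model name
    patch_size: int = 14     # from "patch14" in the model name
    hidden_size: int = 1024  # CLIP ViT-L vision width (published config)

    @property
    def num_patches(self) -> int:
        # The image is cut into a square grid of non-overlapping patches.
        side = self.image_size // self.patch_size
        return side * side


# Phi2 hidden size from the published model config (not stated in this README).
PHI2_HIDDEN_SIZE = 2560


def projector_weight_shape(vision: VisionTowerSpec,
                           llm_hidden: int = PHI2_HIDDEN_SIZE) -> tuple:
    # Assumption: a single linear projection from vision width to LLM width.
    return (vision.hidden_size, llm_hidden)


def projected_image_tokens(vision: VisionTowerSpec,
                           llm_hidden: int = PHI2_HIDDEN_SIZE) -> tuple:
    # Assumption: one LLM input token per image patch, class token dropped.
    return (vision.num_patches, llm_hidden)


if __name__ == "__main__":
    tower = VisionTowerSpec()
    print("patches per image:", tower.num_patches)
    print("projector weight shape:", projector_weight_shape(tower))
    print("image tokens fed to the LLM:", projected_image_tokens(tower))
```

Under these assumptions, each image contributes a fixed number of tokens to the LLM context, which matters when budgeting sequence length for the instruction-tuning data.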