ggcristian committed on
Commit
8a77abc
·
verified ·
1 Parent(s): 90bab47

Create README.md

Files changed (1)
  1. README.md +104 -0
README.md ADDED
@@ -0,0 +1,104 @@
---
language:
- en
base_model:
- openai/clip-vit-large-patch14
- microsoft/phi-2
pipeline_tag: image-classification
tags:
- emotion
- visual emotion recognition
- affective computing
- emotional classification
- metric learning
---

# TinyEmo-CLIP-Phi-2

[TinyEmo GitHub repo](https://github.com/ggcr/TinyEmo)

[Metric Projector Card] [TinyEmo MM-LLM Card]

[[Reasoning Pre-training Dataset]](https://huggingface.co/datasets/ggcristian/TinyEmo-Pretrain-525k) [[Reasoning Fine-tuning Dataset]](https://huggingface.co/datasets/ggcristian/TinyEmo-EmoReason-175k) [[Reasoning Claude Dataset]](https://huggingface.co/datasets/ggcristian/TinyEmo-EmoReasonHQ-Claude-1.4k)

TinyEmo is a family of small multi-modal language models for emotional reasoning and classification. Our
approach features: (1) a synthetic emotional instruct dataset for both the pre-training and fine-tuning stages, (2) a Metric Projector
that takes over classification from the language model, allowing for more efficient training and inference, (3) a multi-modal large
language model (MM-LLM) for emotional reasoning, and (4) a semi-automated framework for bias detection. TinyEmo
performs emotion classification and emotional reasoning while using substantially fewer parameters than comparable models.
This efficiency allows us to freely incorporate more diverse emotional datasets, enabling strong performance on classification tasks:
our smallest model (700M parameters) outperforms larger state-of-the-art models based on general-purpose MM-LLMs
with over 7B parameters. Additionally, the Metric Projector allows for interpretability and indirect bias detection in large models
without additional training, offering an approach to understand and improve AI systems.
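
This README does not spell out how the Metric Projector classifies an image. As a rough illustration only, the sketch below assumes one common metric-learning setup: vision features are mapped by a linear projection into a shared space and assigned to the label whose embedding is closest by cosine similarity. The function names, the toy weights, and the two-label set are all hypothetical and are not taken from the TinyEmo codebase.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def project(features, weights):
    """Apply a linear projection; `weights` holds one row per output dimension."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def classify(image_features, weights, label_embeddings):
    """Return the label whose embedding is most similar to the projected features."""
    projected = project(image_features, weights)
    return max(
        label_embeddings,
        key=lambda name: cosine(projected, label_embeddings[name]),
    )


# Toy values made up for illustration (assumption: 3-dim features, 2-dim shared space).
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
LABELS = {"joy": [1.0, 0.0], "fear": [0.0, 1.0]}

prediction = classify([0.9, 0.1, 0.5], W, LABELS)
print(prediction)
```

In a setup like this, classification needs no text generation from the language model, which is one plausible reading of why delegating it to a projector can make training and inference cheaper.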

## Installation and Requirements

1. Clone this repository and navigate to the root of the project:
```bash
git clone https://github.com/ggcr/TinyEmo.git
cd TinyEmo
```

2. Create an environment and install dependencies:
```bash
conda create -n projector_mps python=3.10 -y
conda activate projector_mps
pip install --upgrade pip  # enable PEP 660 support
pip install -e projector_mps/.
```

## Quickstart

### Metric Projector inference

We provide precomputed CLIP features for the Emotion6 dataset, and you can evaluate them using two methods:

#### Our Projectors from Hugging Face

To evaluate the projectors from Hugging Face, use the [scripts/eval.sh](https://github.com/ggcr/TinyEmo/blob/main/projector_mps/scripts/eval.sh) script:

```bash
conda activate projector_mps
bash projector_mps/scripts/eval.sh
```

Below is a table of the available projectors:

| Model Architecture              | Parameters | Zero-shot Accuracy | Hugging Face Link |
|---------------------------------|------------|--------------------|-------------------|
| CLIP ViT-L/14 + OpenELM-270M-I  | 0.70B      | 57.87%             | [HF Projector 0.70B Link](https://huggingface.co/ggcristian/TinyEmo-CLIP-OpenELM-270M) |
| CLIP ViT-L/14 + OpenELM-450M-I  | 0.88B      | 55.24%             | [HF Projector 0.88B Link](https://huggingface.co/ggcristian/TinyEmo-CLIP-OpenELM-450M) |
| CLIP ViT-L/14 + TinyLLaMA 1.1   | 1.53B      | 56.13%             | [HF Projector 1.53B Link](https://huggingface.co/ggcristian/TinyEmo-CLIP-TinyLlama-1_1-Syn) |
| CLIP ViT-L/14 + Microsoft Phi 2 | 3.21B      | 56.28%             | [HF Projector 3.21B Link](https://huggingface.co/ggcristian/TinyEmo-CLIP-Phi-2) |
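
When choosing among these projectors, it can help to filter by a parameter budget. The snippet below is a small convenience sketch, not part of the TinyEmo repository: it copies the parameter counts, zero-shot accuracies, and repository IDs from the table above, and the helper name `best_under_budget` is hypothetical.

```python
# Values copied from the projector table; sizes are in billions of parameters.
PROJECTORS = [
    {"name": "CLIP ViT-L/14 + OpenELM-270M-I", "params_b": 0.70,
     "accuracy": 57.87, "repo": "ggcristian/TinyEmo-CLIP-OpenELM-270M"},
    {"name": "CLIP ViT-L/14 + OpenELM-450M-I", "params_b": 0.88,
     "accuracy": 55.24, "repo": "ggcristian/TinyEmo-CLIP-OpenELM-450M"},
    {"name": "CLIP ViT-L/14 + TinyLLaMA 1.1", "params_b": 1.53,
     "accuracy": 56.13, "repo": "ggcristian/TinyEmo-CLIP-TinyLlama-1_1-Syn"},
    {"name": "CLIP ViT-L/14 + Microsoft Phi 2", "params_b": 3.21,
     "accuracy": 56.28, "repo": "ggcristian/TinyEmo-CLIP-Phi-2"},
]


def best_under_budget(max_params_b):
    """Return the most accurate projector at or below the budget, or None."""
    candidates = [p for p in PROJECTORS if p["params_b"] <= max_params_b]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p["accuracy"])


choice = best_under_budget(1.0)
print(choice["repo"] if choice else "no projector fits this budget")
```

By the table's numbers the smallest projector also has the highest zero-shot accuracy, so any budget of 0.70B or more selects it.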

#### Custom Projectors with Local Weights

To use custom local weights or models, run the following:

```bash
conda activate projector_mps
bash projector_mps/scripts/eval_custom.sh
```

This allows you to specify different vision encoders, language models, and loss functions, as well as use your own projector weights.

## Acknowledgement

The Metric Projector builds on the foundations of the [CLIP-E](https://arxiv.org/abs/2310.12062) paper.

Our codebase for the MM-LLM is forked from the [TinyLLaVA](https://github.com/TinyLLaVA/TinyLLaVA_Factory) project.

## Citation

```
@mastersthesis{gutierrez2024tinyemo,
  title = {TinyEmo: Scaling down Emotional Reasoning via Metric Projection},
  author = {Cristian Gutierrez},
  year = 2024,
  month = {September},
  address = {Barcelona, Spain},
  school = {Universitat Autònoma de Barcelona (UAB)},
  type = {Master's thesis}
}
```