wchai committed on
Commit
49ab358
1 Parent(s): 07050f6

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +128 -0
README.md ADDED
@@ -0,0 +1,128 @@
+ ---
+ license: apache-2.0
+ datasets:
+ - Reself/AuroraCap-trainset
+ base_model:
+ - lmsys/vicuna-7b-v1.5-16k
+ tags:
+ - caption
+ model-index:
+ - name: AuroraCap-7B
+   results:
+   - task:
+       type: video detailed caption
+     dataset:
+       type: VDC
+       name: VDC
+     metrics:
+     - type: Acc
+       value: 38.21
+       name: VDCScore
+     - type: Acc
+       value: 48.33
+       name: VDD
+     - type: cider
+       value: 9.51
+     - type: bleu
+       value: 30.90
+       name: bleu@1
+     - type: bleu
+       value: 4.06
+       name: bleu@4
+     - type: meteor
+       value: 19.09
+     - type: rouge
+       value: 21.58
+       name: rouge-l
+   - task:
+       type: video caption
+     dataset:
+       type: MSR-VTT
+       name: MSR-VTT
+     metrics:
+     - type: cider
+       value: 33.1
+     - type: bleu
+       value: 58.6
+       name: bleu@1
+     - type: bleu
+       value: 21.0
+       name: bleu@4
+     - type: meteor
+       value: 23.9
+     - type: rouge
+       value: 49.5
+       name: rouge-l
+   - task:
+       type: video caption
+     dataset:
+       type: VATEX
+       name: VATEX
+     metrics:
+     - type: cider
+       value: 33.8
+     - type: bleu
+       value: 57.1
+       name: bleu@1
+     - type: bleu
+       value: 18.4
+       name: bleu@4
+     - type: meteor
+       value: 19.0
+     - type: rouge
+       value: 40.8
+       name: rouge-l
+   - task:
+       type: video question answering
+     dataset:
+       type: ActivityNet
+       name: ActivityNet
+     metrics:
+     - type: Acc
+       value: 61.8
+   - task:
+       type: video question answering
+     dataset:
+       type: MSVD
+       name: MSVD
+     metrics:
+     - type: Acc
+       value: 62.6
+   - task:
+       type: video question answering
+     dataset:
+       type: MSR-VTT
+       name: MSR-VTT
+     metrics:
+     - type: Acc
+       value: 43.5
+   - task:
+       type: video question answering
+     dataset:
+       type: iVQA
+       name: iVQA
+     metrics:
+     - type: Acc
+       value: 55.2
+ ---
+
+ <img src="assets/teaser.png" align="center">
+
+ ## Resources
+
+ - [Website](https://rese1f.github.io/aurora-web/)
+ - [arXiv: Paper]()
+ - [GitHub: Code](https://github.com/rese1f/aurora)
+ - [Huggingface: AuroraCap Model](https://huggingface.co/collections/Reself/auroracap-66d117ffe13bedda96702013)
+ - [Huggingface: VDC Benchmark](https://huggingface.co/datasets/Reself/Video-Detailed-Caption)
+ - [Huggingface: Trainset](https://huggingface.co/datasets/Reself/AuroraCap-trainset)
+
+ ## Features
+
+ <img src="assets/vdc_baseline.png" align="center">
+
+ AuroraCap is a multimodal large language model for image and video captioning.
+
+ ## Quick Start
+ See [Docs](https://github.com/rese1f/aurora/blob/main/docs/auroracap/README.md).
+ ## Citation