Konstantinos Kakkavas's picture

3 9

Konstantinos Kakkavas

kkakkavas

kkakkavas

AI & ML interests

- NLP - CV - docVQA

Recent Activity

upvoted a paper 2 days ago

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

upvoted a paper 2 days ago

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

liked a model about 2 months ago

Xkev/Llama-3.2V-11B-cot

View all activity

Organizations

kkakkavas's activity

upvoted 2 papers 2 days ago

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published 9 days ago • 91

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Paper • 2501.03895 • Published 3 days ago • 37

upvoted a collection 7 months ago

Table Transformer

The Table Transformer (TATR) is a series of object detection models useful for table extraction from PDF images. • 5 items • Updated 2 days ago • 20