David Kasakaitis

dkasa

https://dkasa.dev

ddkasa

AI & ML interests

Reinforcement Learning & Autonomous Agents

Organizations

None yet

dkasa's activity

upvoted a paper 6 days ago

A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents

Paper • 2410.22476 • Published 9 days ago • 24

upvoted a paper 10 days ago

ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting

Paper • 2410.17856 • Published 16 days ago • 48

upvoted a paper 12 days ago

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published 17 days ago • 86

upvoted a paper 13 days ago

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

Paper • 2410.17247 • Published 17 days ago • 43

upvoted 3 papers 20 days ago

Harnessing Webpage UIs for Text-Rich Visual Understanding

Paper • 2410.13824 • Published 22 days ago • 29

Revealing the Barriers of Language Agents in Planning

Paper • 2410.12409 • Published 23 days ago • 23

Large Language Model Evaluation via Matrix Nuclear-Norm

Paper • 2410.10672 • Published 25 days ago • 18

upvoted a paper 23 days ago

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

Paper • 2410.09732 • Published 26 days ago • 54

upvoted 2 papers 24 days ago

Mechanistic Permutability: Match Features Across Layers

Paper • 2410.07656 • Published 29 days ago • 16

Baichuan-Omni Technical Report

Paper • 2410.08565 • Published 28 days ago • 82

upvoted a paper 27 days ago

MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents

Paper • 2410.03450 • Published Oct 4 • 32

upvoted 3 papers 29 days ago

Only-IF: Revealing the Decisive Effect of Instruction Diversity on Generalization

Paper • 2410.04717 • Published Oct 7 • 17

A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation

Paper • 2410.01912 • Published Oct 2 • 13

Differential Transformer

Paper • 2410.05258 • Published Oct 7 • 165

upvoted 3 papers about 1 month ago

Imagine yourself: Tuning-Free Personalized Image Generation

Paper • 2409.13346 • Published Sep 20 • 67

YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models

Paper • 2409.13592 • Published Sep 20 • 48

Prithvi WxC: Foundation Model for Weather and Climate

Paper • 2409.13598 • Published Sep 20 • 37

upvoted 3 papers about 2 months ago

Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization

Paper • 2409.12903 • Published Sep 19 • 21

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19 • 133

LLMs + Persona-Plug = Personalized LLMs

Paper • 2409.11901 • Published Sep 18 • 30