oaishi's Collections
Interesting Papers
Paper • 2309.03179 • Published • 29
Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning
Paper • 2309.02591 • Published • 14
Efficient Memory Management for Large Language Model Serving with PagedAttention
Paper • 2309.06180 • Published • 25
LEAP Hand: Low-Cost, Efficient, and Anthropomorphic Hand for Robot Learning
Paper • 2309.06440 • Published • 9
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation
Paper • 2309.06380 • Published • 32
PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models
Paper • 2309.05793 • Published • 50
AstroLLaMA: Towards Specialized Foundation Models in Astronomy
Paper • 2309.06126 • Published • 16
Stabilizing RLHF through Advantage Model and Selective Rehearsal
Paper • 2309.10202 • Published • 9
Multimodal Foundation Models: From Specialists to General-Purpose Assistants
Paper • 2309.10020 • Published • 40
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
Paper • 2309.10150 • Published • 24
A Large-scale Dataset for Audio-Language Representation Learning
Paper • 2309.11500 • Published • 9
LMDX: Language Model-based Document Information Extraction and Localization
Paper • 2309.10952 • Published • 63
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset
Paper • 2309.11998 • Published • 24
Calibrating LLM-Based Evaluator
Paper • 2309.13308 • Published • 11
Exploring Large Language Models' Cognitive Moral Development through Defining Issues Test
Paper • 2309.13356 • Published • 36
PaLI-3 Vision Language Models: Smaller, Faster, Stronger
Paper • 2310.09199 • Published • 24
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Paper • 2310.11511 • Published • 74
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Paper • 2310.13671 • Published • 18
TiC-CLIP: Continual Training of CLIP Models
Paper • 2310.16226 • Published • 8
RoboVQA: Multimodal Long-Horizon Reasoning for Robotics
Paper • 2311.00899 • Published • 7
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation
Paper • 2311.01455 • Published • 28
FlashDecoding++: Faster Large Language Model Inference on GPUs
Paper • 2311.01282 • Published • 35
OtterHD: A High-Resolution Multi-modality Model
Paper • 2311.04219 • Published • 31
PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers
Paper • 2311.09180 • Published • 7
Faithful Persona-based Conversational Dataset Generation with Large Language Models
Paper • 2312.10007 • Published • 6
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
Paper • 2312.09390 • Published • 32
StarVector: Generating Scalable Vector Graphics Code from Images
Paper • 2312.11556 • Published • 27
AppAgent: Multimodal Agents as Smartphone Users
Paper • 2312.13771 • Published • 51
Taiyi-Diffusion-XL: Advancing Bilingual Text-to-Image Generation with Large Vision-Language Model Support
Paper • 2401.14688 • Published • 13
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities
Paper • 2401.15071 • Published • 34
Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception
Paper • 2401.16158 • Published • 17
StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
Paper • 2402.16671 • Published • 26
MyVLM: Personalizing VLMs for User-Specific Queries
Paper • 2403.14599 • Published • 15
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Paper • 2404.07972 • Published • 43
The Rise and Potential of Large Language Model Based Agents: A Survey
Paper • 2309.07864 • Published • 5
The Prompt Report: A Systematic Survey of Prompting Techniques
Paper • 2406.06608 • Published • 52
INTRA: Interaction Relationship-aware Weakly Supervised Affordance Grounding
Paper • 2409.06210 • Published • 24
Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance
Paper • 2409.04593 • Published • 19
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale
Paper • 2409.08264 • Published • 39