hllj (Bui Van Hop)

upvoted a collection about 2 months ago

Research projects on top of vLLM

Collection

Papers cited in https://blog.vllm.ai/2024/07/25/lfai-perf.html • 6 items • Updated Jul 29 • 12

upvoted an article about 2 months ago

Article

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

By

•

Jul 29

• 193

upvoted a collection about 2 months ago

Vista

Collection

A family of Vietnamese Vision Language Model • 4 items • Updated Jun 30 • 2

upvoted 8 papers 2 months ago

Understanding Alignment in Multimodal LLMs: A Comprehensive Study

Paper • 2407.02477 • Published Jul 2 • 21

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems

Paper • 2407.01370 • Published Jul 1 • 84

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Paper • 2407.03320 • Published Jul 3 • 92

upvoted a paper 4 months ago

SILC: Improving Vision Language Pretraining with Self-Distillation

Paper • 2310.13355 • Published Oct 20, 2023 • 6

upvoted 2 collections 4 months ago

Vision Language Models Papers 🖼️💬📝

Collection

Papers about vision-language models, most important ones are on top of the list. • 27 items • Updated Apr 30 • 32

Vision-Language Model

Collection

16 items • Updated Jul 18 • 1

upvoted a paper 4 months ago

CogVLM: Visual Expert for Pretrained Language Models

Paper • 2311.03079 • Published Nov 6, 2023 • 23

upvoted a collection 4 months ago

Speculative

Collection

55 items • Updated Jul 24 • 4

upvoted a paper 4 months ago

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3 • 98

upvoted 32 papers 5 months ago

KAN: Kolmogorov-Arnold Networks

Paper • 2404.19756 • Published Apr 30 • 108

Better & Faster Large Language Models via Multi-token Prediction

Paper • 2404.19737 • Published Apr 30 • 73

Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding

Paper • 2404.16710 • Published Apr 25 • 57

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Paper • 2404.11912 • Published Apr 18 • 16

An Emulator for Fine-Tuning Large Language Models using Small Language Models

Paper • 2310.12962 • Published Oct 19, 2023 • 14

Accelerating LLM Inference with Staged Speculative Decoding

Paper • 2308.04623 • Published Aug 8, 2023 • 23

Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting

Paper • 2404.18911 • Published Apr 29 • 29

PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning

Paper • 2404.16994 • Published Apr 25 • 35

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

Paper • 2404.16821 • Published Apr 25 • 53

Tele-FLM Technical Report

Paper • 2404.16645 • Published Apr 25 • 17

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

Paper • 2404.08801 • Published Apr 12 • 62

Multi-Head Mixture-of-Experts

Paper • 2404.15045 • Published Apr 23 • 58

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

Paper • 2404.14619 • Published Apr 22 • 124

Pegasus-v1 Technical Report

Paper • 2404.14687 • Published Apr 23 • 30

OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data

Paper • 2404.12195 • Published Apr 18 • 11

Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models

Paper • 2404.12387 • Published Apr 18 • 38

How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study

Paper • 2404.14047 • Published Apr 22 • 43

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22 • 250

Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies

Paper • 2404.08197 • Published Apr 12 • 27

Best Practices and Lessons Learned on Synthetic Data for Language Models

Paper • 2404.07503 • Published Apr 11 • 29

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Paper • 2403.05525 • Published Mar 8 • 39

Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities

Paper • 2308.12966 • Published Aug 24, 2023 • 6

ShareGPT4V: Improving Large Multi-Modal Models with Better Captions

Paper • 2311.12793 • Published Nov 21, 2023 • 18

OmniFusion Technical Report

Paper • 2404.06212 • Published Apr 9 • 73

SambaLingo: Teaching Large Language Models New Languages

Paper • 2404.05829 • Published Apr 8 • 12

HyperCLOVA X Technical Report

Paper • 2404.01954 • Published Apr 2 • 19

ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline

Paper • 2404.02893 • Published Apr 3 • 19

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2 • 103

LVLM-Intrepret: An Interpretability Tool for Large Vision-Language Models

Paper • 2404.03118 • Published Apr 3 • 23

AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4 • 23

ReFT: Representation Finetuning for Language Models

Paper • 2404.03592 • Published Apr 4 • 86

Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

Paper • 2404.03715 • Published Apr 4 • 59

upvoted 11 papers 6 months ago

ViTAR: Vision Transformer with Any Resolution

Paper • 2403.18361 • Published Mar 27 • 51

sDPO: Don't Use Your Data All at Once

Paper • 2403.19270 • Published Mar 28 • 38

Gecko: Versatile Text Embeddings Distilled from Large Language Models

Paper • 2403.20327 • Published Mar 29 • 47

Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28 • 103

OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models

Paper • 2308.13137 • Published Aug 25, 2023 • 17

Qwen Technical Report

Paper • 2309.16609 • Published Sep 28, 2023 • 34

Long-form factuality in large language models

Paper • 2403.18802 • Published Mar 27 • 23

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Paper • 2401.02954 • Published Jan 5 • 40

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 82

Mora: Enabling Generalist Video Generation via A Multi-Agent Framework

Paper • 2403.13248 • Published Mar 20 • 76

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20 • 58

Bui Van Hop

AI & ML interests

Organizations

hllj's activity

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth