devngho PRO

AI & ML interests

Efficient Korean NLP

devngho's activity

upvoted a paper 19 days ago

Were RNNs All We Needed?

Paper • 2410.01201 • Published Oct 2 • 46

upvoted a collection about 2 months ago

Qwen2.5

Qwen2.5 language models: pretrained and instruction-tuned models in 7 sizes (0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B). • 45 items • Updated Sep 18 • 307

upvoted a paper about 2 months ago

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 82

upvoted an article about 2 months ago

Article

Mergoo: Efficiently Build Your Own MoE LLM

Jun 3 • 41

upvoted a paper 2 months ago

Instruction Pre-Training: Language Models are Supervised Multitask Learners

Paper • 2406.14491 • Published Jun 20 • 85

upvoted 3 papers 3 months ago

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

Paper • 2304.01373 • Published Apr 3, 2023 • 8

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published Jul 18 • 52

Trans-Tokenization and Cross-lingual Vocabulary Transfers: Language Adaptation of LLMs for Low-Resource NLP

Paper • 2408.04303 • Published Aug 8 • 9

upvoted 2 articles 7 months ago

Article

Expanding Model Context and Creating Chat Models with a Single Click

Apr 28 • 37

Article

Can We Train Chat Models with Raw Data?

Apr 25 • 17