Longxu Dou

dreamerdeo

https://longxudou.github.io/

longxudou

AI & ML interests

Natural Language Processing

dreamerdeo's activity

upvoted an article 4 months ago

Article

RegMix: Data Mixture as Regression for Language Model Pre-training

Jul 11 • 10

upvoted a collection 4 months ago

📈 Scaling Laws with Vocabulary

Increase your vocabulary size when you scale up your language model • 5 items • Updated Aug 11 • 4

upvoted 2 papers 4 months ago

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Paper • 2407.13623 • Published Jul 18 • 52

Bootstrapping Language Models with DPO Implicit Rewards

Paper • 2406.09760 • Published Jun 14 • 38

upvoted 2 collections 4 months ago

💡 DICE

Self-alignment with DPO Implicit Rewards • 5 items • Updated Jul 28 • 8

🧬 RegMix: Data Mixture as Regression

Automatic data mixture method for large language model pre-training • 10 items • Updated Jul 26 • 6

upvoted a paper 4 months ago

RegMix: Data Mixture as Regression for Language Model Pre-training

Paper • 2407.01492 • Published Jul 1 • 34

upvoted 2 collections 6 months ago

Datasets for Pretrained Thai LLM

List Datasets for pretrained Thai LLM by PyThaiNLP • 23 items • Updated Sep 12 • 9

Text crawl dataset

Text dataset gathered using crawlers. • 23 items • Updated Aug 5 • 1

upvoted an article 7 months ago

Article

Large-scale Near-deduplication Behind BigCode

May 16, 2023 • 18

upvoted a collection 7 months ago

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Sep 25 • 681

upvoted a paper 7 months ago

Sailor: Open Language Models for South-East Asia

Paper • 2404.03608 • Published Apr 4 • 20

upvoted 2 collections 8 months ago

Awesome SFT datasets

A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12 • 117

⚓️ Sailor Language Models

Sailor: Open Language Models tailored for South-East Asia (SEA) released by Sea AI Lab. • 18 items • Updated Jul 26 • 16

upvoted a paper about 1 year ago

Small-scale proxies for large-scale Transformer training instabilities

Paper • 2309.14322 • Published Sep 25, 2023 • 19