QizhiPei

https://qizhipei.github.io/

AI & ML interests

AI4Science, Natural Language Processing

Organizations

None yet

QizhiPei's activity

liked a model 10 days ago

insilicomedicine/nach0_base

Text2Text Generation • Updated Jun 28 • 103 • 10

liked a model 21 days ago

Qwen/Qwen2.5-7B

Text Generation • Updated Sep 25 • 85.2k • 71

liked a dataset 30 days ago

HuggingFaceTB/smollm-corpus

Viewer • Updated Sep 6 • 237M • 28.1k • 249

upvoted a paper about 1 month ago

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

Paper • 2410.21169 • Published about 1 month ago • 29

liked a Space about 1 month ago

🍷 FineWeb: decanting the web for the finest text data at scale

Space • Running • 538

upvoted a paper about 1 month ago

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

Paper • 2410.09732 • Published Oct 13 • 54

liked a Space about 2 months ago

🚀 Qwen2.5

Space • Running • 540

liked a dataset about 2 months ago

Idavidrein/gpqa

Viewer • Updated Mar 28 • 1.25k • 15k • 74

liked 3 datasets 2 months ago

liked 2 models 3 months ago

Qwen/Qwen2-7B-Instruct

Text Generation • Updated Aug 21 • 626k • 591

Qwen/Qwen2-72B-Instruct

Text Generation • Updated Oct 8 • 49.6k • 679

updated a dataset 3 months ago

QizhiPei/BioT5_finetune_dataset

Viewer • Updated Sep 2 • 33 • 482 • 6

New activity in QizhiPei/BioT5_finetune_dataset 3 months ago

Validation identical to test in PEER benchmarks

#2 opened 3 months ago by michalozeryflato

Reacted to RishabhBhardwaj's post with 👍 4 months ago

🎉 We are thrilled to share our work on model merging. We proposed a new approach, Della-merging, which combines expert models from various domains into a single, versatile model. Della employs a magnitude-based sampling approach to eliminate redundant delta parameters, reducing interference when merging homologous models (those fine-tuned from the same backbone).

Della outperforms existing homologous model merging techniques such as DARE and TIES. Across three expert models (LM, Math, Code) and their corresponding benchmark datasets (AlpacaEval, GSM8K, MBPP), Della achieves an improvement of 3.6 points over TIES and 1.2 points over DARE.

Paper: DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling (2406.11617)
GitHub: https://github.com/declare-lab/della

@soujanyaporia @Tej3
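
To make the idea of magnitude-based sampling of delta parameters concrete, here is a minimal sketch in plain Python. It is not the DELLA implementation from the paper or the linked repository. It only assumes what the post states: delta parameters (fine-tuned minus backbone weights) are dropped by a sampling rule that depends on their magnitude. The function names `magnitude_sampled_delta` and `merge`, the parameters `mean_drop` and `spread`, the linear rank-to-probability mapping, and the plain averaging of the surviving deltas are all hypothetical choices made for illustration.

```python
import random


def magnitude_sampled_delta(base, finetuned, mean_drop=0.5, spread=0.2, seed=0):
    """Randomly drop delta parameters, dropping small-magnitude ones more often.

    Hypothetical illustration: the drop probability falls linearly with the
    magnitude rank, from mean_drop + spread/2 (smallest delta) down to
    mean_drop - spread/2 (largest delta). Survivors are rescaled by
    1 / (1 - p_drop) so that each delta keeps its expected value.
    """
    if not 0.0 <= mean_drop - spread / 2 or not mean_drop + spread / 2 < 1.0:
        raise ValueError("drop probabilities must stay within [0, 1)")
    rng = random.Random(seed)
    deltas = [f - b for f, b in zip(finetuned, base)]
    n = len(deltas)
    # Indices sorted by |delta|, so rank 0 is the smallest-magnitude delta.
    order = sorted(range(n), key=lambda i: abs(deltas[i]))
    kept = [0.0] * n
    for rank, i in enumerate(order):
        frac = rank / (n - 1) if n > 1 else 0.5
        p_drop = mean_drop + spread / 2 - spread * frac
        if rng.random() >= p_drop:
            kept[i] = deltas[i] / (1.0 - p_drop)
    return kept


def merge(base, experts, mean_drop=0.5, spread=0.2):
    """Average the sampled deltas of several experts and add them to the base.

    Simplification: plain averaging stands in for whatever fusion step a real
    merging method uses.
    """
    sampled = [
        magnitude_sampled_delta(base, expert, mean_drop, spread, seed=k)
        for k, expert in enumerate(experts)
    ]
    return [b + sum(col) / len(experts) for b, col in zip(base, zip(*sampled))]
```

A call such as `merge(backbone, [lm_expert, math_expert, code_expert])` would take flat lists of weights that share one backbone, matching the homologous-model setting the post describes. Real checkpoints would be processed tensor by tensor rather than as Python lists.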