PenutChen

penut85420

AI & ML interests

None yet

Recent Activity

upvoted a paper 23 days ago

Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs

commented a paper 23 days ago

Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs

New activity about 2 months ago

yentinglin/Llama-3-Taiwan-8B-Instruct: Was the tokenizer retrained?

Organizations

penut85420's activity

upvoted a paper 23 days ago

Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs

Paper • 2410.10739 • Published Oct 14 • 1

upvoted a paper 2 months ago

Instruction Following without Instruction Tuning

Paper • 2409.14254 • Published Sep 21 • 27

upvoted a paper 4 months ago

LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference

Paper • 2407.14057 • Published Jul 19 • 44

upvoted 2 papers 5 months ago

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

Paper • 2407.08296 • Published Jul 11 • 31

DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17 • 57

upvoted 7 papers 6 months ago

In-Context Editing: Learning Knowledge from Self-Induced Distributions

Paper • 2406.11194 • Published Jun 17 • 15

GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

Paper • 2406.11768 • Published Jun 17 • 20

Parameter-Efficient Fine-Tuning with Discrete Fourier Transform

Paper • 2405.03003 • Published May 5 • 7

Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition

Paper • 2405.14259 • Published May 23 • 1

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

Paper • 1909.11942 • Published Sep 26, 2019 • 2

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 15

Layer-Condensed KV Cache for Efficient Inference of Large Language Models

Paper • 2405.10637 • Published May 17 • 19

upvoted a paper 8 months ago

LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning

Paper • 2403.17919 • Published Mar 26 • 16

upvoted 3 papers 9 months ago

You Need to Pay Better Attention

Paper • 2403.01643 • Published Mar 3 • 1

Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU

Paper • 2403.06504 • Published Mar 11 • 53

Divide-or-Conquer? Which Part Should You Distill Your LLM?

Paper • 2402.15000 • Published Feb 22 • 22

upvoted a collection 9 months ago

Canonical models

This collection lists all the historical (pre-"Hub") canonical model checkpoints, i.e. repos that were not under an org or user namespace • 68 items • Updated Feb 13 • 13

upvoted a paper 9 months ago

Fast Vocabulary Transfer for Language Model Compression

Paper • 2402.09977 • Published Feb 15 • 2

upvoted 2 papers 10 months ago

Anchor-based Large Language Models

Paper • 2402.07616 • Published Feb 12 • 4

DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Paper • 2401.14196 • Published Jan 25 • 47