arxiv:2311.11045
Xuxi Chen
Xuxi
AI & ML interests
None yet
Recent Activity
upvoted a paper 26 days ago
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
upvoted a paper about 1 month ago
APOLLO: SGD-like Memory, AdamW-level Performance
upvoted a paper 6 months ago
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
Organizations
None yet