minlik's Collections
- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models (arXiv:2309.12307)
- LMDX: Language Model-based Document Information Extraction and Localization (arXiv:2309.10952)
- Table-GPT: Table-tuned GPT for Diverse Table Tasks (arXiv:2310.09263)
- BitNet: Scaling 1-bit Transformers for Large Language Models (arXiv:2310.11453)
- TEQ: Trainable Equivalent Transformation for Quantization of LLMs (arXiv:2310.10944)
- TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT (arXiv:2307.08674)
- UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition (arXiv:2308.03279)
- MultiLoRA: Democratizing LoRA for Better Multi-Task Learning (arXiv:2311.11501)
- YaRN: Efficient Context Window Extension of Large Language Models (arXiv:2309.00071)
- DocLLM: A layout-aware generative language model for multimodal document understanding (arXiv:2401.00908)
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning (arXiv:2401.01325)
- Improving Text Embeddings with Large Language Models (arXiv:2401.00368)
- OLMo: Accelerating the Science of Language Models (arXiv:2402.00838)
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens (arXiv:2402.13753)
- DoRA: Weight-Decomposed Low-Rank Adaptation (arXiv:2402.09353)
- LoRA+: Efficient Low Rank Adaptation of Large Models (arXiv:2402.12354)
- Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding (arXiv:2401.04398)
- A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications (arXiv:2402.07927)
- Simple and Scalable Strategies to Continually Pre-train Large Language Models (arXiv:2403.08763)
- LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders (arXiv:2404.05961)
- Effective Long-Context Scaling of Foundation Models (arXiv:2309.16039)
- LoRA Learns Less and Forgets Less (arXiv:2405.09673)
- Data Engineering for Scaling Language Models to 128K Context (arXiv:2402.10171)
- Arcee's MergeKit: A Toolkit for Merging Large Language Models (arXiv:2403.13257)
- ZeRO: Memory Optimizations Toward Training Trillion Parameter Models (arXiv:1910.02054)
- MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning (arXiv:2405.12130)
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (arXiv:2405.04434)
- Generative Representational Instruction Tuning (arXiv:2402.09906)
- LoRA-GA: Low-Rank Adaptation with Gradient Approximation (arXiv:2407.05000)
- Trained Transformers Learn Linear Models In-Context (arXiv:2306.09927)
- Attention Heads of Large Language Models: A Survey (arXiv:2409.03752)