Article Building DoRA Support for Embedding Layers in PEFT By ariG23498 • 27 days ago • 10
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper • 2408.10914 • Published about 1 month ago • 39
Refusal in Language Models Is Mediated by a Single Direction Paper • 2406.11717 • Published Jun 17 • 2
PlantCaduceus (512bp len) Collection https://plantcaduceus.github.io • 8 items • Updated 13 days ago • 2
Larimar: Large Language Models with Episodic Memory Control Paper • 2403.11901 • Published Mar 18 • 31
Article Recommendation to Revisit the Diffuser Default LoRA Parameters By alvdansen • Jun 21 • 11
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs Paper • 2406.10209 • Published Jun 14 • 8
A Fine-tuning Dataset and Benchmark for Large Language Models for Protein Understanding Paper • 2406.05540 • Published Jun 8 • 3
Article An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct By leonardlin • Jun 11 • 45
Article Training and Finetuning Embedding Models with Sentence Transformers v3 May 28 • 146
abliterated-v3 Collection Latest gen of the abliterated models I've produced • 17 items • Updated Jun 3 • 89
IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages Paper • 2404.16816 • Published Apr 25 • 3
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22 • 124
HF-curated models available on Workers AI Collection A collection of models curated with Hugging Face that can be run on Cloudflare's Workers AI serverless inference platform. • 15 items • Updated Apr 2 • 50
Gecko: Versatile Text Embeddings Distilled from Large Language Models Paper • 2403.20327 • Published Mar 29 • 47
Common Corpus Collection The largest public domain dataset for training LLMs. • 27 items • Updated Jul 17 • 111
A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains Paper • 2402.00559 • Published Feb 1 • 3
LLaMA Beyond English: An Empirical Study on Language Capability Transfer Paper • 2401.01055 • Published Jan 2 • 53
Orca 2: Teaching Small Language Models How to Reason Paper • 2311.11045 • Published Nov 18, 2023 • 70