HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems Paper • 2411.02959 • Published 5 days ago • 52
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1 • 143
jina-embeddings-v3: Multilingual Embeddings With Task LoRA Paper • 2409.10173 • Published Sep 16 • 23
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation Paper • 2409.09214 • Published Sep 13 • 46
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12 • 115
ViDoRe Captioning (baseline) Collection The original ViDoRe benchmark was passed to Unstructured to partition each page into chunks. Visual chunks are captioned using Claude Sonnet. • 13 items • Updated Jun 18 • 2
ViDoRe Chunk OCR (baseline) Collection The original ViDoRe benchmark was passed to Unstructured to partition each page into chunks. Visual chunks are processed with Tesseract OCR. • 11 items • Updated Jul 17 • 2
ColPali Paper Resources Collection Main resources for the paper: "ColPali: Efficient Document Retrieval with Vision Language Models" • 3 items • Updated Jul 2 • 6
ViDoRe Benchmark Collection Benchmark for document retrieval using visual features, introduced in the ColPali paper. Datasets use the QA format. • 10 items • Updated about 3 hours ago • 9
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding Paper • 2312.04461 • Published Dec 7, 2023 • 57
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output Paper • 2407.03320 • Published Jul 3 • 92
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published Apr 30 • 73
Training and Finetuning Embedding Models with Sentence Transformers v3 Article • May 28 • 155