Models
Datasets
Spaces
Posts
Docs
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2410.07869

about 16 hours ago

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18 • 143
Orion-14B: Open-source Multilingual Large Language Models

Paper • 2401.12246 • Published Jan 20 • 11
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24 • 50
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24 • 44

Benchmarking Agentic Workflow Generation

zjunlp/WorFBench_test

Viewer • Updated 22 days ago • 2.15k • 37 • 2
zjunlp/WorFBench_train

Viewer • Updated 22 days ago • 18.7k • 27 • 1
zjunlp/WorFbench_qwen

Updated 21 days ago • 1
zjunlp/WorFBench_internlm

Updated 21 days ago • 1

For Human and Artificial Intelligence in Organizations

SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning

Paper • 2409.05556 • Published Sep 9 • 1
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers

Paper • 2409.04109 • Published Sep 6 • 43
A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor?

Paper • 2409.15277 • Published Sep 23 • 34
Learning Task Decomposition to Assist Humans in Competitive Programming

Paper • 2406.04604 • Published Jun 7 • 4

Benchmarking Agentic Workflow Generation

Paper • 2410.07869 • Published 28 days ago • 25
GenAgent: Build Collaborative AI Systems with Automated Workflow Generation -- Case Studies on ComfyUI

Paper • 2409.01392 • Published Sep 2 • 9
HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows

Paper • 2409.17433 • Published Sep 25 • 8
FlowMind: Automatic Workflow Generation with LLMs

Paper • 2404.13050 • Published Mar 17 • 32

Papers-Benchmarks

CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery

Paper • 2406.08587 • Published Jun 12 • 15
Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning

Paper • 2406.09170 • Published Jun 13 • 24
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents

Paper • 2407.18901 • Published Jul 26 • 31
Benchmarking Agentic Workflow Generation

Paper • 2410.07869 • Published 28 days ago • 25

MotionLLM: Understanding Human Behaviors from Human Motions and Videos

Paper • 2405.20340 • Published May 30 • 19
Spectrally Pruned Gaussian Fields with Neural Compensation

Paper • 2405.00676 • Published May 1 • 8
Paint by Inpaint: Learning to Add Image Objects by Removing Them First

Paper • 2404.18212 • Published Apr 28 • 27
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

Paper • 2405.00732 • Published Apr 29 • 118

Large Language Model (LLM) and NLP related papers.

LoRA+: Efficient Low Rank Adaptation of Large Models

Paper • 2402.12354 • Published Feb 19 • 6
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20 • 16
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20 • 10
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10 • 64

A Zero-Shot Language Agent for Computer Control with Structured Reflection

Paper • 2310.08740 • Published Oct 12, 2023 • 14
AgentTuning: Enabling Generalized Agent Abilities for LLMs

Paper • 2310.12823 • Published Oct 19, 2023 • 35
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

Paper • 2308.10848 • Published Aug 21, 2023 • 1
CLEX: Continuous Length Extrapolation for Large Language Models

Paper • 2310.16450 • Published Oct 25, 2023 • 9

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs