MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models Paper • 2410.17578 • Published Oct 23 • 1
Multilingual RewardBench Collection Multilingual Reward Model Evaluation Dataset and Results • 2 items • Updated Oct 26 • 4
M-RewardBench: Evaluating Reward Models in Multilingual Settings Paper • 2410.15522 • Published Oct 20 • 10
Retrieval-Augmented Generation Collection Artifacts for "Open-RAG: Enhanced Retrieval Augmented Reasoning with Open-Source Large Language Models" [EMNLP 2024 Findings] • 4 items • Updated Oct 30