arxiv:2408.08441
Anikait Singh
Asap7772
AI & ML interests
Deep Learning, Reinforcement Learning, Robotics
models
None public yet
datasets
145
Asap7772/prm800k_backtracks_onpolicy_bofn_valuemc_turn_dependent_sep_reward_relabeledvalue_balanced_mc • 20k
Asap7772/prm800k_backtracks_onpolicy_bofn_valuemc_turn_dependent_sep_reward_relabeledvalue_unbalanced_mc • 20k
Asap7772/prm800k_backtracks_onpolicy_bofn_valuemc_turn_independent_sep_reward_relabeledvalue_balanced_mc • 20k • 1
Asap7772/ogmath5_backtracks_onpolicy_bofn_valuemc_turn_dependent_cummulative_reward • 268k • 40
Asap7772/ogmath5_backtracks_onpolicy_bofn_valuemc_turn_dependent_sep_reward • 268k • 24
Asap7772/ogmath5_backtracks_onpolicy_bofn_valuemc_turn_independent_sep_reward • 268k • 24
Asap7772/prm800k_backtracks_onpolicy_bofn_valuemc_turn_dependent_cummulative_reward • 226k • 112
Asap7772/prm800k_backtracks_onpolicy_bofn_valuemc_turn_dependent_sep_reward • 226k • 144
Asap7772/prm800k_backtracks_onpolicy_bofn_valuemc_turn_independent_sep_reward • 226k • 104
Asap7772/aime_dataset • 933 • 8