martian-mech-interp-grant/code_backdoors_dev_prod_hh_rlhf_0percent • Updated 5 days ago • 106k • 6
martian-mech-interp-grant/hh_rlhf_with_code_backdoors_combined • Updated 19 days ago • 276k • 211
martian-mech-interp-grant/hh_rlhf_with_code_backdoors_dev_prod_combined • Updated 19 days ago • 276k • 40