arxiv:2412.14093
Buck Shlegeris
bshlgrs
AI & ML interests
None yet
Recent Activity
Authored a paper 8 days ago: Alignment faking in large language models
Organizations
models: 3
datasets: None public yet