weishen

fakerbaby

AI & ML interests

NLP, alignment, LLM

Recent Activity

upvoted a collection 1 day ago
Medical QA Datasets
liked a dataset 26 days ago
yingyingzhang/metamath-qwen2-math
liked a dataset 26 days ago
nvidia/OpenMathInstruct-2

Organizations

Fudan NLP

fakerbaby's activity

Reacted to onekq's post with 👍 2 months ago
Here is my latest study on OpenAI🍓o1🍓.
A Case Study of Web App Coding with OpenAI Reasoning Models (2409.13773)

I wrote an easy-to-read blog post to explain the findings.
https://huggingface.co/blog/onekq/daily-software-engineering-work-reasoning-models

INSTRUCTION FOLLOWING is the key.

100% instruction following + Reasoning = new SOTA

But if the model misses or misunderstands one instruction, it can perform far worse than non-reasoning models.
liked a Space 2 months ago