Discussion about this post

Chase

Great post! Really insightful breakdown.

"Maybe it’s just that the bombastic name of the benchmark rubbed me the wrong way."

I think a lot of people are with you on this. My own tiny gripe: if you're going to name something Humanity's Last Exam, don't have multiple-choice questions. It turns whatever the problem was into a search problem, and potentially a very trivial one.
