Discussion about this post

User's avatar
Steve Newman's avatar

I spent a fair amount of time on P5 and was so convinced that the answer was somewhere in the vicinity of 12 (a diagonal line being essentially worst case, and requiring a binary search to find the spot where the pattern deviates from a perfect diagonal). Embarrassed at how far off I was. A beautiful problem.

Thanks as always for the thorough analysis! There's so much value to be obtained from really digging deep into how the models approach individual problems.

Expand full comment
Kevin's avatar

I think we will we need both LLMs and FPMs working together. The LLMs have a lot of "holes" that the FPMs can fill. But there are a lot of tasks that the FPM can't really even start on.

I think we need a step like "looking for interesting intermediate results". In practice, you don't solve these problems by guessing the answer and then cranking away. You start asking things like, what if a = b, what can we prove. Or, oh it looks like g has to divide both a - 1 and b - 1, that seems useful. And then while you look for interesting intermediate results, you notice things that pop up.

For example, on the snail problem, there's an obvious way to look for intermediate results. Think of some strategies and see how well they do. One strategy is, just try column 1, then just try column 2, etc.

(spoiler alert)

After a little while you think and see, well if I just try column n, and find the monster, I can try "dodging around" just that monster. Well, columns n-1 and n+1 could have the monster one step away diagonally. So that's the worst case, and actually, the worse case is finding the monster right in the middle of column n. Hmm, if only you could find the monster *not* in the middle of the column. And then this line of reasoning suggests the actual optimal strategy.

I think the FPMs can just never really do this evaluation of, is this an interesting intermediate result or not. That is a very "LLM" task.

Plus, in actual non-contest mathematics, looking for interesting intermediate results is useful all the time. You aren't going to be able to tell it "please prove P != NP now" but you could tell it, okay here are all of these problems and the best known algorithms for them, what other interesting stuff is there. New problems, new algorithms, etc.

Anyway this is a very, very interesting post and thank you for making it.

Expand full comment
3 more comments...

No posts