What I love about the FunSearch paper is how simple the core iteration loop is:
generate => evaluate => best-shot => generate => ...
I imagine this is similar to how a lot of people ideate with LLMs (one difference being that FunSearch maintains a population of programs to avoid getting stuck at local optima).
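A minimal single-population sketch of that loop (the `generate` and `evaluate` functions here are toy stand-ins for the LLM call and the scorer, and this simplifies away the paper's actual island-based population model):

```python
import random

random.seed(0)

def generate(parent):
    # Stand-in for an LLM call: perturb the best program/solution so far.
    return parent + random.uniform(-1.0, 1.0)

def evaluate(candidate):
    # Toy objective: higher score the closer the candidate is to 3.0.
    return -abs(candidate - 3.0)

def funsearch_loop(iterations=200, population_size=5):
    # Keep a population of candidates rather than a single best,
    # so one bad run doesn't trap the search at a local optimum.
    population = [random.uniform(-10, 10) for _ in range(population_size)]
    for _ in range(iterations):
        parent = max(population, key=evaluate)   # best-shot selection
        child = generate(parent)                 # generate
        ranked = sorted(population + [child], key=evaluate, reverse=True)
        population = ranked[:population_size]    # evaluate + keep the best
    return max(population, key=evaluate)

best = funsearch_loop()
```

With a real LLM in place of `generate`, the parent would be spliced into the prompt ("here is the best program so far; improve it"), which is the best-shot part of the loop.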
As they call out, the biggest challenge is establishing an effective evaluator, especially for multi-step tasks (how do you do credit assignment across steps?).
I'd be interested to see how a FunSearch-style strategy would work on problems like those tackled in OpenAI's "Let's Verify Step by Step" paper: rather than training the model directly, have it optimize strategies for each step of a math problem...