Topic: Genetic Selection… the next rabbit hole

Full disclosure: I got Claude to write this up, because I fancied documenting my failures about as much as you'd expect. The journey's mine though — promise.

I wanted to share where I've gotten to and ask the people who've gone further than me for some pointers.

Chapter 1 — The textbook pipeline. Built the full walk-forward: In-Sample to generate, Out-of-Sample to filter, a true held-out Forward as the final judge. Multi-asset, multi-timeframe, multiple brokers' data, Monte-Carlo, all of it. The plumbing's solid — I've stress-tested it.
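
For anyone who wants the three segments spelled out, here is a minimal sketch of a chronological IS / OOS / Forward split. It is illustrative only, not my actual pipeline: the function name `chronological_split` and the 60/20/20 default fractions are assumptions, and it shows a single split rather than a rolling walk-forward.

```python
def chronological_split(bars, is_frac=0.6, oos_frac=0.2):
    """Split time-ordered bars into (in_sample, out_of_sample, forward).

    `bars` must already be sorted oldest to newest. The fractions are
    hypothetical defaults; whatever is left after IS and OOS becomes Forward.
    """
    if is_frac <= 0 or oos_frac <= 0 or is_frac + oos_frac >= 1:
        raise ValueError("fractions must be positive and leave room for Forward")
    n = len(bars)
    is_end = round(n * is_frac)
    oos_end = is_end + round(n * oos_frac)
    # No shuffling: each segment is strictly later in time than the one before it.
    return bars[:is_end], bars[is_end:oos_end], bars[oos_end:]
```

The one rule that matters is that the Forward slice is never looked at while generating or filtering; it only gets used once, at the end.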

Chapter 2 — Rich pools don't translate. I can churn out big, healthy-looking pools: nice IS curves that survive OOS and pass the usual robustness gates. But none of that richness carries into Forward. Every metric I'd rank on (expectancy, R²/linearity, stability; I think I've tried them all) shows basically zero correlation with what the strategy actually does forward.
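
To be concrete about what "zero correlation" means here, this is a rough sketch of the kind of check involved: a Spearman rank correlation between one selection metric and the forward result, computed across every strategy in a pool. It assumes each strategy has been reduced to two numbers (metric value, forward outcome); the function names are hypothetical and this is a plain-Python stand-in, not the exact code I ran.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(metric_values, forward_values):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(metric_values), average_ranks(forward_values)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no ranking information
    return cov / sqrt(vx * vy)
```

A value near +1 would mean ranking the pool by that metric also ranks it by forward outcome; a value near 0 means the ranking tells you nothing, which is what I keep seeing.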

Chapter 3 — ML didn't save it. So I did the obvious next thing: build features from the IS/OOS walk-forward behaviour, label by forward outcome, train a model to pick survivors. Same wall. The IS/OOS-side features (IS-to-OOS degradation, etc.) just don't carry enough signal about the future. The model can re-rank a useless ranking, but it can't invent signal that isn't there.

Chapter 4 — A genetic-selection wrapper around gen.js. Latest move: I wrapped gen.js in my own GA — breeding strategies generation-to-generation with my own fitness functions, instead of leaning on the built-in search. Lots more control, and some encouraging signs… but I keep bumping into the same questions about what to actually select on, which is why I'm here.
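
For context, the general shape of that kind of wrapper is sketched below. This is a generic GA loop (elitism, tournament selection, uniform crossover, Gaussian mutation), not my actual code and not anything gen.js exposes: the genome is assumed to be a flat list of floats, and `toy_fitness` is a hypothetical stand-in for "run the strategy through the backtester and score it".

```python
import random

def toy_fitness(genome):
    """Hypothetical placeholder for a real backtest score (higher is better)."""
    return -sum(x * x for x in genome)

def evolve(fitness, genome_len, pop_size=30, generations=20, elite=2,
           tournament=3, mutation_rate=0.2, mutation_sigma=0.1, seed=0):
    """Breed a population generation-to-generation; return (best, best_per_gen)."""
    if pop_size < tournament or elite >= pop_size:
        raise ValueError("pop_size must be >= tournament and > elite")
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    best_per_gen = []
    for _ in range(generations):
        # In a real setup fitness is expensive, so scores would be cached.
        ranked = sorted(pop, key=fitness, reverse=True)
        best_per_gen.append(fitness(ranked[0]))
        next_pop = [g[:] for g in ranked[:elite]]  # elitism: keep the top few
        while len(next_pop) < pop_size:
            # Tournament selection: best of a small random sample, twice.
            a = max(rng.sample(ranked, tournament), key=fitness)
            b = max(rng.sample(ranked, tournament), key=fitness)
            # Uniform crossover: each gene comes from either parent.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            # Gaussian mutation on a random subset of genes.
            child = [g + rng.gauss(0, mutation_sigma)
                     if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness), best_per_gen
```

Calling `evolve(toy_fitness, genome_len=4)` returns the best genome plus the best score per generation. The loop itself is the easy part; everything interesting lives in what `fitness` measures, which is exactly what the questions below are about.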

The questions:

1. Is there any IS/OOS metric that actually correlates with forward performance, rather than just looking good in backtest?
2. Shorter / more recent in-sample windows — do they transfer better for you than long ones?
3. Is the real edge in constraining what you generate (skeletons, regime typing, indicator grammar) rather than filtering after?

Happy to share more on the methodology. Where am I being naive?

Cheers.
Ben