Topic: What if we learn from the losers to build better winners?

Most of us focus on the EAs that perform well — trying to replicate their success.
But what if the real learning signal comes from the ones that consistently fail?

In our current EURUSD incubator (same broker, running for over a year), we took the opposite route:

I isolated the 10 worst-performing strategies and ran an AI-driven structural analysis directly on their .mq4 source code, all originally generated with EA Studio.

First findings:
    •    Every losing EA used AND-only logic — all entry conditions had to be true at once.
    •    Entry rules were roughly double the exit rules.
    •    Many combined indicators from the same family (e.g. MA + MACD + ADX), adding delay but no new information.
    •    Few had explicit SL/TP or volatility filters (ATR, Donchian, StdDev).
    •    The outcome was predictable: late entries, slow exits, long stagnation.

This raised a bigger question:
EA Studio’s generator randomly builds “confirmation systems” (all conditions in AND).
Could part of the losing cluster come from over-confirmation bias rather than poor market fit?

I’m now expanding the work into a full AI-assisted Winners vs Losers comparison —
same pair, same broker — to isolate which structural patterns actually drive long-term robustness.

If anyone here wants to experiment with the same GPT workflow we’re using for the analysis (it’s fully replicable and compatible with EA Studio outputs), I’m happy to share it.

Would love to hear the community’s view:
    •    Have you ever analyzed your losing collections to see what logic keeps failing?
    •    What filters or generation rules have helped you avoid the typical “AND-only trap”?

The fastest way forward for a consistent success rate might not come from the top of the leaderboard — but from understanding what’s at the bottom.

Vincenzo

Re: What if we learn from the losers to build better winners?

I would like to hear what an "AI-driven structural analysis" is.

Secondly, why only gpt? Are other ai-bots somewhat inferior to gpt? How clear-cut and objective is the result derived from work done by ai? Speaking with a larger scale in mind, I have mixed results with ai, something to do with the quality and source of the data the ai is trained on. This brings me to another set of questions, like which ai bot has the best training regarding statistical sciences and trading.

Thirdly, I lend much importance to weeding out correlating indicators as a first step; it makes generation more efficient and reduces over-confirmation, to use your definition.

Fourthly, it's a bit off-topic, but you and your friend Jürgen used the same set of terms: volatility regime, inflation waves, wars, geopolitical crises, low-volatility periods, rate-hike cycles, the rise of online trading and a few others; let's call it market structure for short. My question: is it just a list of undefined terms, or is market structure actually quantifiable in your work?

Re: What if we learn from the losers to build better winners?

Hi Footon,

Good questions, thx. Here’s a quick clarification:

“AI-driven structural analysis”
I use GPT to read and classify the .mq4 code of EAs, not to optimize or predict, but to detect structural patterns such as AND-only vs mixed logic, ratio of entry to exit rules, indicator family overlap (for example MA + MACD), and presence or absence of SL/TP and volatility filters.
It just automates what would otherwise be hundreds of manual code checks.
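To make it concrete, here is a minimal sketch of the same kind of tagging done with plain regexes instead of GPT. It is heuristic only: the field names are my own, and it relies on EA Studio’s fairly regular .mq4 exports (hand-written EAs would need smarter parsing).

```python
import re
from pathlib import Path

# Heuristic re-implementation of the tagging step, without GPT.
def tag_structure(mq4_path):
    src = Path(mq4_path).read_text(errors="ignore")
    return {
        # At least one AND-ed pair and no '||' anywhere ->
        # every entry condition must hold at once.
        "and_only": "&&" in src and "||" not in src,
        # Explicit protective exits present?
        "has_sl_tp": bool(re.search(r"\b(StopLoss|TakeProfit)\b", src)),
        # Volatility filters, detected via built-in indicator calls.
        "has_vol_filter": bool(re.search(r"\bi(ATR|StdDev|Bands)\s*\(", src)),
        # Same-family overlap, e.g. several moving-average calls.
        "ma_calls": len(re.findall(r"\biMA\s*\(", src)),
    }
```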

Why GPT?
Because I already use it a lot in my work and know how to guide it well. It’s not about which model is “best,” only about using one that reads and tags EA logic in a consistent way.

Objectivity and data quality
You’re right — AI depends on the prompts and on how results are interpreted. Here GPT only tags logic; validation still comes from demo or live results. So AI doesn’t judge robustness, it just helps map what logic repeats among winners and losers.

Correlation and over-confirmation
Completely agree, removing correlated indicators early avoids redundant signals. Most losing EAs I checked stacked indicators from the same family, adding delay but no new information.

“Market structure”
In my backtests I tried to include all the major events that really changed market behaviour, like COVID or the start of violent geopolitical events, rather than just a fixed number of years. These are all events I have seen live and that had a real impact. It is arguable, but this is how I see it.

And about “me and my friend Jürgen”: to be crystal clear, we’re not friends and have never collaborated.

By the way, how do you handle indicator correlation in your own process?

Vincenzo


footon wrote:

I would like to hear what an "AI-driven structural analysis" is.

Secondly, why only gpt? Are other ai-bots somewhat inferior to gpt? How clear-cut and objective is the result derived from work done by ai? Speaking with a larger scale in mind, I have mixed results with ai, something to do with the quality and source of the data the ai is trained on. This brings me to another set of questions, like which ai bot has the best training regarding statistical sciences and trading.

Thirdly, I lend much importance to weeding out correlating indicators as a first step; it makes generation more efficient and reduces over-confirmation, to use your definition.

Fourthly, it's a bit off-topic, but you and your friend Jürgen used the same set of terms: volatility regime, inflation waves, wars, geopolitical crises, low-volatility periods, rate-hike cycles, the rise of online trading and a few others; let's call it market structure for short. My question: is it just a list of undefined terms, or is market structure actually quantifiable in your work?

Re: What if we learn from the losers to build better winners?

Hi Footon,

A few additional points about the AI-driven part.

You’re right, results depend a lot on prompt design and on how the model is used.
In my case, the setup has been prompt-engineered and refined over the past 16–18 months, using a structured “About Me / About You” context and a MECE analytical framework, so the reasoning stays complete and non-overlapping.

That means GPT isn’t guessing; it follows the same disciplined logic I apply to strategy validation.
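For a flavour of what “fixed template” means, here is a toy skeleton. The wording and field names are illustrative, not my production prompt; the point is only that every EA is analyzed with the same frozen template, so runs stay comparable.

```python
from pathlib import Path

# Illustrative fixed prompt skeleton; field names are placeholders.
PROMPT_TEMPLATE = """\
ABOUT ME: I run an EURUSD demo incubator; I need structural tags, not trading advice.
ABOUT YOU: you are a static-code classifier; you never optimize, predict, or opine.

TASK: read the MQL4 source below and return ONLY this JSON:
{{"logic": "AND_ONLY or MIXED",
  "entry_rule_count": 0,
  "exit_rule_count": 0,
  "indicator_families": ["Trend", "Momentum", "VolRange"],
  "has_explicit_sl_tp": true}}

SOURCE:
{mq4_source}
"""

prompt = PROMPT_TEMPLATE.format(mq4_source=Path("MyEA.mq4").read_text())
```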

I’ve also started building AI agents to handle repetitive work like performance analysis and market reporting.

Happy to exchange more if this topic interests you.

Re: What if we learn from the losers to build better winners?

So you are not actively directing the generation process? You stated in the first post that few EAs have SL/TP and/or volatility filters; are you not forcing the generator to use those if you deem them vital? And the ai part comes in after the generating process has finished.
Hmm, I want to have a much more directed approach. Giving the generator a very wide degree of freedom with the current random generation algorithm demands quite a large collection set to find meaningful strategies for further development (time and resource expense is high! But that's my personal opinion and preference). That's why I asked about market structure quantification; all those bits help to have a better-directed generation process (instead of "let's hit it with everything and see what sticks"), thus a higher yield of results and more efficient work (but as with all more elaborate theories, it is easier said than done smile ). My approach of using a defined set of indicators with little correlation between them serves this purpose as well. The most academic example: Williams' Percent Range vs Stochastic; it's useless to use the former if the latter is in the list.
But I'm surprised to see a continuous benefit from ai, quite often it gets lost by forgetting and neglecting its previous statement and occasionally gets it right with code or code troubleshooting. I must be using it the wrong way, it seems.

Re: What if we learn from the losers to build better winners?

Yesterday I woke up with this thought: instead of chasing the winners, why not focus on the losers?

So I started an in-depth analysis to understand what structurally makes an EA fail under live-like conditions.
I’m strongly convinced that we can learn much more from failures, but it requires analytical capability that’s impossible to achieve manually. Once finished, I’ll publish the results here.

All data came from one of my oldest (>500 days) demo incubators (EURUSD), where I’ve been collecting EAs for a long time.
That means it includes both:

very early, broadly generated EAs (any indicator family allowed, sometimes no SL/TP), and

newer ones generated under tighter rules (SL/TP always, PF + Win% + SQN as acceptance filters), but still using almost all available indicators.

Theoretically, if certain indicators don’t work well together in an AND condition, the backtest should fail too, but obviously that’s not the case.

Let me understand: you have sets of indicators that force the generator to stick to that frame, but how do you determine them?

A. Do you predefine the type of trading strategy you want to generate per cross (e.g. all possible indicators that can successfully capture breakouts on Gold)? I used that approach for a while, linked to market studies per asset, but it ended up consuming too much time and resources.

B. Or, as you mentioned, do you mainly exclude meaningless combinations (like your academic example)?

In this experiment, AI comes in after the incubation phase: it reads the .mq4 code and classifies each EA’s logic structure (e.g. AND-only logic, entry/exit ratio, indicator family overlap). Then I upload the performance statistics from FX Blue (CSV export) and match the two datasets. That way, I can connect each EA’s structural “DNA” with its real performance, for example by identifying logic types that repeatedly lead to long stagnation or other poor outcomes.
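A minimal pandas sketch of that matching step (the FX Blue column names below are placeholders; adjust them to the real export headers, and tag_structure() is the regex sketch from earlier in the thread):

```python
import re
import pandas as pd
from pathlib import Path

def magic_from_filename(p):
    # Hypothetical convention: the EA's magic number is the digit run in its file name.
    return int(re.search(r"\d+", p.stem).group())

# One row of structural tags per EA, keyed by magic number.
rows = [{"magic": magic_from_filename(p), **tag_structure(p)}
        for p in Path("incubator").glob("*.mq4")]
tags = pd.DataFrame(rows)

# FX Blue CSV export, one row per magic number (placeholder column names).
stats = pd.read_csv("fxblue_export.csv")
merged = tags.merge(stats, left_on="magic", right_on="Magic", how="inner")

# Each row now joins an EA's structural "DNA" with its demo performance,
# e.g. flagging logic types that sit in long stagnation:
stagnant = merged[merged["MaxStagnationDays"] > 90]
```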

However, I’m now planning to implement this characterization step after generation, before incubation. I think the entire workflow will benefit from that.

I’m curious: have you ever tried engineered prompts for consistency?
Over time, I’ve developed fixed templates to make the AI’s analysis fully repeatable (no forgetting, drift, or hallucination).
It’s not a generic chat; it’s a structured extraction process. I use GPT-5o in auto-thinking mode for it.

If you’re interested, we could set up a quick call; I can show you how I use it and share a few details that would take too long to explain here.

It’s not rocket science, and it will be a pleasure for me :-)

Re: What if we learn from the losers to build better winners?

A call? I do not do calls, I'm old school smile

I think you can't look at losing EAs separately from data. All the answers lie in data: why something works and why something doesn't. It's not about logic or which indis go along with each other alone. It's about how they work with data. Everything is based on it; everything starts from the most simple questions, like how much of it we need (the size), how to partition it, the lengths, the cycles, the structure. You see where I'm going?

How do I determine indis:
A. Correct!
B. Also correct!
C. This is work in progress, but I'm using data to tell me what indis I need and how they should be used. As always I'm time-strapped and this development is taking its time, but I'm going step by step. I got a coding error more than 2 months ago that proved impossible to find; that stupid ai couldn't help me either, it led me down a wrong path fixing the thing, so
I'm quite pessimistic towards it. I would unplug it for some days for it to behave better next time big_smile

Re: What if we learn from the losers to build better winners?

Hi Footon,

In my study I am not excluding performance data; I am just adding a layer of information on top of it. On one side sit the positive and negative performances, and on the other the logic behind the winning and losing EAs.

I have completed the research, reaching 59 EAs fully characterized, all on the same market.

I am completing the report and will publish the results soon; let’s pick up our exchange from there.

Vincenzo

footon wrote:

A call? I do not do calls, I'm old school smile

I think you can't look at losing EAs separately from data. All the answers lie in data: why something works and why something doesn't. It's not about logic or which indis go along with each other alone. It's about how they work with data. Everything is based on it; everything starts from the most simple questions, like how much of it we need (the size), how to partition it, the lengths, the cycles, the structure. You see where I'm going?

How do I determine indis:
A. Correct!
B. Also correct!
C. This is work in progress, but I'm using data to tell me what indis I need and how they should be used. As always I'm time-strapped and this development is taking its time, but I'm going step by step. I got a coding error more than 2 months ago that proved impossible to find; that stupid ai couldn't help me either, it led me down a wrong path fixing the thing, so
I'm quite pessimistic towards it. I would unplug it for some days for it to behave better next time big_smile

Re: What if we learn from the losers to build better winners?

What if we learn from the losers? – Part 2

I used to focus exclusively on analyzing and improving the profitable EAs.
But after running hundreds of strategies across multiple incubators, assets, and brokers, I realized that some of the most valuable insights might actually be hidden within the EAs that failed.
Many of those losing systems looked profitable and statistically flawless — yet they collapsed the moment they hit a demo or live account. Why?

Objective

In the first part, we compared EURUSD strategies that failed live despite strong backtests.
In this second stage, we expanded the dataset to 59 total EAs from the MT4 demo incubator:

    •    28 winners — positive net profit in backtest and out-of-sample, and positive net profit on the demo account.
    •    31 losers — positive net profit during the backtest, but negative net profit on the demo account.

All strategies were generated in EA Studio under identical conditions — same symbol, spread, and broker data — and selected after a minimum of 20 trades executed on the demo account (the lowest count being 23 trades on EURUSD M15, most between 50 and 140 on M15–H1).
This ensured sufficient statistical relevance and allowed us to isolate structural differences in logic and design, rather than attribute outcomes to randomness or short-term market noise.
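In code, the selection rule looks roughly like this (a sketch over the merged frame from the earlier post; column names are still placeholders):

```python
# Minimum 20 demo trades and a positive backtest for every candidate; the sign
# of the demo net profit then splits winners from losers.
eligible = merged[(merged["DemoTrades"] >= 20)
                  & (merged["BacktestNetProfit"] > 0)].copy()
eligible["label"] = eligible["DemoNetProfit"].gt(0).map(
    {True: "winner", False: "loser"})
print(eligible["label"].value_counts())   # in this study: 28 winners, 31 losers
```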

Methodology

For each EA, we extracted structural information directly from the .mq4 source files — including entry logic, indicator families, exit conditions, and parameter count — and linked it with the corresponding performance metrics from the demo results.
This combination allowed a rule-level comparison between winners and losers to detect recurring structural asymmetries and logic patterns that consistently influenced robustness.

Findings

1. Indicator Families and Logic Composition
Most losers relied heavily on a single indicator family — often trend or oscillator types — applied multiple times with slightly different parameters.
This redundancy created overlapping signals that appeared robust in backtests but failed in live volatility.
In contrast, winners tended to mix complementary indicators (e.g., Trend + Momentum or Trend + Volatility), producing signals that were more adaptable across changing market regimes.

Example: The overlap between Williams’ %R and Stochastic confirmed redundancy — using both added no value since they respond to nearly identical price conditions.
(Thanks to Footon for the academic example that helped refine this part of the analysis.)
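The redundancy is exact, not just empirical: over the same lookback, Williams’ %R equals Stochastic %K minus 100. A quick numeric check on toy data (nothing from the study itself):

```python
import numpy as np
import pandas as pd

# Toy close series; a fixed band around it stands in for high/low.
rng = np.random.default_rng(42)
close = pd.Series(1.10 + rng.normal(0, 0.001, 500).cumsum())
high, low = close + 0.0005, close - 0.0005

n = 14
hh = high.rolling(n).max()   # highest high over the lookback
ll = low.rolling(n).min()    # lowest low over the lookback
stoch_k    = 100 * (close - ll) / (hh - ll)    # Stochastic %K
williams_r = -100 * (hh - close) / (hh - ll)   # Williams' %R

print((williams_r - (stoch_k - 100)).abs().max())  # ~0: %R is just %K - 100
print(stoch_k.corr(williams_r))                    # 1.0: perfectly correlated
```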

2. Entry Logic Structure (Tight vs Loose Conditions)
A clear difference appeared in how the entry rules were structured.
Most losing EAs used very tight logic chains — three or more conditions that all had to be true simultaneously before opening a trade.
This made them precise in backtests but too rigid in live markets, where perfect alignment rarely occurs.

Example — Loser (Magic # 1158703800, EURUSD H4):
This EA required a combination of
    •    MACD main line above signal line AND
    •    ADX > 25 AND
    •    bar close above the 20-period MA.
Such a setup fired only a few times per month and often after the trend had already started.
It looked “clean” in backtests but missed most live moves.

By contrast, winning EAs showed looser logic — usually two complementary conditions, e.g. a trend filter plus a momentum trigger — allowing earlier and more frequent entries without excessive noise.

Example — Winner (Magic # 1326733007, EURUSD H1):
This EA entered long when
    •    the 10-period MA was above the 50-period MA AND
    •    RSI crossed above 50.
No extra filters were stacked.
The result: more timely reactions to short-term shifts and sustained profitability on demo.

In short:
Over-constrained logic fits history perfectly but suffocates live execution;
Simpler entry chains adapt faster and survive uncertainty better.
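A back-of-envelope way to see the frequency effect: if each condition is true on a fraction p of bars, and the conditions were independent (a strong simplification, since real indicator signals correlate), k AND-ed conditions align on roughly p**k of bars.

```python
# Purely illustrative numbers: each added AND condition cuts firing frequency
# multiplicatively under the independence assumption.
p = 0.30
for k in (2, 3, 4):
    print(f"{k} conditions: ~{p**k:.1%} of bars")   # 9.0%, 2.7%, 0.8%
```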

3. Stop/Target and Risk Filters
Losers were dominated by EAs with no explicit Stop Loss or Take Profit, relying entirely on indicator-based exits that often lagged during reversals or volatility spikes.
Winners almost always included explicit SL/TP settings, and in several cases these were dynamically linked to volatility measures such as ATR rather than fixed constants.
They also integrated at least one risk-control filter (time, spread, or order limits) to avoid unstable execution periods.

Key insight: Strategies without explicit, volatility-aware exits may appear strong in backtests but collapse under real market conditions.
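For illustration, a minimal sketch of what an ATR-linked SL/TP can look like (toy data; the 2x/3x multipliers are arbitrary placeholders, not values taken from the study):

```python
import numpy as np
import pandas as pd

# Toy OHLC series just to make the sketch runnable.
rng = np.random.default_rng(7)
close = pd.Series(1.10 + rng.normal(0, 0.001, 300).cumsum())
high, low = close + 0.0006, close - 0.0006

def atr(high, low, close, n=14):
    """Average True Range with Wilder-style smoothing (EMA, alpha = 1/n)."""
    prev = close.shift(1)
    tr = pd.concat([high - low,
                    (high - prev).abs(),
                    (low - prev).abs()], axis=1).max(axis=1)
    return tr.ewm(alpha=1.0 / n, adjust=False).mean()

# Volatility-aware exits for a long entry: wider stops in volatile regimes,
# tighter ones in quiet regimes, instead of fixed pip constants.
a = atr(high, low, close).iloc[-1]
entry = close.iloc[-1]
stop_loss, take_profit = entry - 2.0 * a, entry + 3.0 * a
```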

4. Strategy Complexity and Robustness
Winners averaged fewer than 15 input variables, while most losers exceeded 20.
The excess parameters in losers were often minor cosmetic settings or redundant signal filters — both signs of parametric overfitting.
Simpler EAs tended to sustain profitability longer and showed higher out-of-sample stability.
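The count itself is cheap to automate. A rough heuristic (mine, not EA Studio’s) is to count the extern/input declarations in the source; note it tallies declarations, not true degrees of freedom:

```python
import re
from pathlib import Path

# MQL4 exposes tunables as file-scope 'extern' (or, in builds 600+, 'input')
# declarations; counting them is a rough proxy for parametric complexity.
PARAM_RE = re.compile(r"^\s*(?:extern|input)\s+\w+\s+\w+", re.MULTILINE)

def count_inputs(mq4_path):
    src = Path(mq4_path).read_text(errors="ignore")
    return len(PARAM_RE.findall(src))

# Hypothetical acceptance gate following the finding above:
# accept = count_inputs("MyEA.mq4") < 15
```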

Conclusions & Next Steps

The results point toward a consistent pattern:
many failures are not due to randomness or bad luck but to structural fragility — how entry logic, indicator selection, and exit conditions interact under real market dynamics.
Even when the statistical metrics looked perfect, some EAs were simply not designed to survive uncertainty.

This is not a conclusion, but rather a new perspective in my own research process.
The boundary between overfitting and robustness remains thin, and part of the solution may lie in diversifying logic archetypes rather than just improving metrics.
The next step will be to test this hypothesis across other assets and incubators, validating whether the same patterns hold when market behavior, volatility, and liquidity conditions change.

Future filtering rules under review include:
    •    Restricting redundant indicators from the same family (e.g., Williams’ %R + Stochastic).
    •    Limiting excessive rule stacking in entry conditions (3+ simultaneous signals).
    •    Requiring explicit, volatility-based SL/TP in all generator configurations.
    •    Limiting total inputs per EA to avoid parametric noise.

I’d be genuinely interested to hear from anyone who has tried to analyze the losing strategies as a way to improve the overall probability of success.

Sometimes the most useful lessons come not from what wins, but from understanding why things fail.

Vincenzo

Re: What if we learn from the losers to build better winners?

Interesting work!

What about numerical parameters, like periods, for example? Is there a notable difference in period ranges between winners and losers, and is there similarity within the winners and within the losers? Do you consider it vital to analyze?

Re: What if we learn from the losers to build better winners?

Good point, @footon — I hadn’t thought about that dimension at first, but I just checked the code since I had all the data.

Interestingly, most losers are clustered around the default 14-period (RSI, ATR, etc.), while winners show a much broader mix — short (3–10) and long (20–50) horizons combined.

So yes, numeric homogeneity seems to reinforce the same redundancy seen in the indicator families. I’ll include period dispersion in the next analysis to test how much it really affects robustness.
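As a first cut, I plan to quantify dispersion as the coefficient of variation over each EA's extracted periods (a toy metric of my own, not an EA Studio feature):

```python
import statistics

def period_dispersion(periods):
    """Coefficient of variation of the period list; 0 = fully homogeneous."""
    if len(periods) < 2:
        return 0.0
    return statistics.stdev(periods) / statistics.mean(periods)

print(period_dispersion([14, 14, 14]))           # 0.0  -> homogeneous (loser-like)
print(round(period_dispersion([5, 14, 50]), 2))  # 1.04 -> mixed horizons (winner-like)
```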

Thx for the hint ;-)

Re: What if we learn from the losers to build better winners?

So what? How do we implement the findings? (part 3)

After the Winners vs Losers analysis, the next step is implementation — actually embedding the structural lessons into how we generate and qualify EAs.

Here’s the plan we’re putting in place:

    1.    Validate the indicator families and choose candidate crosses.
We’re aligning all future generations to the “Must-have Trend” policy — every EA must include one Trend indicator plus one rotating dimension from Momentum or Volatility/Range families.
    •    Trend: MA, MACD, ADX, Donchian
    •    Momentum: RSI, Stochastic, CCI, Momentum
    •    Volatility/Range: ATR, Bollinger Bands, StdDev
    •    Price/PA: covered implicitly via Donchian (breakouts) and MA cross (when used)
Generation will stay open within these groups; filtering and labeling by family will happen ex-post through MQL4 characterization.

    2.    Improve traceability.
Each EA will be automatically labeled and renamed according to its family composition (e.g., Trend+Momentum, Trend+VolRange), so we can verify that the generated code follows the intended policy and structure; see the sketch after this list.

    3.    Run parallel generation sessions.
We’ll start by testing three markets:
    •    EURUSD H1 (baseline)
    •    GBPJPY H4 (trending / high-vol)
    •    AUDCAD M30 (range-bound)
Each batch will be evaluated for family diversity, SL/TP configuration, and period dispersion before incubation.
Others will follow.
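As promised above, a sketch of how the ex-post family labeling and renaming could work. This is a heuristic of mine, not EA Studio output: indicators are detected via the MQL4 built-in function names in the source, and Donchian has no built-in call (it usually appears as custom code), so it is not matched here.

```python
import re
from pathlib import Path

# Family map mirroring the policy in point 1.
FAMILIES = {
    "Trend":    ["iMA", "iMACD", "iADX"],
    "Momentum": ["iRSI", "iStochastic", "iCCI", "iMomentum"],
    "VolRange": ["iATR", "iBands", "iStdDev"],
}

def label_families(mq4_path):
    src = Path(mq4_path).read_text(errors="ignore")
    found = [fam for fam, funcs in FAMILIES.items()
             if any(re.search(rf"\b{fn}\s*\(", src) for fn in funcs)]
    return "+".join(found) if found else "Unclassified"

def rename_by_family(mq4_path):
    p = Path(mq4_path)
    # e.g. "MyEA.mq4" -> "Trend+Momentum_MyEA.mq4"
    p.rename(p.with_name(f"{label_families(p)}_{p.name}"))
```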

The objective is simple: transform diagnostic insight into reproducible generation logic — producing new EAs that structurally match what worked in the winners, not just statistically look good in backtests.

If anyone is interested in the EA Studio settings, I’ll be more than happy to share the full details.

All the best
Vincenzo