Re: Proof that too much historical data is harmful...

I've had a new experience today. 

I have a strategy on my account which looked great in FSB.
Since the live performance was bad, I first thought the timeframe for optimization was wrong, so I modified it and thought it should work now.

But to my surprise the performance got even worse.
And I think that I've experienced over-fitting.
The problem is that all strategies are generated/optimized in a similar way: very often a strategy wins many consecutive times at the beginning of the data horizon, which creates a buffer to survive the coming drawdowns.
But the starting point of a strategy going live cannot be optimized or simulated, because it's in the present.

With this particular strategy I had a profit factor above 3, and yet it is in fact my worst-performing strategy.

Today I've experimented a bit with the data horizon and set it to just one or two months (e.g. September to November).
And to my surprise, even though the original settings were optimized with 12 months of data, the result was totally different.
It was really, really bad, nothing compared to the 12 months range I had before.

Then I checked my better-performing strategies the same way and, voilà, their results were much better.

I know that such a short data horizon is not statistically significant.
But I will use it as an additional tool that will hopefully reduce the risk of over-fitting, which is quite hard to detect.
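
A minimal sketch of this check in Python, assuming you can export a strategy's closed trades as (close date, profit) pairs. The trade list below is made up for illustration and the function names are hypothetical, not part of FSB. It compares the profit factor over the full 12 months with the profit factor over a short recent window:

```python
from datetime import date

def profit_factor(profits):
    """Gross profit divided by gross loss; inf if there are wins but no losses."""
    gross_profit = sum(p for p in profits if p > 0)
    gross_loss = -sum(p for p in profits if p < 0)
    if gross_loss == 0:
        return float("inf") if gross_profit > 0 else 0.0
    return gross_profit / gross_loss

def profit_factor_in_window(trades, start, end):
    """Profit factor of the trades closed in the half-open range [start, end)."""
    return profit_factor([p for d, p in trades if start <= d < end])

# Hypothetical trades: (close date, profit in account currency).
# Note how the wins cluster at the start of the horizon.
trades = [
    (date(2018, 1, 10), 120.0),
    (date(2018, 2, 5), 150.0),
    (date(2018, 3, 12), 90.0),
    (date(2018, 5, 20), -40.0),
    (date(2018, 7, 3), 60.0),
    (date(2018, 9, 14), -50.0),
    (date(2018, 10, 2), 20.0),
    (date(2018, 10, 25), -30.0),
]

full = profit_factor_in_window(trades, date(2018, 1, 1), date(2019, 1, 1))
recent = profit_factor_in_window(trades, date(2018, 9, 1), date(2018, 12, 1))
print(f"12 months: {full:.2f}, Sep-Nov: {recent:.2f}")
# prints "12 months: 3.67, Sep-Nov: 0.25"
```

With only three trades in the short window the number is noisy, so a large gap between the two values is a warning sign rather than proof.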

Re: Proof that too much historical data is harmful...

I try to keep the number of oscillators to 2 at most.
You may find that if you insist on a minimum of 100 trades, or perhaps 150, you get a more consistent result.
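
As a rough illustration, such a filter could look like this in Python. The strategy names and trade counts are invented, and `passes_trade_count` is a hypothetical helper, not an EA Studio setting:

```python
MIN_TRADES = 100  # the threshold mentioned above; 150 is the stricter variant

def passes_trade_count(trade_count, min_trades=MIN_TRADES):
    """True if the backtest contains at least min_trades trades."""
    return trade_count >= min_trades

# Hypothetical backtest results: strategy name -> number of trades.
candidates = {"strategy_a": 42, "strategy_b": 118, "strategy_c": 163}

kept = [name for name, n in candidates.items() if passes_trade_count(n)]
kept_strict = [name for name, n in candidates.items() if passes_trade_count(n, 150)]
print(kept)         # prints ['strategy_b', 'strategy_c']
print(kept_strict)  # prints ['strategy_c']
```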

My 'secret' goal is to push EA Studio until I can net 3000 pips per day....

28 (edited by Lagoons 2018-11-19 22:36:04)

Re: Proof that too much historical data is harmful...

I couldn't really verify whether the trade count helps me in this matter.

Two things I've noticed: more acceptance criteria don't make a strategy more robust; they even seem to increase the risk of curve-fitting. And somehow strategies for EURUSD seem to have a higher risk of curve-fitting; I don't know why.


I'm reverting to my earlier approach of backtesting.
I'll generate strategies with a certain data horizon and expand it afterwards to see how the strategies would have performed.
I may even try OOS (out-of-sample) testing; I haven't decided yet.
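
A minimal sketch of that comparison in Python, assuming the backtest over the expanded horizon can be exported as (close date, profit) pairs. The trades and the helper names are hypothetical. Trades before the cutoff fall inside the horizon the strategy was generated on; trades on or after it are effectively out of sample:

```python
from datetime import date

def split_by_date(trades, cutoff):
    """Split (close_date, profit) trades into before-cutoff and on/after-cutoff lists."""
    in_sample = [t for t in trades if t[0] < cutoff]
    out_of_sample = [t for t in trades if t[0] >= cutoff]
    return in_sample, out_of_sample

def average_profit(trades):
    """Mean profit per trade; 0.0 for an empty list."""
    return sum(p for _, p in trades) / len(trades) if trades else 0.0

# Hypothetical backtest results over the expanded horizon.
trades = [
    (date(2018, 2, 1), 80.0),
    (date(2018, 4, 9), 50.0),
    (date(2018, 6, 15), -20.0),
    (date(2018, 8, 30), 70.0),
    (date(2018, 10, 5), -60.0),
    (date(2018, 11, 12), 10.0),
]

cutoff = date(2018, 9, 1)  # assumed end of the horizon used for generation
in_sample, out_of_sample = split_by_date(trades, cutoff)
print(f"in-sample avg: {average_profit(in_sample):.1f}")          # prints 45.0
print(f"out-of-sample avg: {average_profit(out_of_sample):.1f}")  # prints -25.0
```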

I think (over) curve-fitting is the biggest risk, and for me the most important topic, because it's really deceptive and could even be dangerous.

I've seen many strategies with good or even great stats whose performance dropped like a stone as soon as they left their initial data horizon.