Topic: Building EAs: Broad Data vs. Short Data - A Practical, Real-World Perspective

There are generally two different approaches I see when it comes to building strategies:
1) Building on broader datasets
2) Building on shorter, more recent datasets

I want to break both down in a realistic way, because both approaches exist, and both can work depending on how they are used.

First of all, I want to be clear that this is not about saying one is “wrong”. There are definitely skilled builders who can make shorter datasets work in live trading. But the behavior and implications of both approaches are very different.

Building on broader datasets
When you build on a longer historical range, the strategy is exposed to more market regimes, more conditions, and more variation.
Every candle represents past behavior of the market, including different trends, ranges, volatility phases, and reactions to events.
Because of that, the strategy is forced to adapt to a wider range of conditions during the build process.
In my experience, this tends to produce strategies that are more stable and more consistent over time.

Another advantage is that you usually don’t need to constantly replace them. They are built with more “experience”, so to speak.
The downside is that you might get fewer strategies passing your filters, and the process can feel slower.
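The "more regimes" point can be made concrete with a toy sketch. This is my own illustration, not an EA Studio feature: the regime definition (rolling daily range relative to the long-run average) and the thresholds are arbitrary choices, picked only to show that a longer build window tends to cover more distinct conditions.

```python
def regimes_covered(daily_ranges, window, low=0.5, high=1.5):
    """Label each day as low/normal/high volatility relative to the
    long-run average daily range, then report which distinct labels
    appear in the most recent `window` days.

    The thresholds (50% / 150% of average) are illustrative only.
    """
    avg = sum(daily_ranges) / len(daily_ranges)
    labels = set()
    for r in daily_ranges[-window:]:
        if r < low * avg:
            labels.add("low")
        elif r > high * avg:
            labels.add("high")
        else:
            labels.add("normal")
    return labels

# Toy data: a long calm stretch followed by a volatile one.
ranges = [10] * 500 + [40] * 500

print(regimes_covered(ranges, 400))   # only the recent, volatile regime
print(regimes_covered(ranges, 1000))  # both regimes
```

A strategy built only on the last 400 days of this toy series would never have "seen" the calm regime at all, which is exactly the exposure gap the paragraph above describes.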

You can also look at it from a different perspective.
Imagine you are choosing a trader to manage your capital.
Would you choose someone who has been trading for 2–3 years, or someone who has been consistently profitable for 15–20 years?
Most people would naturally lean towards the trader with more experience, because they have already gone through more market conditions and proved themselves over time.

It’s a similar idea when building strategies.
A strategy that is built on a broader dataset has effectively “seen” more of the market before being tested further.
Of course, just like with traders, past performance is never a guarantee. Time will always be the final test.
But more exposure during the build process generally gives a stronger foundation.

Building on shorter datasets
When you build on a shorter range (for example a few years), the strategy is exposed to much less variation.
That means fewer market regimes, fewer structural changes, and less overall information during the build process.
Even if you reserve an out-of-sample (OOS) portion inside that range, the total amount of information is still limited.
Because of that, strategies can look very good in backtests, but they are often more sensitive when new conditions appear, especially when the market behaves in ways that were not present in the build data.
This approach can work, but it usually requires more rotation, more monitoring, and more frequent replacement of strategies.

Another consequence of this approach is that constant rotation can sometimes lead to throwing away strategies that were not actually broken, but simply going through a normal drawdown or stagnation phase.
When the build-and-replace cycle is too frequent, it becomes harder to distinguish between a genuinely weak strategy and one that just needs more time.


You can take the idea even one step further.
Even when people think they are using “enough data”, in reality the effective build data is often much smaller than they realize.

For example:
If you build a strategy on 8–10 years of data while using 50% out-of-sample, then only half of that data is actually used for building and optimization.
So in practice, you are not building on 10 years.
You are building on 4–5 years of effective in-sample data.
That is not a lot when you think about how complex and changing the market really is.
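The arithmetic above is simple, but worth writing down, because the gap between "data you have" and "data you actually build on" is easy to overlook:

```python
def effective_in_sample_years(total_years: float, oos_fraction: float) -> float:
    """Years of history actually available for building/optimization
    after reserving an out-of-sample share of the data."""
    return total_years * (1.0 - oos_fraction)

# The example from the text: 8-10 years of history with 50% OOS.
print(effective_in_sample_years(8, 0.5))   # 4.0
print(effective_in_sample_years(10, 0.5))  # 5.0

# The same 50% OOS applied to 20 years still leaves a 10-year build window.
print(effective_in_sample_years(20, 0.5))  # 10.0
```

The last line is the practical argument for broader datasets: the same validation discipline leaves a much deeper build window.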

It means the strategy is trained on a relatively limited window, and then validated on another limited window within the same overall period.
And while that can still work, it also means the strategy has not been exposed to a truly wide range of market conditions yet.
That is exactly why I prefer working with broader datasets.

Because even after applying out-of-sample, there is still enough depth and variation left in both the build phase and the validation phase.
It simply gives more information, more exposure, and a stronger foundation before the strategy ever reaches demo or live trading.


The goal of building strategies is always the same:
To extract something that survives outside the data it was built on.
And from that perspective, the amount of data matters a lot.
More data means more exposure.
More exposure means more information.
More information generally leads to more robust behavior.

The same principle applies later in the process as well.
You wouldn’t judge a strategy after just a few trades or a couple of weeks. You want enough data and enough observations before trusting it.
So it makes sense to apply that same logic during the build phase.
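The "enough observations" point can be quantified with a rough sketch. Assuming trade outcomes are roughly independent (a simplification: real trades often are not), the uncertainty around the average trade P/L shrinks with the square root of the number of trades. The trade series below is synthetic, generated purely for illustration:

```python
import math
import random

def mean_pnl_std_error(pnls):
    """Standard error of the average trade P/L, assuming roughly
    independent trade outcomes (a simplifying assumption)."""
    n = len(pnls)
    mean = sum(pnls) / n
    var = sum((x - mean) ** 2 for x in pnls) / (n - 1)
    return math.sqrt(var / n)

random.seed(7)
# Hypothetical trades: a small positive edge buried in much larger noise.
trades = [random.gauss(5, 50) for _ in range(250)]

errors = {n: mean_pnl_std_error(trades[:n]) for n in (10, 50, 250)}
for n, se in errors.items():
    print(f"n={n:3d}  std. error of mean P/L = {se:5.1f}")
```

With only 10 trades the error bar dwarfs the edge itself, which is why judging a strategy on a few trades, or building it on a thin slice of history, says very little.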


Both approaches can work, but they lead to very different styles.
One focuses more on long-term stability and consistency.

The other relies more on adaptation, rotation, and active management.

From my experience, the more data a strategy is exposed to during the build process, the more robust it tends to be when facing new conditions later on.
That does not guarantee success.
But it significantly increases the probability of building something that can actually survive real market conditions.

Most people think they are building robust systems.

In reality, many are just building on limited information without realizing it.

Re: Building EAs: Broad Data vs. Short Data - A Practical, Real-World Perspective

Note!

I don't see this post related to EA Studio and to the actual process of generating and testing strategies.

For example: "If you build a strategy on 8–10 years of data while using 50% out-of-sample ... That is not a lot ..."
does not consider the data timeframe, the actual data availability, or the data quality.

It sounds to me like autogenerated content.

Please advise me whether to delete this post.

Re: Building EAs: Broad Data vs. Short Data - A Practical, Real-World Perspective

Popov wrote:

Note!

I don't see this post related to EA Studio and to the actual process of generating and testing strategies.

For example: "If you build a strategy on 8–10 years of data while using 50% out-of-sample ... That is not a lot ..."
does not consider the data timeframe, the actual data availability, or the data quality.

It sounds to me like autogenerated content.

Please advise me whether to delete this post.


Hi Popov,

I understand your concern.

However, this post is directly related to EA Studio, because everything in EA Studio is fundamentally based on data. Without sufficient data depth, there is no meaningful generation, no reliable statistics, and no robust strategies. So this topic is not separate from EA Studio; it is at the core of how strategies are built and evaluated within it.

My point about effective in-sample data was not to ignore timeframe or data quality, but to highlight how limited exposure can affect robustness, even when using standard settings like 50% OOS.

Also, the ideas shared here are not theoretical. In my previous posts, I included multiple EA Studio results, live-tested systems, and real examples to support the workflow I am describing. So this is based on practical experience, not autogenerated content.

I’m simply trying to contribute from a real-world perspective on how to build more robust strategies using EA Studio.

I share this for other users:

if they see the value in it, they can use it

if they don't, that is completely fine as well.

Thanks.