Topic: Using Synthetic DataHorizons For Generating Strategies

Hannahis had an interesting post earlier today -- here's the link:
https://forexsb.com/forum/post/45900/#p45900

She described why she thinks that EAs sometimes behave differently (i.e. worse) than what we would expect based on their backtesting results and statistics.  Her main point, I believe, is that there is nothing wrong with the EA.  Rather, EAs trade better or worse in certain markets -- such as ranging or breakout.  As a result, if your EA was generated and optimized (i.e. curve-fitted) using ranging data then it may perform poorly when it encounters breakout data -- or vice versa.  Her "take home" message was not to lose heart -- this is just the way Forex works.  The best we can do is to be aware of the current market trend and which of our EAs perform best in the current market.

And this gave me an idea -- please feel free to shoot it down...

Suppose I normally trade EURUSD H1 and use the latest 12 months of data from my broker by downloading a *.csv file -- e.g. EURUSD60.csv.  Why can't I create a "synthetic" *.csv that includes 6 months of ranging data plus 6 months of breakout data and use that as input for the generator?  I mean, *.csv files are easy to edit using a Text Editor.
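For what it's worth, stitching segments into one synthetic file is only a few lines of script.  Here is a rough sketch in Python (the same thing is easy in C#) -- it assumes a plain "timestamp,open,high,low,close,volume" layout, so the parsing may need adjusting to match your broker's actual export:

```python
import csv
from datetime import datetime, timedelta

def stitch_segments(segment_files, out_file,
                    start="2010-01-04 00:00", step_minutes=60):
    """Concatenate several H1 *.csv segments into one synthetic file,
    rewriting timestamps so the series is continuous.  Assumes each row
    is 'timestamp,open,high,low,close,volume' -- adjust to your broker's
    actual column layout."""
    t = datetime.strptime(start, "%Y-%m-%d %H:%M")
    step = timedelta(minutes=step_minutes)
    with open(out_file, "w", newline="") as out:
        writer = csv.writer(out)
        for path in segment_files:
            with open(path, newline="") as f:
                for row in csv.reader(f):
                    if not row:
                        continue  # skip blank lines
                    # keep the OHLCV values, replace only the timestamp
                    writer.writerow([t.strftime("%Y-%m-%d %H:%M")] + row[1:])
                    t += step
```

The timestamps are rewritten from an arbitrary start date so the result is one continuous series; as discussed below, the original dates shouldn't matter to the generator.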

Or -- suppose I like Multi-Market testing (which I don't).  With MM testing we apply a filter after the fact -- that is, we have a collection of 100 strategies and by the time we apply MM filtering we end up with 2 (since MM is so strict).  But suppose I create a EURUSD60.csv file that includes 6 months of EURUSD data plus 6 months of USDCHF data and use that as input for the generator.

By creating "synthetic" DataHorizons, the strategies that are generated are curve-fitted and optimized against a more varied data set.  In other words, they have more experience, since they've been stress-tested against different types of data -- multiple trends, multiple markets, whatever.  This is much different than applying filters afterwards.

When strategies trade in a real account they don't know where the data comes from -- they are just reacting to patterns of numbers.  I'm not 100% sure and I'd like to hear comments to the contrary -- I don't think there is any harm in mixing data from different markets or from different periods.  We all understand that past data is no indication of future data -- so the data we add to a synthetic DataHorizon file is just as valid as the data it replaced.

Again, the benefit of using a synthetic DataHorizon file is we would then have some control over the trends (and markets) that are used when generating strategies in the first place -- with the hope that the strategies that survive are better suited to what they will encounter when placed in a live account.

Re: Using Synthetic DataHorizons For Generating Strategies

Syndata is not a new idea -- some years ago, maybe even more than 5 years ago, there was a similar effort in this forum.  I'll have to dig deeper to find that topic.  To my recollection it was not a particular success (I'm talking about combining data from multiple pairs into one file), but it doesn't mean it's a dead idea.

Having data "chopped" by taking into account the data itself is a very interesting idea!  I tried to get this thought across in the other thread, but rather cryptically I guess.  I gather we all see, when dealing with multiple collections, that the consistency is not there, and one of the reasons has to be the arbitrary data horizon selection process.  If we use moving windows which are equal in length, or which follow an exact distribution in training and test data (75/25 etc), we essentially get "varying" data.  It can be defined in many manners.  And, even more importantly, given the fractal nature of the data, it can be defined and looked at on a different level or depth.  I have been thinking about this for some time and hope to start testing this idea soon.  Hopefully we can share our findings?

A few more thoughts.  When we talk about market change ("my strat failed because the market changed its behaviour"), is it really so?  One could dig up a chart from 2000 and another one from 2017 and look at them to note the differences.  They look alike, don't they?  Given this observation I state that it is actually quantifiable whether there is this so-called market change and what it is about.  And if there is one, it can be exploited.  To widen this thought -- our biggest advantage is having the data itself!  It is literally packed with information; the only "problem" is to ask the right questions.  And it will take a whole lot of work to get the answers.

Re: Using Synthetic DataHorizons For Generating Strategies

What does Footon say is very reasonable.

If there is another reason for a strategy failure, we have to try to find it.  Eventually we can develop criteria for detecting it and stopping the strategy early.

If the "market changes", we have to know what exactly changes -- volumes, range...  On the other hand, we already have indicators for measuring all of the market's aspects.

4 (edited by sleytus 2017-08-26 12:41:25)

Re: Using Synthetic DataHorizons For Generating Strategies

This is really interesting to me.  I'm still formulating some thoughts -- so the post below is probably more rambling than usual.

I've read elsewhere it is common practice to "refresh" a strategy from time-to-time -- and I assume that means importing it back into FSBPro, adjusting the DataHorizon to use more recent data, and then re-optimizing the indicator settings (i.e. curve-fitting to more recent data).  What I don't understand is how a strategy could go bad -- it's just an equation.  And then after reading Hannah's post things sort of clicked -- the EA doesn't go bad, it's the input data that has changed.

I had a couple of questions I'd like to ask to see if we are on the same page...
1. When we apply optimization to get the best indicator settings for a particular DataHorizon, then that is equivalent to curve-fitting, right?  And, if so, then why does curve-fitting have such a bad name?
2. If a strategy is optimized (or curve-fitted) using EURUSD data, then why would we want to use Multi-Market to judge whether it is good enough to trade?  That seems unfair -- sort of like teaching someone how to speak Spanish but then giving them a test in French and failing them if they don't pass.

Footon -- you bring up an interesting issue of whether or not changes in input data are quantifiable and, if so, whether they could be exploited.  As you point out this would be very difficult, but by asking the right questions it might be possible to eke out certain patterns or events that provide a clue as to the type of market we are in, and then use those clues either to turn certain EAs on or off, or to help an EA better navigate the incoming data.

Popov -- I like the idea of using indicators to monitor various aspects of a changing market.  I wonder how these could be used by an EA to adapt to changing data.

I've written a simple program that allows me to merge 2 or more *.csv files into one (e.g. H1 EURUSD and USDCHF from the same DataHorizon) and normalize the prices to take into account the difference between EURUSD and USDCHF.   In the short term I plan to take a few strategies and optimize them against EURUSD, USDCHF and the hybrid EURUSD-USDCHF -- and then compare them in MM.  I'm curious whether a strategy optimized against the hybrid data yields better statistics than the same strategy optimized against just EURUSD or USDCHF.  Does that make sense?

Re: Using Synthetic DataHorizons For Generating Strategies

Hi,
I think it is a great idea to test strategies with synthetic data.  I heard an interview saying that some quants in Asia are creating strategies with this idea because there is not enough real data smile  Look for "Chat with traders" on YouTube...  Please share your experiment results.  You can easily set aside a chunk of unseen data and test whether this idea has a positive effect compared with a normally optimized strategy.

Some thoughts about curve fitting.  When you are testing/optimizing the same strategy over and over on the same data, let's say 500 times, and choosing the one best variant, you are ignoring the other 499 not-so-good or very bad ones.  Just imagine what the probability is that real-time unseen data will fit your model exactly like your chosen 1-in-500 best variant.  So if you choose from only 50 variants, you probably get better chances smile ...  From what I've read, there is a slim line between under- and over-optimization.  Some books allow a maximum of only 2 optimizable variables.
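This selection effect is easy to demonstrate with a quick simulation.  The sketch below is my own illustration (not from any book): every variant has zero real edge -- each trade is a coin flip for +1 or -1 -- and yet the best of 500 backtests still looks like a winner:

```python
import random

def best_of_n_backtest(n_variants, n_trades=200, seed=1):
    """Monte Carlo sketch of selection bias: simulate n_variants
    strategies with NO edge, keep the best in-sample result, then draw
    one fresh out-of-sample run.  The in-sample 'best' looks great
    purely by luck; the luck does not carry over."""
    rng = random.Random(seed)
    def pnl():
        # net result of n_trades coin-flip trades
        return sum(rng.choice((1, -1)) for _ in range(n_trades))
    in_sample = [pnl() for _ in range(n_variants)]
    best_in_sample = max(in_sample)
    out_of_sample = pnl()  # fresh data: the edge was never real
    return best_in_sample, out_of_sample
```

Running it with 500 variants typically shows a strongly positive in-sample "best" and an out-of-sample result near zero -- exactly why picking the 1-in-500 winner invites disappointment, and why choosing from fewer variants inflates the backtest less.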

Also I have to say something about changing markets.  It is called regime analysis.  Markets can trend and sit in ranges.  And they are always doing a bit of both.  It is a great idea to turn off trend-following systems if the market starts to sit in a range, and vice versa.  The problem is that regime detection has a lagging nature -- you can only know after the fact what the market is doing.  I found that people use some indicators, mostly the same ones we have in FSB as filters for entries.  Another interesting thought I found is that it is not productive to worry about whether markets are ranging or trending -- good strategies should survive both market regimes.  My few experiments with regime filters were not successful; I tried looking for higher-timeframe trending/ranging conditions.
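One concrete way to put a number on trend vs range -- not something from this thread, but the standard Kaufman efficiency ratio -- is to compare the net move over a window with the total bar-to-bar path length:

```python
def efficiency_ratio(closes, window=10):
    """Kaufman's efficiency ratio over the last `window` bars:
    |net move| / sum of |bar-to-bar moves|.  Close to 1.0 means price
    went somewhere in a straight line (trend); close to 0.0 means it
    churned in place (range).  Note the lag discussed above: this only
    describes the window that just closed, not the bars to come."""
    seg = closes[-(window + 1):]
    net = abs(seg[-1] - seg[0])
    path = sum(abs(b - a) for a, b in zip(seg, seg[1:]))
    return net / path if path else 0.0

def regime(closes, window=10, trend_threshold=0.6):
    # the 0.6 threshold is an arbitrary illustration value, not a tested setting
    return "trend" if efficiency_ratio(closes, window) >= trend_threshold else "range"
```

A filter like this could be used to switch trend-following systems off in a range -- with the caveat above that the classification is always after the fact.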

6 (edited by qattack 2017-08-26 16:27:37)

Re: Using Synthetic DataHorizons For Generating Strategies

Brilliant idea!

This is probably the most promising idea on the forum as far as generation goes.  I imagine we can brainstorm many opposing components that cause our EAs to "average out" into muddled values, separate them into meaningful groups of data, and then recombine them as a whole to run through generation.

When recombining the elements, I don't think it would hurt to include certain data history more than once if it is "important" within two separate components. (?) Does this make sense? (Sorry, I've had little sleep lately)

This is yet another type of filtering and this is the type of thinking that we need to make a revolutionary step.

Re: Using Synthetic DataHorizons For Generating Strategies

sleytus wrote:

I had a couple of questions I'd like to ask to see if we are on the same page...
1. When we apply optimization to get the best indicator settings for a particular DataHorizon, then that is equivalent to curve-fitting, right?  And, if so, then why does curve-fitting have such a bad name?
2. If a strategy is optimized (or curve-fitted) using EURUSD data, then why would we want to use Multi-Market to judge whether it is good enough to trade?  That seems unfair -- sort of like teaching someone how to speak Spanish but then giving them a test in French and failing them if they don't pass.

1. Irmantas gave a great explanation.
2. I agree with you here, though I'm still open to this. I don't use Multi-Market validation, but it's very possible that certain markets are so similar that they act in the same way the majority of the time. The idea is that this essentially gives you a whole new set of historical data to test whether or not your strategy is curve-fit.

Re: Using Synthetic DataHorizons For Generating Strategies

Irmantas wrote:

The problem is that regime detection has a lagging nature -- you can only know after the fact what the market is doing.  I found that people use some indicators, mostly the same ones we have in FSB as filters for entries.  Another interesting thought I found is that it is not productive to worry about whether markets are ranging or trending -- good strategies should survive both market regimes.  My few experiments with regime filters were not successful; I tried looking for higher-timeframe trending/ranging conditions.

Indicators, by their very nature, lag behind the market.  It seems logical to me that if indicators do indeed work as intended, then identifying trending/ranging markets (even though our information is lagging) can only boost our efforts significantly.

Re: Using Synthetic DataHorizons For Generating Strategies

Found the old topic - https://forexsb.com/forum/topic/3060/ge … -one-pair/

Re: Using Synthetic DataHorizons For Generating Strategies

footon wrote:

Found the old topic - https://forexsb.com/forum/topic/3060/ge … -one-pair/

Thanks for digging this up -- interesting reading.  In the end it seemed like that thread just died -- it never came to any conclusion.  I also noticed that they really got hung up on the details of how to "stitch" together the data -- like worrying about leap years -- seriously?

Stitching together data from multiple *.csv files is relatively straightforward, but I did have to come up with a way to make the date and timestamps continuous -- otherwise FSBPro wasn't happy.  C# has a few DateTime methods that make that easy.  I didn't worry about using the original timestamps -- I started at 2010 and just worked forward.

I stitched together data from EURUSD and USDCHF.  At the transition point (i.e. at the stitch) there is a HUGE drop.  Interestingly, some of my strategies handled that gracefully, while others dropped off the planet.  I don't think it is necessarily important to have a smooth transition -- it is only one point -- but then I found a way to make the transition smooth by simply taking the "delta" between the last EURUSD and first USDCHF and adding it to all the USDCHF prices.
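In code, the delta trick is essentially a one-liner.  A sketch (working on a single price column; for a real OHLC file the same delta would be applied to the open, high, low and close of every USDCHF bar):

```python
def splice_with_delta(left, right):
    """Shift the second price series so its first value lines up with
    the last value of the first series -- the 'delta' described above.
    This removes the huge drop at the stitch point."""
    delta = left[-1] - right[0]
    return left + [p + delta for p in right]
```

With EURUSD ending around 1.19 and USDCHF starting near its own level, every USDCHF price gets the same constant shift, so the shape of the second segment (which is all the strategy reacts to) is unchanged.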

For now I'm just curious to see whether a strategy optimized using hybrid EURUSD-USDCHF data yields better Multi-Market results than a strategy optimized using only EURUSD data.  I'll let you know what I find.

Re: Using Synthetic DataHorizons For Generating Strategies

Irmantas wrote:

So if you choose from only 50 variants, you probably get better chances

Yes -- I think that something along those lines makes a lot of sense.  Hannahis had a recent post where she described her procedure of "building up" a strategy and saving intermediate results along the way.  I've been doing something similar, but always throwing away intermediate results and only keeping the best one.  But now I'm saving intermediate results along the way -- similar to your 50-variants idea, since it could be that some of those will perform better than the most highly optimized one.  I haven't gotten very far in live testing them -- so I can't yet say whether this approach will work better for me.  But it does seem like a reasonable approach.

Also, thanks for the information about changing markets and "regime analysis".  I'd never heard about that before.  I'm interested to learn more and will do some Googling.

12 (edited by qattack 2017-08-26 22:23:32)

Re: Using Synthetic DataHorizons For Generating Strategies

So if I understand correctly, if you use 50 variants, you would patch together data from all 50 variants? (EDIT: Sorry, I'm thinking about this all wrong; running on little sleep this week!)

To avoid "splice points", you could simply add or subtract a value to the entire data segment you are splicing in to give a smooth transition, perhaps using the data point immediately prior to the splice as a reference.  In other words, if you are splicing EURUSD, and the last point of a data segment is 1.19 and the next starts at 1.41, simply subtract 0.22 from every bar in the next data segment.

EDIT: Yes, of course that's adding the delta value that sleytus has already referred to.  I'm glad one idea I had is valid today, even if it's already been espoused.

Re: Using Synthetic DataHorizons For Generating Strategies

qattack wrote:

So if I understand correctly, if you use 50 variants, you would patch together data from all 50 variants?

I don't think that is what he meant.  I believe he meant that instead of keeping just the one most highly optimized strategy, it may also make sense to keep some of the others that were generated along the way but which didn't make the final cut.


qattack wrote:

To avoid "splice points"

In my earlier post I referred to stitching together the transition from one data set to another by adding a "delta" value.  I think that is what you were also suggesting.

14 (edited by sleytus 2017-08-27 00:33:45)

Re: Using Synthetic DataHorizons For Generating Strategies

I have some preliminary results that I would like to share.  In general I'm not prone to hyperbole, but these results are pretty amazing.  Please keep in mind this was a single, quick and dirty experiment -- so, tread carefully.

I created two data sets -- the first one was the most recent 6 months of EURUSD H1 that was duplicated (a total of around 7600 bars).  The second data set was a hybrid of the most recent 6 months of EURUSD H1 plus the most recent 6 months of USDCHF H1.

Using FSBPro I cloned one of my strategies -- I'll call them clone A and clone B.  Clone A was optimized using the duplicated EURUSD data set, and clone B was optimized using the EURUSD-USDCHF hybrid data set.

I then used Multi-Market and tested clone A and clone B against both the duplicated EURUSD and hybrid EURUSD-USDCHF data sets.  I've attached an image that shows the results.

I'll cut to the chase -- the most striking result is that clone B's Multi-Market result is *much* better than clone A's.  Upon reflection this result is not unexpected -- but what it does show is that the input data used to optimize (curve-fit) a strategy can have a significant effect on the strategy's "robustness".   This is consistent with hannahis' earlier post that there is no such thing as a bad strategy -- I'm exaggerating, but that was sort of the take home message.

Another point to observe is the statistics for clone A are better than clone B.  So, even though clone B is more "robust", clone A may be more profitable -- at least in the short term while it is trading in an environment that it is comfortable in.  So, I'm not sure yet how exactly I would take advantage of this observation.  If you accept the premise there is no bad EA -- then perhaps the best (and simplest) approach would be to do a better job of managing which EAs are running since they are sensitive to the current data environment.  A simple management policy might be to temporarily disable a strategy for a couple of weeks if it loses two trades in a row -- and then after a couple of weeks turn it back on to give it another chance.  That is just a very simple example -- of course, more complex policies could be applied.
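The "two losses, then a time out" policy is easy to state precisely.  This little class is only my illustration of the bookkeeping involved -- it is not an FSB Pro feature, and the rules and defaults are just the simple example from above:

```python
from datetime import datetime, timedelta

class TwoLossTimeoutPolicy:
    """After `max_consecutive_losses` losing trades in a row, bench the
    EA for a cooling-off period, then let it trade again."""
    def __init__(self, max_consecutive_losses=2, timeout=timedelta(weeks=2)):
        self.max_losses = max_consecutive_losses
        self.timeout = timeout
        self.loss_streak = 0
        self.benched_until = None

    def record_trade(self, profit, now):
        # a winning trade resets the streak; enough losses start a time out
        self.loss_streak = self.loss_streak + 1 if profit < 0 else 0
        if self.loss_streak >= self.max_losses:
            self.benched_until = now + self.timeout
            self.loss_streak = 0

    def is_active(self, now):
        return self.benched_until is None or now >= self.benched_until
```

A portfolio manager would keep one such object per EA and simply skip signals from any EA whose `is_active` is false; more complex policies would only change the two rules inside.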

Tomorrow I'm leaving the country for a couple of weeks.  I should have access to the Internet from time-to-time -- but I may be slow to respond to questions or comments about these results.  When I return I plan to delve into this more deeply.

I also wanted to add, again, this is a testimony to Popov's amazing software.  Without his software there is no way we could experiment like this.

Post's attachments

Results-Using-Synthetic-Data.png 160.11 kb

15 (edited by hannahis 2017-08-27 07:37:20)

Re: Using Synthetic DataHorizons For Generating Strategies

Hi Steve,

Thanks for starting this topic, because it will lead to the "alternative optimization" method I've been asking for all these years!



1. We mostly think that the "problem" lies in either

a) not having "adequate data set" - so we solve it by using large historical data

b) not having adequate "varied" data set - so we solve it by using multi market data via MC


2. We assume we are looking at the "SAME" data when we are using the "SAME" period.

Let say we all have a historical data from 2016 to 2017 (1yr historical data) and we then assumed we are all using the "same" data.

But has it occurred to you that there is the possibility of 112 "data sets" from this same 1 year of data?

For example, when we use the MA crossover indicator, we need to choose 3 options: 1) Method of Fast MA, 2) Method of Slow MA and 3) Base Price, and out of these 3 options there are 112 combinations.

These 112 combinations represent 112 data sets for that same given historical period.
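The 112 figure is just the product of the option counts -- assuming FSB's usual lists of 4 MA smoothing methods and 7 base prices (the names below are those typical lists, not taken from this thread), 4 x 4 x 7 = 112:

```python
from itertools import product

# typical option lists, assumed here for illustration
MA_METHODS = ["Simple", "Weighted", "Exponential", "Smoothed"]
BASE_PRICES = ["Open", "High", "Low", "Close", "Median", "Typical", "Weighted"]

# fast MA method x slow MA method x base price
combinations = list(product(MA_METHODS, MA_METHODS, BASE_PRICES))
print(len(combinations))  # 4 * 4 * 7 = 112
```

Each tuple in `combinations` is one of the 112 variants of the same MA crossover rule.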

It is not the length of the data set that will help you find the "right" equation for your EA.  If your EA's equation is "faulty/unstable" right at the start, no matter how much data you subject it to, it will always be "unstable".

In my earlier post in my own "blog", Hannah's Trade and Management Tips, I mentioned that we need to "introduce" more stability to our EA.  By using certain combinations, we can increase our EA's "stability": https://forexsb.com/forum/post/45589/#p45589

In order to provide statistical proof of my "theory", I conducted this experiment.  I created 112 EAs with the same opening conditions, one for each combination.  My point is to prove that if one happened to choose the wrong combination, it would result in a failed EA (even if you use the "correct" parameters), but if one happened to choose the right combination, the EA is going to be very profitable.

We often thought we need to optimize our EA to find the correct parameters.  WRONG!!!! -- we need to optimize our EA to find the correct combinations.  (Of course the "correct" parameters do play a part, but after you've got the "right" ones, you can still be far away from finding a profitable EA if the combination you use is wrong.)

Click this link  https://www.fxblue.com/users/1111728889 … 2#overview and go to the "strategy" tab and you will see that with these 112 combinations/EAs, in just 1 week's trading, we got very extreme trading results.  The best EA made $10,950 profit and the worst EA lost $14,755 (that's a $25,705 difference, just by choosing the correct combination).

So imagine, if you happened to choose the losing EA's combination and started using this equation, subjecting it to various historical periods and MC testing, do you think you are likely to turn such a losing EA into a winning one?  You've already started on the wrong "foot".

Wouldn't you have a better chance of improving the EA if you had chosen the EA combination that made $10,950 in the 1st place?  Which one would you choose to improve further?

So the question is not about how much historical data to use; interchanging data sets from one market to another, or changing the market conditions, won't change your equation's behaviour.  If the combination you use isn't going to produce any "robustness" in your equation, subjecting it to further testing isn't going to add any stability to it.


That's why I've been asking for an alternative "optimization" method.  Before I run my EA through various data sets, I 1st want to make sure I'm using the most optimal combination to begin with.  If you look at the list of EAs from the web link I posted above, which EA would you pick to run through all your data testing?  Would you pick the 1st and best combination EA that made $10,950 profit, or the last and worst combination EA that lost $14,755?

Now with this "new and proposed" optimization tool, we would begin by finding the best combination to start with and thereafter subject our EA to all the other data testing we want.  At least we start on the right footing.

Hannah

Re: Using Synthetic DataHorizons For Generating Strategies

Hannah,

I'm not sure the result I reported necessarily supports "alternative optimization".  Using the example you give of the MA crossover indicator -- yes, there are 3 additional options that each have more settings.  What you would like is for FSB Pro to also take these additional 112 combinations into account when it performs its optimization / curve-fitting / training -- whatever you wish to call it.  Fine -- I have no qualm with that -- but that is not the point of this thread.

I'll make a statement (based on something I learned from you) and it is an exaggeration -- but only to make a point:
"There is no such thing as a bad EA."

Every strategy that Popov's software generates has been optimized, curve-fitted, trained using a particular data set.  Actually, currently I think "trained" is a good description.  It will trade well when it is placed in a data environment similar to the one it trained for -- let's refer to that as its "comfort zone".  When it is forced to process data it wasn't trained for, then it will fail.  An analogy -- you train to speak French and in France you do perfectly fine.  And then you find yourself in Bulgaria -- and it doesn't go so well since you don't speak Bulgarian.  That doesn't mean there is anything wrong with you and you will do fine again once you return to France.  That's what my results would seem to indicate.

The way I see it now -- nothing needs to change in the way strategies are generated.  What I've learned from this simple experiment (and also reading your earlier post and taking into account some of the trading experiences you described) is that we should perhaps pay more attention to the current data environment.  You are way more familiar than I when it comes to identifying trends -- ranging, breakout, etc -- and you have a good idea as to which of your EAs perform better with different trends.  I'm not there yet.  But, I can do something almost as good.  And that is turn my EAs on / off based on some "policy" -- where a policy could be something as simple as "turn the EA off for two weeks if it loses two trades in a row, and then turn it back on".  Since we are now dealing with portfolios of many EAs then it means at any particular time some will be active and some will be in "time out".  Do you know what I mean?

17 (edited by hannahis 2017-08-27 09:32:45)

Re: Using Synthetic DataHorizons For Generating Strategies

Hi Steve,

I'm not trying to "hijack" your topic.  But provide a different perspective of using data set.

As for coming up with a "policy" to determine when to switch an EA on and off -- isn't that what the trading rules are all about?

We input our trading rules (using indicators) to specify that only under certain circumstances/market situations should the EA trade -- switch it on when these rules are "true", and if these trading rules are false, switch it off ("don't trade").

What I'm trying to emphasize is that even if we think we've got the "correct" rules (parameters) or "policy", if we don't select the correct set of combinations, these "policies"/trading rules can't be carried out in their most optimal manner.

I'm sorry if in any way I've offended you by "mis-directing" your topic.  I'm just trying to highlight an important point before we go off on a wild goose chase and miss the important concept of getting the combination right before all the other data "training" we want to subject our EA to.

Lastly, there is such a thing as a bad EA.  And even the best EA can't fit well in ALL market situations.  The point is to understand what kind of market situations our EA is built for.  And when an EA fails, there are many reasons -- one of them can be that it's fundamentally wrong in its trading rules (a bad EA), yet at the same time a fundamentally correct EA can also go "wrong" when we use the wrong combinations.

Hannah

Re: Using Synthetic DataHorizons For Generating Strategies

hannahis wrote:

it will lead to the "alternative optimization" method

Sorry -- another point occurred to me -- it may not go over so well.

I don't think an alternative optimization is necessary.  The FSB Pro generator creates millions of strategies by fiddling with various indicator settings.  Including the additional options you mention only increases the number of combinations that need to be checked -- it will not necessarily result in more or better EAs.  If there were a scarcity of good EAs then, yes, I would say it might be valuable to include additional combinations of indicator settings to see if more, good EAs could be generated.  But since there is no scarcity, then I don't see the benefit.

I will grant you there might be some good EAs that are missed because the generator doesn't take into account some of the options you mentioned -- but other EAs take their place. 

I am now of the belief that once you have an EA that exhibits good statistics then to get the most out of it we should take into account whether the current data environment matches with the one the EA trained for.

Re: Using Synthetic DataHorizons For Generating Strategies

Hannah,

I'm not offended at all.  I value your perspective and insight -- and the points you make are very relevant to the topic.  You bring a lot of experience to the table.

hannahis wrote:

As for coming up with a "policy" to determine when to switch an EA on and off -- isn't that what the trading rules are all about?

My answer -- "no".  I guess at some level you could say that everything is part of the trading rules, but I'm describing something different.  The "policy" I'm referring to operates at a higher level.  Indicators do the grunt work -- they are the ones getting their hands dirty, dealing with signals and mathematics.

The "policy" I'm referring to is at a higher level -- managing EAs.  A policy does not care about trading signals.  Rather, it takes a step back and has a broader view.  It assumes that if an EA is not doing well it may well be because it is currently out of its element (i.e. is not well-trained for the current data environment) -- so, give it a rest and re-enable it at a later time.  It's the current data environment that dictates which EAs will succeed and which will fail.  Since we can't look into the future, then I would claim the best we can do is to better manage our portfolio of EAs according to policies based on our preferences and/or observations.  Again -- a simple example of a "policy" would be -- "after two successive losses disable the EA, and then re-enable after two weeks have passed".  This is just a very simple example of how one might take into account the current data environment.  This is different than the world of "trading rules" where indicators operate.

Re: Using Synthetic DataHorizons For Generating Strategies

Sleytus, keep in mind that automated trading is not about getting a better-looking backtest -- it is about making money on unseen data.  So all the robustness we are looking for beats a single over-optimized system with better-looking statistics.

To me your results are as expected.  You get better results on the data you optimized against.  The real test would be to check on an unseen data chunk, or at least with different multi-market pairs.

1. Optimize on hybrid data -> test it on unseen EURUSD and unseen USDCHF
2. Optimize same strategy separately on EURUSD and USDCHF data -> test it on unseen data with the same pairs

If unseen data 1 beats 2 then you have success smile

Re: Using Synthetic DataHorizons For Generating Strategies

Here http://www.traderslaboratory.com/forums … g-rsi.html is some discussion and an example of how to use RSI as a regime filter.  Hope it helps.  Maybe we can try to make something similar smile

22 (edited by sleytus 2017-08-27 18:10:16)

Re: Using Synthetic DataHorizons For Generating Strategies

Irmantas wrote:

Here http://www.traderslaboratory.com/forums … g-rsi.html is some discussion and an example of how to use RSI as a regime filter

Thanks for the link -- it is interesting reading.  Though I'm not sure exactly how I would incorporate it.

Back to Synthetic Data -- the result I posted, though preliminary, has some very interesting implications.  I started with one of my better strategies and cloned it.  So, I began with two *identical* strategies.  I then optimized/curve-fitted/trained each of those against two different data sets.  The result was two different EAs -- one that was very robust, and one that was less robust but more profitable.

Take this a few steps further.  I'm going to create 100 different data sets -- stitching together data from different pairs, different trends, etc -- whatever I can think of.  And then I'll take my favorite strategy and make 100 *identical* clones.  And then I'll optimize/curve-fit/train each clone against a different data set.  In the end I will have a portfolio of 100 strategies -- each using the same combination of indicators -- but trained against a different data set.

Carefully designed data sets -- rather than arbitrary DataHorizons -- can be used to generate a portfolio of EAs starting from one (your favorite) EA.  Think about it...

Or -- another approach.  You have a library of 100 data sets.  Then you cycle through your EAs and pick the ones that do well against the most data sets.

No other software -- only Popov's FSBPro -- could allow me to even consider such a wild approach...

Re: Using Synthetic DataHorizons For Generating Strategies

Steve, are you going to demo your collection to get some results? I went the easier way and introduced an unseen dataset for testing: generated on 40k bars of synthetic data, then optimized, then tested on a "clean" 18k bars of one pair. Straight out of the generator the collection had 26% winners; the optimized collection had 36% winners. I think I'm going to leave the syndata idea as it is.

Re: Using Synthetic DataHorizons For Generating Strategies

footon wrote:

I went the easier way, introduced unseen dataset for testing.

I'm not sure we're talking about the same thing -- maybe we are.

I don't know what you mean by "unseen dataset".  Also, I don't understand the 26% versus 36% winners -- it's not about the number of winners.  I created a hybrid EURUSD+USDCHF data set, used it to optimize one of my favorite strategies, and then ran that optimized strategy through a MM test -- the results were very compelling.  Did you look at the images I attached a few posts back?  I don't know how else to explain them other than that the data set used to optimize / curve-fit / train (whatever one chooses to call it) can make a huge difference in the MM test.  If the MM test doesn't mean anything then, okay, this is not worth pursuing.  I haven't been a big fan of the MM test, mostly because so few strategies pass it with flying colors.  But if one can create more strategies that pass the MM test, I think it is worth pursuing.

I really haven't thought this through completely, so I'm not going to push it further.  The last thing I'll say is that I think the test described above demonstrates that the data set plays an important role in "training" a strategy -- at least in terms of its ability to survive the MM test.  I'll let others decide for themselves what the implications, if any, are.  And as to how one would exploit this to gain an "edge" -- I'm not sure.  I'm still experimenting...

Re: Using Synthetic DataHorizons For Generating Strategies

sleytus wrote:

I'm not sure we're talking about the same thing -- maybe we are.


Many of you guys and gals use extensive demo trading or nano live trading to check how generated collections fare.  That takes a lot of time, and after that the pruning process takes place, etc.  What I was saying is that I took my collection and backtested it on an unseen chunk of data -- an out-of-sample (OOS) dataset.  I developed the 100-strat collection on 40k bars, and my OOS data was 18k bars.  I skipped the demoing by using OOS data; that was my easier way.

So, my collection of 100 strats came together from thousands of strats during generation.  After that I re-optimized the collection, then uploaded my unseen dataset and recalculated the collection.  Essentially this gave me enough to assess its performance: the re-optimized collection produced 36 profitable strats out of 100, meaning 36% of the initial collection remained profitable over the next 18k bars.  It is a very simple method -- first generate a collection, then put it on a live trading account (or check it on OOS data to validate the method in general).  My questions were: would this method work, and does a synthetic dataset produce more profitable strats?  This little test gave me a "no" for both answers.  Furthermore, the results are pretty much in line with my observations so far; that's why I let this idea be as it is.
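For clarity, the winner-rate arithmetic behind that 36% is simply this (a hypothetical sketch; the profit figures are made up to mirror the 36-of-100 result):

```python
# After recalculating the collection on the unseen 18k bars, count
# the strategies that stayed profitable.

def oos_winner_rate(oos_profits):
    """Share of strategies with positive profit on out-of-sample data."""
    return sum(1 for p in oos_profits if p > 0) / len(oos_profits)

# 100 made-up recalculated profit figures: 36 positive, 64 negative.
oos_profits = [120.0] * 36 + [-80.0] * 64
rate = oos_winner_rate(oos_profits)  # 0.36 -> 36% winners
```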

Hope I made myself more clear.

My comments on MM, if you don't mind.  Every other trading book emphasizes that a good, robust trading strategy must work on different pairs and markets.  That's the idea behind MM -- to help find/validate strats that remain profitable in live trading.  My current research tells me that MM is not particularly useful for validation; in fact, it gives pretty similar results whether it is used or not.  I must say I haven't covered all the possibilities MM offers.  In other words, I'm not saying (yet?) that it is useless -- I just haven't found an approach where the use of MM gives a significant impact.

One more thing -- the end result matters!  That is, a profitable collection on a live account.  That's why I'll repeat myself once more: one should not fall in love with backtesting stats; it's the end result that matters.  A better pass through MM must result in a larger number of profitable strats as well -- I think you agree with me.  And I agree on the importance of data; that's my next task.