Topic: FSB: estimation period, validation period (OOS), and the future
A great way to test a trading strategy and to realistically assess its forecasting performance is out-of-sample validation: withhold part of the market data from the strategy generation process, then use the generated strategy to make predictions on the hold-out data in order to see how accurate they are and whether the strategy's statistics are similar to those it achieved on the data it was fitted to. In FSB Pro you can specify a percentage of data points to hold out for validation (Out of Sample in the generator panel). The data which are not held out are used to generate strategies that pass the acceptance criteria of the control panel. The strategy is then tested on the data in the validation period. Unfortunately, FSB Pro lacks the ability to make forecasts beyond the end of the estimation and validation periods, an ability known as "forecasting into the future", but this feature may be incorporated in a future release.
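Just to make the idea concrete, here is a rough Python sketch (not FSB Pro's actual code; the names `bars` and `oos_percent` are only illustrative) of how such a hold-out percentage divides the data into an estimation period and a validation period:

```python
# A minimal sketch of an "Out of Sample" hold-out split: the most recent
# bars are withheld for validation, the rest are used for generation.
def split_estimation_validation(bars, oos_percent):
    """Return (estimation, validation) slices of a chronological bar list."""
    holdout = int(len(bars) * oos_percent / 100.0)
    cut = len(bars) - holdout          # most recent bars are held out
    return bars[:cut], bars[cut:]

# Example: hold out 30% of 10,000 bars for validation.
bars = list(range(10_000))             # stand-in for real market data
estimation, validation = split_estimation_validation(bars, oos_percent=30)
print(len(estimation), len(validation))  # 7000 3000
```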
In general, the data in the estimation period are used to help select the best strategy and its parameters. Forecasts made in this period are not completely "credible" because data on both sides of each observation are used to help determine the best strategy. The one-step-ahead forecasts made in this period are called fitted values. (They are said to be "fitted" because the software estimates the strategy so as to fit them with as little error as possible, in a mean-squared-error sense or by a similar criterion.) The corresponding forecast errors are called residuals. The residual statistics may underestimate the magnitudes of the errors that will be made when the strategy is used live, because it is possible that the data have been over-fitted. Over-fitting is especially likely to occur when either (I) a strategy with a large number of parameters (and/or slots) has been fitted to a small sample of market data, and/or (II) the strategy has been selected from a large set of potential strategies precisely by minimizing the mean squared error in the estimation period (e.g., all possible subsets regression has been used with a large set of regressors). These are all aspects well known to experienced traders.
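To illustrate the terminology, here is a toy Python example (a simple moving-average forecaster standing in for a real strategy; entirely illustrative, not FSB code) of fitted values, residuals, and how the in-sample residual error can be compared with the error made on held-out data:

```python
# Toy one-step-ahead forecaster: predict each price from the mean of the
# previous `window` prices. Residuals are the in-sample forecast errors.
import statistics

def one_step_forecasts(prices, window):
    """Forecast each price from the mean of the previous `window` prices."""
    return [statistics.mean(prices[i - window:i]) for i in range(window, len(prices))]

def mse(forecasts, actuals):
    return statistics.mean((f - a) ** 2 for f, a in zip(forecasts, actuals))

prices = [1.10, 1.12, 1.11, 1.13, 1.15, 1.14, 1.16, 1.18, 1.17, 1.19]
est, val = prices[:7], prices[7:]          # estimation / validation periods

window = 3
fitted = one_step_forecasts(est, window)   # fitted values (in-sample)
residual_mse = mse(fitted, est[window:])   # residual error statistics

# One-step forecasts over the validation period, using only past data each time.
val_forecasts = one_step_forecasts(prices, window)[len(est) - window:]
validation_mse = mse(val_forecasts, val)

print(residual_mse, validation_mse)        # compare in-sample vs. hold-out error
```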
The data in the validation period are held out during strategy parameter estimation. One-step-ahead forecasts made in this period are what we call back-tests. Ideally, these are true forecasts, and their error statistics are representative of the errors that will be made in forecasting the future. However, if you test a great number of strategies and choose the one whose errors are smallest in the validation period, you may end up over-fitting the data in the validation period as well as in the estimation period.
So here are my first suggestions for FSB improvement:
A) The statistics of the strategy in the validation period (out-of-sample data) should be reported alongside the statistics of the strategy in the estimation period (in-sample data), so that we can compare them easily. Furthermore, these statistics should be usable as acceptance criteria during strategy generation. With this in place we could get really sound strategies.
                         Estimation Period   Validation Period
Profit per day           170                 145
Sharpe ratio             0.27                0.19
System quality number    2.18                1.17
Win/loss ratio           0.51                0.46
...
If the data have not been over-fitted too much, the statistics in the validation period should be similar to those in the estimation period, although they are often at least slightly worse.
Certainly, FSB can already compute these figures, but we have to calculate them separately, and obviously that is not practical across the thousands of generated strategies. A rough sketch of the kind of combined acceptance check I mean is shown below.
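This little Python sketch is purely illustrative; the thresholds, field names and divergence rule are my own assumptions, not FSB Pro's:

```python
# Illustrative only: compute the same statistics over the estimation and
# validation periods and accept a strategy only if BOTH periods pass.
from dataclasses import dataclass

@dataclass
class Stats:
    profit_per_day: float
    sharpe: float
    win_loss_ratio: float

def accept(est: Stats, val: Stats) -> bool:
    """Hypothetical acceptance criteria applied to both periods."""
    for s in (est, val):
        if s.profit_per_day < 100 or s.sharpe < 0.15 or s.win_loss_ratio < 0.45:
            return False
    # Optionally also require that the two periods do not diverge too much.
    return val.sharpe >= 0.5 * est.sharpe

est = Stats(profit_per_day=170, sharpe=0.27, win_loss_ratio=0.51)
val = Stats(profit_per_day=145, sharpe=0.19, win_loss_ratio=0.46)
print(accept(est, val))   # True: validation stats are a bit worse, but acceptable
```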
B) The linearity of the equity/balance curve should be an acceptance criterion too. At present, FSB can include a filter for non-linear balance patterns, which is a kind of deviation measure from a fictitious linear behaviour (a sort of linear regression line). This is very useful, because the naked eye can distinguish a strategy that performs equally well during the estimation and validation periods. The problem with the current feature is that FSB takes the linearity of the strategy into account in the estimation period but does not perform any test of this linearity in the validation period after the strategy has been generated. Thus we find that, after generation, the vast majority of strategies do not meet this linearity criterion and must be rejected manually, rather than automatically by the software. Obviously, the generator must not take the non-linearity of the validation period into account in any parameter calculation; it should only apply the non-linearity filter to the validation period at the end, before presenting the strategy to the trader. In other words, FSB should "look at" the final strategy instead of the trader having to.
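Again purely as an illustration, here is a small Python sketch of one possible linearity measure (maximum deviation of the balance curve from its least-squares line, normalized by the curve's range) applied to both periods; the metric and the threshold are my own assumptions, not FSB's actual filter:

```python
# Illustrative linearity filter: reject a strategy if either period's balance
# curve deviates too much from a straight (least-squares) line.
def linearity_deviation(balance):
    """Max relative deviation of the balance curve from its least-squares line."""
    n = len(balance)
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(balance) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, balance))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    span = max(balance) - min(balance) or 1.0
    return max(abs(y - (slope * x + intercept)) for x, y in zip(xs, balance)) / span

def passes_linearity(est_balance, val_balance, max_dev=0.25):
    # Apply the same filter to BOTH the estimation and validation segments.
    return (linearity_deviation(est_balance) <= max_dev and
            linearity_deviation(val_balance) <= max_dev)

est_balance = [100, 110, 118, 131, 140, 152, 161]   # fairly straight
val_balance = [161, 160, 180, 150, 200, 140, 210]   # erratic in the hold-out
print(passes_linearity(est_balance, val_balance))   # False: rejected automatically
```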
Here below is an example of what I mean. These are two strategies generated unattended (great for FSB!). The green area is the validation period (OOS); before the green area we have the estimation period.
The first may be good enough and ready for further refinement.
The second is simply a useless, over-fitted one.
Obviously this last strategy should be rejected by FSB directly in automatic mode.
Well, these are just some thoughts that may be of some assistance in the development of this really good software.
Thank you