Topic: Improvement Idea - Backtesting Samples : Cross Validation
While researching some Python functionality, I came across an interesting statistical method that seems like it might have an application in forex.
Currently, we have the option to create In Sample and Out of Sample sets of data. As we know, the IS data is used to train the model and the OOS data is used to validate it, and there is the option to split the dataset in two at a chosen percentage. Although this reduces over-fitting, it doesn't eliminate it entirely.
Here, I think K Fold Cross Validation might help.
This approach involves randomly dividing the set of observations into k groups, or folds, of approximately equal size. The first fold is treated as a validation set, and the method is fit on the remaining k − 1 folds.
You can see my attached graphic, but as an example, the data is split into 10 parts. In the first sweep, part 1 is held out for validation and parts 2-10 are used to train. In the second sweep, part 2 is held out for validation and parts 1 and 3-10 are used to train. This continues until every part has been used for validation exactly once. The average performance over all 10 sweeps is then computed.
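To make the sweeps concrete, here is a minimal Python sketch of the standard k-fold procedure: each fold is held out once for validation while the model is fit on the other k − 1 folds, and the per-fold scores are averaged. The names k_fold_splits, cross_validate, fit_mean and score_mae are my own placeholders, and the "model" (the mean of the training returns) and the sample returns are toy values for illustration, not a real strategy or real data.

```python
import random
from statistics import mean


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) index lists, one pair per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # random assignment to folds
    folds = [indices[i::k] for i in range(k)]   # k folds of near-equal size
    for i in range(k):
        val_idx = folds[i]                      # fold i is held out
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


def cross_validate(data, k, fit, score, seed=0):
    """Fit on k - 1 folds, score on the held-out fold, average over all k."""
    scores = []
    for train_idx, val_idx in k_fold_splits(len(data), k, seed):
        model = fit([data[j] for j in train_idx])
        scores.append(score(model, [data[j] for j in val_idx]))
    return mean(scores), scores


# Toy stand-ins (assumptions): the "model" is the mean training return,
# the score is the mean absolute error on the held-out fold.
def fit_mean(train):
    return mean(train)


def score_mae(model, val):
    return mean(abs(x - model) for x in val)


# Made-up per-period returns, for illustration only.
returns = [0.010, -0.020, 0.015, 0.000, -0.005, 0.020, -0.010, 0.005,
           0.012, -0.003, 0.007, -0.015, 0.004, 0.009, -0.008, 0.011,
           -0.001, 0.006, -0.012, 0.003]

avg_score, fold_scores = cross_validate(returns, 10, fit_mean, score_mae)
print(f"{len(fold_scores)} folds, average error {avg_score:.5f}")
```

One caveat: this sketch shuffles observations randomly, as in the textbook description. Price data is time-ordered, so random folds can let the model train on data that comes after the validation period; contiguous or walk-forward folds may be a better fit for backtesting.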
I think this would give a more reliable estimate of how the strategy performs on data it has not seen.
What do we think of this?