Hi Michael,
Yes -- what you say does make sense, and it very well could be a bug.
However, what I am trying to explain is that "trial-and-error" (i.e. empirical) calibration algorithms could very well have more going on under the covers than you suppose.
I've written calibration-type software that has to compare billions of combinations of numbers. If the application actually compared the same billions of combinations every time a user pressed "Recalculate", it would take a couple of years. So instead, each press of "Recalculate" tests a different subset of combinations.
Just like in real life, everything comes down to trade-offs. If you test all the billions of combinations each time, the upside is that you cover every possibility and get the exact same result each time, but the downside is that each calculation might take two years. If you test only a subset, the upside is that it completes quickly, but the downside is that you haven't tested every possibility. Those are the trade-offs.
One assumption we used to make is that if a user pressed the "Recalculate" button a second time, it often meant they weren't satisfied with the initial result, so it made sense to test a different subset of possibilities.
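To make the subset idea concrete, here is a minimal Python sketch. It is not the code from my own software, and I have no idea whether EA Studio works this way. The score function, the three-parameter grid, and the budget of 2000 tests are all made up for illustration.

```python
import random

def score(params):
    # Hypothetical objective: higher is better, with the best
    # possible value (0) at fast=12, slow=26, stop=40.
    fast, slow, stop = params
    return -((fast - 12) ** 2 + (slow - 26) ** 2 + (stop - 40) ** 2)

def recalculate(rng, grid, budget):
    """Test `budget` randomly chosen combinations and return the best one found."""
    best_params, best_score = None, float("-inf")
    for _ in range(budget):
        # Pick one value per parameter at random instead of walking the whole grid.
        candidate = tuple(rng.choice(values) for values in grid)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best_params, best_score = candidate, candidate_score
    return best_params, best_score

# Hypothetical parameter ranges: 48 * 190 * 95 = 866,400 combinations in total.
grid = [range(2, 50), range(10, 200), range(5, 100)]

# An unseeded generator means each "Recalculate" samples a different subset,
# so two presses can legitimately report different best results.
rng = random.Random()
first_press = recalculate(rng, grid, budget=2000)
second_press = recalculate(rng, grid, budget=2000)
print(first_press)
print(second_press)
```

Each press tests only 2000 of the 866,400 combinations, so it finishes quickly but the two answers will usually differ. Passing a fixed seed, e.g. random.Random(1), would make every press repeat the same subset and return the same result, at the cost of never exploring anything new.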
You guys are not programmers. You don't think like programmers and have no clue as to the amount of data that needs to be processed and the millions of combinations that have to be tested. The fact that Popov found a clever way to do this that takes only a few seconds is quite amazing. Have you ever tried back testing using MT4 Strategy Tester? It's about 10,000x slower. Yet it never dawns on you guys that perhaps there were some optimization speed-ups that Popov implemented. I don't know about you, but I would take 10,000x faster at the expense of being 5% less accurate.
Furthermore, back testing statistics are rarely matched in a Real account, so you shouldn't become so dependent on them. Yes -- they provide a clue that one strategy may be better than another, but the difference certainly isn't quantifiable. If back testing statistics were that reliable, I should be able to say that Strategy A will perform 23.76% better than Strategy B. But, of course, no one can claim that. That's why obsessing over back testing statistics is the booby prize -- i.e. it's not really a prize, but it feels good. And, of course, spending your time generating, calibrating, back testing, demo'ing, re-optimizing, re-generating, etc. is certainly much safer than trading a Real account.
Since G-d has a sense of humor, this post will probably be followed by one from Popov saying he just found a bug in EA Studio's optimizer, thereby proving I'm an idiot. That's okay -- it wouldn't be the first time...