2. You mean like for strategies? Good strategies from an equity curve point of view?

3. Almost. A p-value indicates statistical significance with a degree of probability. The threshold itself is actually more or less arbitrary (apologies to statisticians for the loose understanding and use of words). It depends on the field of study, the test procedure, and so on, but usually statistical significance starts at 0.05, or 5%; the lower it is, the better. In essence, yes, the idea is to reject strategies that merely lucked into winning and/or are heavily biased.

4. There's no metric in that sense; it is the strategy's mean in relation to the bootstrapped distribution. If the mean sits at the edge of the distribution curve and the area beyond it is less than 5% of the total area, one can conclude the results are statistically significant; in other words, such an outstanding result cannot be attributed to luck or chance. That's the idea.
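As a sketch of that tail-area rule (my own minimal Python illustration, not the actual implementation; the function names and the placement of the 5% cutoff are mine):

```python
def tail_fraction(bootstrap_means, observed_mean):
    """One-sided p-value estimate: the fraction of the bootstrapped
    distribution lying at or beyond the strategy's observed mean."""
    beyond = sum(1 for m in bootstrap_means if m >= observed_mean)
    return beyond / len(bootstrap_means)

def is_significant(bootstrap_means, observed_mean, alpha=0.05):
    """Decision rule described above: significant if under 5% of the
    distribution lies beyond the observed mean."""
    return tail_fraction(bootstrap_means, observed_mean) < alpha
```

The `bootstrap_means` list would come from resampling the equity-curve returns, as discussed later in the thread.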

5. It is fresh out of the oven; I haven't had time to put it through its paces. I know it works and produces numbers I deem OK, but I haven't done a proper evaluation on selected collections or any further analysis. The 0.05 mark is just the starting point; everything above it should be rubbish, so that will be an interesting test to run and find out.

6. The last answer applies to this one as well.

7. Currently, the only relation is the underlying data, i.e. the equity curve. Certainly, a statistically significant strategy will also produce a high SQN, so there is that correlation. Let me put it this way: it remains to be seen whether the concept of statistical significance is of value to algorithmic strategy mining.

I have some coding tasks to finish first, and then I'm delving deeper into testing; after that we can compare notes, hopefully.

]]>I've updated my AccountStatistics.cs and started poking around, and I have several questions:

1. The T-Ratio that I'm getting is *exactly* the same as the SQN. Do you get that also?

2. My P-values are always 0.00.

3. I have a pretty good feel for how to use and compare SQN and Sharpe values -- though I couldn't come up with the formulas off the top of my head. From previous posts, it sounds like the simple-man's description of a p-value is the probability a result is strictly due to chance -- i.e. it should range between 0 and 100%. Is that correct?

4. And if it is a probability, then which metric or metrics does it refer to? Does it refer to the overall result?

5. In your hands, what kind of P-values are you seeing? I have no reference -- no idea what it means to have a good or bad P-value, or how to use the value in making a decision about a strategy.

6. Suppose you had a winning strategy -- i.e. it made money. But the P-value was high, indicating it was likely due to luck. Would you stop trading that strategy? And, if not, then I'm curious how one would use this metric to make decisions.

7. The P-value seems related to SQN -- how would you compare them?

Sorry for all the questions -- feel free to tell me to do my own Googling and I'd understand. I guess the most important questions for me at the moment are (a) why all the P-values are 0.00, and (b) what kind of range of P-values are you seeing?

Thanks

]]>I read through quite a number of statistical articles and found answers to some questions, only to realize that those answers raised even more questions. Anyhow, after a number of tests I settled on the current p-value calculation. I didn't use the t-ratio for the calculation, as that relies on a lookup table of corresponding p-values. Instead, it resamples the returns using the bootstrapping method. Some papers, which actually lack scientific validity, use the returns as they are, but for the sake of more reliable results I zero-centered them.
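A minimal sketch of that zero-centered bootstrap (stdlib-only Python; the function name, resample count, and seed are my illustrative choices, not the actual code):

```python
import random

def bootstrap_p_value(trade_returns, n_resamples=10_000, seed=42):
    """One-sided p-value for the mean trade return.

    Zero-centering the returns makes the resampled distribution
    represent the null hypothesis (no edge); we then count how often
    a resampled mean matches or beats the observed mean."""
    rng = random.Random(seed)
    n = len(trade_returns)
    observed = sum(trade_returns) / n
    centered = [r - observed for r in trade_returns]  # null: true mean = 0
    hits = 0
    for _ in range(n_resamples):
        resample = [rng.choice(centered) for _ in range(n)]
        if sum(resample) / n >= observed:
            hits += 1
    return hits / n_resamples
```

With identical winning trades the p-value collapses to 0, while a perfectly symmetric win/loss record hovers around 0.5, which matches the intuition that such a record carries no evidence of an edge.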

]]>I've solved it without problems. I'm waiting for you to read it so we can discuss. I appreciate this document, as the authors refer to some quant work; I think it's a good starting point.]]>

My theoretical thinking leads me to the opinion that the mean value should be compared to a benchmark. I'm talking about this formula:

If right now it is "mean - 0", I think it should be "mean - benchmark". I'd like to hear what other statisticians think.

]]>The file goes into the Code folder; custom code reload has to be enabled in the control panel.

]]>I really appreciate this discussion.

thank you

]]>When fewer prices or trades are used in a distribution, we can expect the shape of the curve to be more variable. For example, it may be spread out so that the peak of the distribution will be lower and the tails will be higher.

How do you know if your trading system has an edge or whether it's just random luck?

A way of measuring how close the sample distribution of a smaller set is to the normal distribution is to use the t-statistic.

When testing a trading system, degrees of freedom can be the number of trades produced by the strategy. When you have few trades, the results may not represent what you will get using a longer trading history. When testing a strategy, you will find a similar relationship between the number of trades and the number of parameters, or variables, used in the strategy: the more variables used, the more trades are needed to create expectations with an acceptable confidence level.

When using the t-test to find the consistency of profits and losses generated by a trading system, replace the data items with the net returns of each trade, and the number of data items with the number of trades.

"The Art and Science of Technical Analysis" (Adam Grimes) modified by Grove Under
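Following the substitution the excerpt describes, the t-statistic over per-trade net returns can be sketched as (my own minimal Python, assuming the usual one-sample formula):

```python
import math

def t_statistic(trade_returns):
    """One-sample t-statistic testing whether the mean net return
    per trade differs from zero: t = mean / (s / sqrt(n)),
    with n - 1 degrees of freedom."""
    n = len(trade_returns)
    mean = sum(trade_returns) / n
    var = sum((r - mean) ** 2 for r in trade_returns) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)
```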

The second step is to get the p-value. Why? Once the p-value is obtained, it is simply a matter of deciding which threshold qualifies for statistical significance. Scientists usually set the statistical significance threshold at 0.05 (i.e., the null hypothesis is rejected for any p-value less than or equal to 0.05). How do you calculate it?

For example, if your p-value is 0.01, that means that, based on the data set analyzed, there's a 1% chance of seeing the analyzed results due to random chance or luck. If your p-value is 0.50, then there's a 50% chance your results are based on luck.

These purely statistical tools do not guarantee profits, but they can be used to exclude overfitted strategies or to improve selection.

I've seen that there are already online tools available to quickly calculate the p-value given t and n.

I've already tried to test some strategies, but I'm just beginning.]]>

So -- I have an idea. Suppose I have a strategy and 30 arbitrary data sets. I then test my strategy against the 30 data sets. And let's say my overall Sharpe value is an average (or median) from the 30 data sets. Then,

T-stat = Sharpe x (square root of trades), where Sharpe is the average from the 30 data sets and 'square root of trades' is the sqrt of all the trades from the 30 data sets.

Would this allow me to calculate a T-stat (or something similar) for this particular strategy?
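The aggregation proposed above can be written out directly (a sketch of the poster's idea, not an established statistic; treat it as a heuristic ranking number rather than a formal t-test):

```python
import math

def pooled_t_stat(per_set_sharpes, total_trades):
    """Heuristic from the post: average the per-data-set Sharpe
    ratios, then scale by the square root of the total trade
    count across all the data sets."""
    avg_sharpe = sum(per_set_sharpes) / len(per_set_sharpes)
    return avg_sharpe * math.sqrt(total_trades)
```

One caveat worth flagging: averaging Sharpe ratios while pooling the trade counts mixes scales, so the result is only comparable between strategies tested on the same collection of data sets.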

]]>Regarding tomorrow's bar -- yes, 3 possibilities. But what about their probabilities? When tossing a coin I know the next toss has two possible outcomes -- so, in that respect I can sort of predict the future. Furthermore, I know if I toss a coin enough times the outcome will tend to 50 / 50. But that is where I begin to lose the connection with applying statistics to Forex data or returns. Even though you know there are 3 possible outcomes -- up, down, doji -- you have no clue about their probabilities. All you can do is hope the pattern continues. Come to think of it -- perhaps there is a statistic that can provide a clue as to the probability of a pattern continuing. Is that where the p-value would come in?

]]>One thing about the next bar - I disagree about having no clue. I know that tomorrow's daily bar can be one of three: an up bar, a down bar, or a doji.

]]>I have a fundamental question about applying statistics such as this to Forex data -- because I think Forex is a different beast. When considering the S&P 500 the actual data is tied to companies whose only goal in life is to increase their value. So, there is a built-in bias for the numbers to increase. But in Forex there is no built-in bias. Doesn't the nature of the data have some influence over which statistics are valuable and what insight they provide?

Now -- back to walking my dog...

]]>1. I do believe that in the midst of all those "random" price fluctuations, there are some "repetitive" patterns that don't arise by chance. I see them all the time in my charts: whenever these "conditions" happen, there is usually a significant trend. The problem, or challenge, is how to exit in a timely manner, as these trends are sometimes weaker than others. Hence, I do believe we can statistically find these repetitive conditions that are not due to random chance. It's just like Google Analytics, which tracks buyer profiles and buying patterns and, by understanding their repetitive search or purchase patterns, predicts what kind of advertisement would "suit" these users. Though most retail purchases may seem to be a matter of impulse, there is some "predictive" behaviour behind the sum of all these random behaviours.

2. As much as we often think that forex is so "unpredictable", it is still often governed by basic economic factors and market forces such as demand and supply. It's the psychological side of speculation that fuels the volatility.

3. There are big players who dominate/dictate forex prices and trends, and I believe these people's decisions are more often ruled by economic factors rather than the whimsical and psychological ones most retail traders tend to react to. I don't think they wake up one day and decide to sell because they "feel" like it. Since their decisions are more likely governed by forces of demand and supply (or fundamental issues), it is possible to "see" a pattern behind every breakout or big movement made by these big players. Hence, RSI is a good indicator for spotting these "pressure" points. Imagine if you happened to use the same "indicators" or whatever tools they use for their decision making -- it would be like "insider" trading, whereby you have a close estimate of the timing (not in terms of day and time, but via indicators) at which these big players are going to enter or exit the market.

Hence, in a nutshell, I do believe there is predictive value in forex (if there isn't, then we are all gamblers here), and usually the shorter the time interval, the "stronger" the predictive value. For example, which is easier to predict: whether EURUSD will go up or down 1 year from now, or 1 day from now? Of course, it's much "easier" to predict the price for the next minute than for the next hour. Hence, I believe if we focus on how much we can scalp out of the market at shorter intervals, we have a better chance of increasing our EA's predictability, because the longer we hold, the less certain we are about when the trend will start to change. Thus, most of my EAs trade very frequently at shorter intervals.

However, from a position trader's point of view, there is also predictive value if you look at the market from a "bigger" picture and use D1 and higher charts as reference points. Such a trading method isn't my cup of tea, because one needs a really large SL to stomach all those highs and lows. And the larger the SL, the smaller the lot size, relative to your capital, that you can trade "safely". E.g., if your SL is 3000 pips and you only want to risk 10% of your capital, then you can only enter 1 lot per $30k of capital. I personally believe such a method isn't an "economical" use of capital for optimum return, and if you divide the returns over the long holding time needed, it may not be that attractive an investment considering the high risk relative to the return.
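The position-sizing arithmetic in that example can be sketched as follows (my own illustration; `pip_value_per_lot` is an assumption that varies by pair and lot convention, and the post's numbers imply $1 per pip per lot):

```python
def max_lots(capital, risk_fraction, stop_loss_pips, pip_value_per_lot):
    """Largest position size such that a full stop-loss hit costs at
    most the chosen fraction of capital."""
    risk_dollars = capital * risk_fraction
    return risk_dollars / (stop_loss_pips * pip_value_per_lot)

# Example from the post: $30k capital, 10% risk, 3000-pip SL,
# pip value of $1 per pip per lot implied by the post's numbers.
# max_lots(30_000, 0.10, 3_000, 1.0) -> 1.0
```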

]]>Lastly, about competence. Having a degree in the humanities makes work involving math harder than it should normally be. I know I'll give my best, but it remains to be seen if it'll be enough.

]]>