# Correlation and Probability, from Philip J. McDonnell

March 30, 2007

Winning percentage = 50 + 32*(correlation)

Because the constant is 50 (~50%), it would appear that the numbers input to this were basic coin flip probabilities. To a good approximation, most markets do obey coin flip odds, so this is very useful. I would, however, conjecture that if the odds were different, say 20%, then a new approximation with different parameters might be needed.

## Charles Pennington notes:

I'm not sure what Dr. Phil means when he says that "it would appear that the numbers input to this were basic coin flip probabilities." The formula came out of the following simulation:

1. Generate 2 series, A and B, each having 10,000 random numbers from a normal distribution with average 0 and standard deviation 1.
2. To each element of B, add alpha times the corresponding element in A, to generate a new series C. (I will end up trying alpha values ranging from much less than one to much greater than one.)
3. For the first 5,000 elements of A and C, run a regression of C versus A. This gives a correlation, slope, and intercept.
4. For the remaining 5,000 elements of A and C, use the regression, and the A values, to predict the C values.
5. Count the fraction of instances in which the prediction gets the sign right; that fraction is the "winning percentage".
6. Now you have a correlation, and a winning percentage.
7. Repeat for different values of alpha to generate a table of winning percentage versus correlation. (When alpha is small, much less than one, alpha and the correlation that emerges are very close. When alpha becomes much greater than one, the correlation approaches 100%.)
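The steps above can be sketched in Python roughly as follows. This is a minimal reconstruction, not the original code: the function name `winning_percentage`, the fixed seed, and the particular alpha values are assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def winning_percentage(alpha, n=10_000, seed=0):
    """Run one pass of the procedure for a given alpha (hypothetical helper).

    Returns (correlation, winning percentage), where the correlation is
    estimated on the first half and the winning percentage on the second.
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal(n)          # step 1: series A
    b = rng.standard_normal(n)          # step 1: series B
    c = b + alpha * a                   # step 2: series C

    half = n // 2
    # Step 3: least-squares fit of C on A over the first half.
    slope, intercept = np.polyfit(a[:half], c[:half], 1)
    corr = np.corrcoef(a[:half], c[:half])[0, 1]

    # Step 4: predict C from A over the second half.
    predicted = slope * a[half:] + intercept

    # Step 5: fraction of predictions with the correct sign.
    wins = np.mean(np.sign(predicted) == np.sign(c[half:]))
    return corr, 100.0 * wins


if __name__ == "__main__":
    # Step 7: sweep alpha and compare against the rule of thumb.
    for alpha in (0.0, 0.05, 0.1, 0.2, 0.5, 1.0, 5.0):
        corr, pct = winning_percentage(alpha)
        rule = 50 + 32 * corr
        print(f"alpha={alpha:4.2f}  corr={corr:6.3f}  win%={pct:5.1f}  rule={rule:5.1f}")
```

With only 5,000 out-of-sample points the winning percentage carries sampling noise of somewhat under one percentage point, so small-alpha rows will scatter around the rule rather than match it exactly.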

For correlations approaching 0, the winning percentage approached 50%, as of course it should. For large correlations, it approached 100%, as it also should. For small but non-zero correlations, I found this result, as stated earlier: Winning percentage = 50 + 32*(correlation).

Seems like a reasonable answer for a guy who wanted a rule-of-thumb mapping of correlation onto winning percentage. Obviously, if there is a big drift term, or if any number of other things are true, it could be significantly wrong.
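As a cross-check on the coefficient, and not something derived in the post itself: for two zero-mean, jointly normal variables with correlation rho, the probability that they share a sign is 1/2 + arcsin(rho)/pi, a standard result. If the fitted intercept is close to zero and the slope is positive, the prediction's sign is just the sign of A, so the same expression applies here. Its slope at rho = 0 is 100/pi, about 31.8 in percentage terms, which is consistent with the 32 found by simulation. The sketch below compares the two; the function names are made up for illustration, and the closed form assumes no drift.

```python
import math


def exact_winning_percentage(rho):
    """Sign-agreement probability, in percent, for a zero-mean bivariate normal pair."""
    return 50.0 + 100.0 * math.asin(rho) / math.pi


def rule_of_thumb(rho):
    """The linear approximation quoted in the post."""
    return 50.0 + 32.0 * rho


if __name__ == "__main__":
    for rho in (0.0, 0.05, 0.1, 0.2, 0.5, 0.9, 1.0):
        exact = exact_winning_percentage(rho)
        approx = rule_of_thumb(rho)
        print(f"rho={rho:4.2f}  exact={exact:6.2f}  rule={approx:6.2f}  diff={approx - exact:6.2f}")
```

The two agree closely for small correlations and diverge as the correlation grows, since the linear rule tops out at 82 while the closed form reaches 100 at perfect correlation.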
