Oct 21
There have been comments from analysts recently about changing correlations, for example, this from Morgan Stanley:

US Equity Derivative Strategy
Getting the Best and Worst of Correlation
October 09, 2007
By Peter Polanskyj, Christopher Metli

Correlation between sectors remains high: Correlation among the various sectors of the S&P 500 remains elevated on average, although off the peaks of late August/early September. Among sectors, recent short-term relationships have in some cases differed meaningfully from longer-term relationships.

Correlations among single stocks within specific sectors are a mixed bag: Several sectors have seen correlations among their constituents drop to relatively low levels, including Healthcare, Food/Beverage/Tobacco, Technology, Media, Software, Consumer Services and Food/Staples Retailing. Several sectors have continued to be highly correlated on an absolute and relative basis. Financials are prominent in that landscape. [Full text]

Three years ago, Dr. Castaldo commented that changes in correlation may relate to changes in volatility for technical reasons, rather than to changes in the underlying stochastic process. More recently, an article by Harry Kat of Cass Business School, City University London, cited some of the same research on changes in correlation.

In the spirit of "know your tools" (and correlation is an important tool), these three papers seem most often cited:

Pitfalls In Tests For Changes In Correlations

Brian H. Boyer, Michael S. Gibson and Mico Loretan

Evaluating "correlation breakdowns" during periods of market volatility [pdf]

Mico Loretan and William B. English

No Contagion, Only Interdependence: Measuring Stock Market Co-Movements [pdf]

Kristin Forbes, Roberto Rigobon

Boyer, Gibson and Loretan (BGL) make algebraic and empirical arguments. They consider two randomly generated series, x and y, with a fixed correlation coefficient ρ, 0 < ρ < 1, and show that the correlation coefficient measured in a subsample of the two series is an increasing function of the subsample's variance of x. That is, even when the true ρ is constant, the measured correlation between x and y rises in subsamples where the volatility of x is higher. Here is a graph that is one take on the overall argument: it shows how volatility (measured as the standard deviation of x) and correlation vary together over two randomly generated, positively correlated series (x and y).
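A minimal sketch of that conditioning effect in R (simulated data, assumed seed and parameters): with a constant true ρ, the correlation measured in the high-volatility subsample of x comes out above the full-sample figure, and the low-volatility subsample below it.

set.seed(1)                              # assumed seed, for reproducibility
n   <- 100000
rho <- 0.5                               # true, constant correlation (assumed)
x   <- rnorm(n)
y   <- rho * x + sqrt(1 - rho^2) * rnorm(n)
cor(x, y)                                # ~0.50: the full-sample correlation
hi  <- abs(x) > quantile(abs(x), 0.8)    # "high-volatility" subsample of x
cor(x[hi], y[hi])                        # well above 0.50
cor(x[!hi], y[!hi])                      # well below 0.50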

Bruno Ombreux adds:

I have been reading general-audience statistics books written by biostatisticians and social scientists. As a rule, they don't like correlation coefficients: because these coefficients are symmetric and carry no causal direction, they are of little use in advancing scientific knowledge.

In finance, too, some people don't like correlation coefficients. See for instance these two articles by Embrechts and Alexander [pdf]. They make points in addition to the ones in the links you kindly provided. Some issues are very ivory-towerish: elliptical distributions or joint covariance stationarity. Others are more down-to-earth: extremes creating "ghost effects" in coefficient estimation.

Anyway, the consensus is that correlation coefficients are not a panacea; in fact, it is better not to use them. If one absolutely wants to use them, rank correlation is not as bad as linear correlation. I feel the first article dismisses rank correlation a bit too quickly, on the grounds that its analytical complexity hinders further mathematical derivations.
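As a quick illustration of that robustness in R (simulated data, one contrived extreme point): a single joint outlier moves the Pearson coefficient far more than the Spearman rank correlation.

set.seed(2)
x <- rnorm(50)                      # two independent series: true correlation 0
y <- rnorm(50)
cor(x, y)                           # Pearson, near 0
cor(x, y, method = "spearman")      # Spearman, near 0
x[51] <- 10; y[51] <- 10            # add one joint extreme observation
cor(x, y)                           # Pearson jumps toward 1
cor(x, y, method = "spearman")      # Spearman moves far less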

What to use instead of correlation? Each article promotes its author's modern pet method for measuring dependence (copulas and cointegration, respectively). I prefer instead to follow the social scientists' suggestion and build regression models. The nice thing about regression is that the assumptions are clear and easy to check, and when they are violated there is a whole slew of more sophisticated regressions that can be applied.
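A minimal sketch of that workflow in R, on simulated data with an assumed linear data-generating process: fit the model, read the coefficients and their standard errors, then inspect the residuals for structure and non-normality.

set.seed(3)
x   <- rnorm(500)
y   <- 0.8 * x + rnorm(500)                     # assumed data-generating process
fit <- lm(y ~ x)
summary(fit)                                    # slope, intercept, standard errors, R-squared
plot(fitted(fit), residuals(fit))               # look for structure in the residuals
qqnorm(residuals(fit)); qqline(residuals(fit))  # check the normality assumption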

Phil McDonnell suggests:

To use regression instead of correlation is misguided. They are the same! After all, the square of rho is exactly the R^2 of a simple linear regression.
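A one-line check of that identity in R (simulated data): for a simple linear regression of y on x, R-squared equals the squared correlation.

set.seed(4)
x <- rnorm(1000)
y <- 0.6 * x + rnorm(1000)
all.equal(cor(x, y)^2, summary(lm(y ~ x))$r.squared)   # TRUE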

Bruno Ombreux counters:

Yes, but isn't there more information in a regression than in a correlation? R-squared only gives the proportion of the variance of Y explained by X; the regression coefficients, together with their standard errors, add more information.

In addition, correlation is symmetric: cor(X,Y) = cor(Y,X), so X and Y play the same role with respect to any possible explanation or causality. The regression of Y against X, by contrast, is not the same as the regression of X against Y: they are two regression lines with different slopes. This creates an asymmetry between X and Y; there is an explained variable and an explanatory variable. Adding a time dimension, one can then introduce causality, as in Granger causality.
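A quick demonstration of that asymmetry in R (simulated data): the two fitted slopes differ, and their product equals the squared correlation rather than 1.

set.seed(5)
x <- rnorm(1000)
y <- 0.5 * x + rnorm(1000)
b_yx <- coef(lm(y ~ x))["x"]   # slope of Y on X: cor(x,y) * sd(y) / sd(x)
b_xy <- coef(lm(x ~ y))["y"]   # slope of X on Y: cor(x,y) * sd(x) / sd(y)
b_yx; b_xy                     # two different slopes
b_yx * b_xy                    # equals cor(x, y)^2, not 1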

I think regression contains correlation, but it is not the same concept. Regression is a procedure: one examines a number of different statistics, checks the residuals, and reformulates the equation if necessary.

Sam Humbert comments:

Another correlation quirk, from Rene Carmona, "Statistical Analysis of Financial Data in S-Plus," Springer-Verlag, 2004, p. 99:

"Problem 2.4 This elementary exercise is intended to give an example showing that lack of correlation does not necessarily mean independence!"

Carmona defines X as N(0,1) and shows that Y, a simple function of abs(X) (and thus entirely determined by X), with mean 0 and variance 1, is uncorrelated with X. The reason is symmetry: E[X·abs(X)] = 0 = E[X]·E[abs(X)], so the covariance vanishes even though Y is a deterministic function of X.

I did a quick R script to demonstrate; every run will have a slightly different result, but the correlation of X and Y is always near zero:

X <- rnorm(100000)                            # 100,000 standard normal draws
Y <- (abs(X) - sqrt(2/pi)) / sqrt(1 - 2/pi)   # standardize abs(X): mean 0, variance 1
cbind(X, Y)[1:10, ]                           # first ten pairs
mean(X); mean(Y)                              # both ~0
var(X); var(Y)                                # both ~1
cor(X, Y)                                     # ~0, though Y is a function of X

Sample run:

> X<- rnorm(100000)
> Y<- (abs(X)-sqrt(2/pi))/(sqrt(1-(2/pi)))
> cbind(X,Y)[1:10,]
                X           Y
 [1,] -0.7878436 -0.01665691
 [2,] -0.4779746 -0.53069754
 [3,]  1.3390446  0.89772859
 [4,]  0.3362482 -0.76580698
 [5,]  1.3081312  0.84644648
 [6,]  1.1859110  0.64369580
 [7,] -1.6717642  1.44967611
 [8,] -0.3082874 -0.81219113
 [9,]  0.5582608 -0.39751106
[10,] -0.2235637 -0.95273902
> mean(X); mean(Y)
[1] 0.001646453
[1] -0.00657469
> var(X); var(Y)
[1] 0.9952368
[1] 1.004244
> cor(X,Y)
[1] -0.001873391

Also, Carmona does a good job of introducing the copula (mentioned in Dr Ombreux's post) as a generalized correlation, and, earlier in the book, nicely motivates kernel density estimation as a generalized histogram, a tool for exploratory data analysis.
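A small sketch of the "generalized histogram" idea in R (simulated returns, assumed parameters): density() smooths the same data a histogram would bin.

set.seed(6)
r <- rnorm(1000, sd = 0.01)          # simulated daily returns (assumed)
hist(r, breaks = 30, freq = FALSE)   # histogram as a crude density estimate
lines(density(r), lwd = 2)           # kernel density estimate overlaid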

At an S-Plus seminar I attended seven or so years ago, Carmona, one of the instructors, spent much time on the copula. Soon afterward, the concept became "famous" via the work of Dr. Li and others.

