Victor and Laurel note: A heated debate regarding Joel Greenblatt’s “The Little Book That Beats the Market” recently cropped up among our colleagues. Below is some detailed follow-up work from one of our eminent researchers, who is as adroit at analysis of single crystal NMR of high temperature superconductors as he is at uncorking the seemingly suggestive system work of hedge fund managers with putative 40% returns. Please note our response, which follows, as well as earlier intriguing commentary, which began in early November and is found further down on the site.

I’ll report here the results of a study I did that addresses the findings in Joel Greenblatt’s book. This study focuses on the large cap stocks that make up the S&P 500 index. Just as in Greenblatt’s work, I used the Compustat Point-in-Time database, in which the fundamental data are listed as they were at the time, and not restated.

Greenblatt’s ranking method involved both “earnings to price” ratio and “return on capital”. For “earnings to price”, he actually uses “EBIT” (earnings before interest and taxes) divided by “enterprise value” (market cap + debt + preferred stock), and for “return on capital” (“ROC”) he uses EBIT/(working capital + property, plant, and equipment). All these items can be specified using Compustat Point-in-Time.

After ranking stocks separately by E/P and ROC, he then takes these two ranking numbers and literally adds them together, and then finally ranks again based on that sum. He finds that the stocks that have both high E/P and high ROC tend to do well.
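The ranking procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tickers and all figures are invented, the function names are hypothetical, and ties are broken by input order, which may not match how the book handles them.

```python
# Hypothetical illustration of the combined-rank procedure; all figures are invented.

def earnings_yield(ebit, market_cap, debt, preferred):
    """EBIT divided by enterprise value (market cap + debt + preferred stock)."""
    return ebit / (market_cap + debt + preferred)

def return_on_capital(ebit, working_capital, ppe):
    """EBIT divided by (working capital + property, plant, and equipment)."""
    return ebit / (working_capital + ppe)

def rank_descending(values):
    """Map each name to its rank, 1 = highest value. Ties keep input order."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

def combined_ranking(ey, roc):
    """Add the two ranks, then sort by the sum (lowest sum = most favored)."""
    ey_rank = rank_descending(ey)
    roc_rank = rank_descending(roc)
    rank_sum = {name: ey_rank[name] + roc_rank[name] for name in ey}
    return sorted(rank_sum, key=rank_sum.get)

# Made-up companies: (EBIT, market cap, debt, preferred, working capital, PP&E)
companies = {
    "AAA": (120.0, 900.0, 100.0, 0.0, 150.0, 250.0),
    "BBB": (80.0, 1500.0, 300.0, 50.0, 200.0, 600.0),
    "CCC": (60.0, 400.0, 50.0, 0.0, 100.0, 400.0),
    "DDD": (30.0, 700.0, 200.0, 0.0, 50.0, 70.0),
}
ey = {k: earnings_yield(v[0], v[1], v[2], v[3]) for k, v in companies.items()}
roc = {k: return_on_capital(v[0], v[4], v[5]) for k, v in companies.items()}
ranking = combined_ranking(ey, roc)
print(ranking)  # most favored first
```

In the study itself the sorted list would then be cut into ten equal groups to form the deciles.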

Here are the ground rules for my study. Stocks are ranked and then purchased at the end of each quarter, and held in that decile until the next quarter, when stocks are re-ranked. The most recent trailing four quarters of EBIT are summed to find the trailing yearly EBIT. In order to be purchased, stocks must have been components of the S&P 500 as of the start of the calendar year under consideration. As of the purchase dates, their share price must be greater than $2.

I checked and found that yes, the study did include Enron and WorldCom. Enron was bought on 9/28/2001 at $27.23 and sold at $0.60 for a loss of 97%. It was not re-purchased the next quarter because its share price had fallen below $2. At the time of purchase, Enron was near the middle of the rankings in terms of both E/P and ROC.
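The ground rules and the Enron case can be expressed as a small Python sketch. The function names and the EBIT figures are hypothetical; only the $2 price floor, the index-membership rule, the four-quarter sum, and the two Enron prices come from the text.

```python
# Hypothetical sketch of the ground rules; function names and EBIT figures are invented.

def trailing_ebit(quarterly_ebit):
    """Sum the four most recent quarters of EBIT (list ordered oldest to newest)."""
    if len(quarterly_ebit) < 4:
        raise ValueError("need at least four quarters of EBIT")
    return sum(quarterly_ebit[-4:])

def is_eligible(in_sp500_at_year_start, share_price, min_price=2.0):
    """A stock qualifies if it was in the index at the start of the year
    and its share price on the purchase date is above the minimum."""
    return in_sp500_at_year_start and share_price > min_price

print(trailing_ebit([10.0, 12.0, 11.0, 13.0, 14.0]))  # uses the last four quarters only
print(is_eligible(True, 27.23))  # Enron at the 9/28/2001 purchase price
print(is_eligible(True, 0.60))   # Enron one quarter later
```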

For each stock the “total return” was calculated, including dividends, using data from what we believe to be a reputable commercial vendor. However, I confess that I need to check on what the exact algorithm is for computing total return when there is something complicated, such as a merger or a spinoff.

At the start of each quarter the stocks were sorted into deciles according to Greenblatt’s ranking method. For each decile, the average of the forward 1-quarter fractional total returns for the approximately 50 stocks was calculated. Calling that number “R”, we then calculated 100*ln(1+R) for that decile and that quarter, and I’ll let Dr. Phil McDonnell (a frequent site contributor, trader and academic) explain why we did that. (As long as that number is not too big or small, it’s going to be pretty close to the percentage change in the portfolio.)
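As a minimal Python sketch of that calculation (the sample returns are invented and the function name is hypothetical), the decile figure is the log of one plus the average fractional return, scaled by 100. The second part illustrates one commonly cited reason for using logs: they add across periods, so a gain followed by an offsetting loss nets to zero.

```python
import math

def decile_log_return(stock_returns):
    """Average the fractional returns to get R, then report 100 * ln(1 + R)."""
    r = sum(stock_returns) / len(stock_returns)
    return 100.0 * math.log(1.0 + r)

# Invented one-quarter fractional total returns for a handful of stocks
sample = [0.05, -0.02, 0.10, 0.01, 0.06]
print(round(decile_log_return(sample), 2))  # close to the 4% simple average

# Log returns add across periods: +25% followed by -20% leaves the portfolio flat
up = 100.0 * math.log(1.25)
down = 100.0 * math.log(0.80)
print(abs(up + down) < 1e-9)
```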

Our study covers 1992 to present, 59 quarters of data. We went back to 1992 simply because we already happened to have a convenient file listing the S&P components year-by-year back to 1992.

For each decile there are 59 quarterly returns. Below we give the results of our study, the average and standard deviation of those 59 numbers for each decile.

Decile 1 is the one with high E/P and high ROC; decile 10 is the one with low E/P and low ROC. The last column is the average divided by the standard deviation. Multiply that number by two and you have the annualized “Sharpe ratio” for that decile, if I understand the definitions correctly.

Decile  Avg      SD     Avg/SD
1       3.84    7.89     49%
2       3.33    8.57     39%
3       3.07    8.23     37%
4       3.69    7.46     49%
5       3.34    6.79     49%
6       3.04    7.40     41%
7       2.44    7.32     33%
8       2.47    7.46     33%
9       2.35    9.98     24%
10      2.51   13.27     19%

The Greenblatt “favorites” portfolio averages 3.84% per quarter with a standard deviation of 7.89%, with an average/standard deviation of 49%. The Greenblatt “bad guys” decile, decile 10, averages 2.51% with a standard deviation of 13.27%. So this confirms that the Greenblatt strategy has worked reasonably well since 1992 on the kinds of large-cap stocks that make up the S&P 500.
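The "multiply by two" rule for annualizing the quarterly ratio can be checked with a short Python sketch. It assumes that quarterly returns are independent, so the mean scales with the number of periods and the standard deviation with its square root, and it ignores the risk-free rate. The function name is hypothetical; the inputs are the decile 1 and decile 10 figures from the table above.

```python
import math

def annualized_ratio(quarterly_avg, quarterly_sd, periods_per_year=4):
    """Scale a per-period Avg/SD ratio to an annual figure.

    Assumes independent periods: the mean grows with the number of
    periods and the standard deviation with its square root.
    """
    annual_avg = quarterly_avg * periods_per_year
    annual_sd = quarterly_sd * math.sqrt(periods_per_year)
    return annual_avg / annual_sd

# Decile 1 and decile 10 figures from the table above
print(round(annualized_ratio(3.84, 7.89), 2))
print(round(annualized_ratio(2.51, 13.27), 2))
```

With four periods per year the scaling factor is 4 / sqrt(4) = 2, which is where the factor of two comes from.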

An investment of $1 in decile 1 stocks grew to $9.63; $1 invested in decile 10 stocks grew to $4.39, and it was more volatile along the way.
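Because the quarterly figures are log returns, the growth of $1 follows from exponentiating their sum. The Python sketch below uses the rounded averages from the decile table and 59 quarters; the function name is hypothetical, and the results land within about a cent of the dollar figures quoted, the gap presumably being rounding in the published averages.

```python
import math

def growth_of_one_dollar(avg_log_return_pct, n_quarters):
    """Compound an average quarterly log return (in percent) over n quarters."""
    return math.exp(avg_log_return_pct / 100.0 * n_quarters)

# Rounded averages from the decile table, 59 quarters
decile_1_growth = growth_of_one_dollar(3.84, 59)
decile_10_growth = growth_of_one_dollar(2.51, 59)
print(round(decile_1_growth, 2))
print(round(decile_10_growth, 2))
```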

Greenblatt’s data end at the end of year 2004, so below I will show you how this S&P 500 version of Greenblatt has performed since then. However, first, I will show you how some other strategies fared during the same 59 quarter period since 1992.

First, here are the results for a ranking based solely on E/P:

Avg      SD     Avg/SD
3.54    8.91     40%
3.88    8.20     47%
3.50    8.55     41%
3.01    7.00     43%
3.04    6.62     46%
2.77    6.78     41%
2.65    6.97     38%
2.40    7.76     31%
2.88   10.05     29%
2.36   13.78     17%

(First row: Highest E/P, Last row: Lowest E/P)

The results are similar to Greenblatt’s, though perhaps not quite as good. All that’s not surprising (if you believe Greenblatt’s thesis), since E/P is one of Greenblatt’s two ranking factors.

ROC is Greenblatt’s other ranking factor, and below is the performance of deciles sorted based on ROC alone:

Avg      SD    Avg/SD
3.72    8.03       46%
3.65    7.92       46%
3.08    6.69       46%
2.97    7.68       39%
2.68    7.81       34%
2.77    7.72       36%
2.72    8.25       33%
3.09    8.16       38%
2.85    8.76       33%
2.62   13.02       20%

(First row: Highest ROC; Last row: Lowest ROC)

Again, the highest ranked ROC deciles performed better than the lowest ROC deciles.

So it seems that E/P and ROC each have some independent value as ranking criteria (though we haven’t examined the extent to which E/P and ROC are correlated or anti-correlated).

Finally, here are a few other ranking methods.

First, here’s another “value” ranking method. Many value investors claim that it’s bullish if a company has a high ratio of cash-and-equivalent on hand to market-value-plus-debt. Below is the performance according to that ranking:

Avg      SD   Avg/SD
3.96   10.40      38%
3.42   11.43      30%
3.14    9.68      32%
2.73    8.95      31%
3.44    7.72      45%
2.81    7.47      38%
2.91    6.73      43%
2.81    6.79      41%
2.49    6.26      40%
2.57    6.05      42%

First row: Highest cash/(market value plus debt); Last row: Lowest cash/(market value plus debt)

Here the firms with the highest cash had the highest average return, but they also had a relatively high standard deviation, and there is no clear trend in the Sharpe ratio vs. decile number. I would argue therefore that this “cash” ranking did not have much value.

Others have suggested that the Greenblatt effect might be some artifact of share price and/or market capitalization. So here are studies of those factors.

First, share price:

Avg      SD    Avg/SD
3.04   14.94    20%
3.14   10.12    31%
3.39    9.11    37%
2.62    7.43    35%
3.35    7.58    44%
3.19    7.14    45%
2.76    7.75    36%
2.75    6.37    43%
2.71    6.74    40%
3.03    6.76    45%

First row: Lowest share price; Last row: Highest share price

This table shows no trend in return vs. share price. The lower share prices, however, do have higher standard deviations in their returns, so arguably one should focus on higher share priced stocks for a smoother ride.

Next here are the results for a decile ranking based on market capitalization:

Avg      SD    Avg/SD
3.33   12.14    27%
3.52    9.94    35%
2.89    9.36    31%
3.35    8.21    41%
3.46    7.48    46%
3.36    6.49    52%
2.62    7.71    34%
2.45    7.19    34%
2.57    7.22    36%
2.66    7.67    35%

First row: Lowest market cap; Last row: Highest market cap

The lowest market caps did outperform the highest market caps by a small amount. However, their volatility was much higher, and their Sharpe ratios were about the same or lower. So it is not plausible to think that the Greenblatt effect, as observed in this study, is an artifact of small market capitalization.

Victor and Laurel compliment and caution:

We would just add that the “Minister’s” study leaves out the performance since the retrospective data ran out and it ain’t pretty. The Minister is complimented on the perfect study for DailySpec: totally good methodology suggesting fruitful lines of inquiry, but nothing that violates his mandate as “Minister of Non-Predictive Studies”.

Professor Pennington returns with updated figures:

Here is an update of the recent performance of the Greenblatt ranking system applied to S&P stocks. Greenblatt’s book gives data through the end of 2004. Shown below is data since 2004.

Decile         10      9      8      7      6      5      4      3      2      1
12/31/2004   -7.6   -3.4   -0.9   -5.3   -2.0    0.8   -0.7   -0.1   -2.2   -1.5
03/31/2005    5.6    0.9    4.7    1.1    3.0    2.6    3.1    0.9   -0.2    5.3
06/30/2005    7.1    7.8    3.9    6.3    3.7    0.8    4.8    4.0    4.7    3.5
09/30/2005   -2.9    2.1    1.6    2.4    1.6    3.6    3.2    4.8    1.9    5.9
12/31/2005   10.2    6.6    7.8    4.4    7.3    6.7    5.9    4.9    5.9    1.8
03/31/2006   -7.9   -4.5    1.4   -1.8   -0.1   -0.9   -1.2   -1.8    0.4   -1.4
06/30/2006    0.9    3.3    2.7    4.8    3.9    7.9    1.6    5.7    3.2    4.9
09/30/2006    3.5    3.8    3.6    4.5    2.6    4.3    4.4    3.6    3.6    1.4
Avg           1.1    2.1    3.1    2.0    2.5    3.2    2.7    2.7    2.2    2.5
SD            6.7    4.3    2.6    3.9    2.8    3.0    2.5    2.7    2.7    2.9
Avg/SD        17%    48%   120%    52%    89%   106%   104%   101%    80%    85%

Short story is that the high ranked decile, decile 1 (high E/P, high ROC), gained an average 2.5% per quarter since 2005 with standard deviation 2.9%, and the least favored decile, decile 10 (low E/P, low ROC) returned an average 1.1% per quarter with standard deviation 6.7%.
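The summary rows can be recomputed from the quarterly figures with a few lines of Python. This sketch assumes the sample standard deviation (n - 1 in the denominator), which reproduces the printed SD values for these two deciles; the function name is hypothetical. Because the table entries are rounded to one decimal, the recomputed ratio can differ from the printed Avg/SD by a point.

```python
import statistics

# Quarterly figures copied from the table above, first row through last row
decile_1 = [-1.5, 5.3, 3.5, 5.9, 1.8, -1.4, 4.9, 1.4]
decile_10 = [-7.6, 5.6, 7.1, -2.9, 10.2, -7.9, 0.9, 3.5]

def summarize(returns):
    """Return (average, sample standard deviation, average / SD)."""
    avg = statistics.mean(returns)
    sd = statistics.stdev(returns)  # sample SD, n - 1 in the denominator
    return avg, sd, avg / sd

for label, series in (("decile 1", decile_1), ("decile 10", decile_10)):
    avg, sd, ratio = summarize(series)
    print(label, round(avg, 1), round(sd, 1), f"{ratio:.0%}")
```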

In such a short time frame, this one’s probably a coin toss, but it looks like it did go in Greenblatt’s favor.

Dr. Phil McDonnell lauds and extends:

Kudos to Prof. Pennington for his thorough review of the Greenblatt study. His use of the log of the price relative is exactly the right way to go to take into account compounding.

In my opinion the best time period to study is the out-of-sample, post-publication time frame from 12/2004 to the present. Using this period eliminates most of the concerns and biases that I feared, including post-publication bias.

Based upon that period I looked at the Spearman rank correlation coefficient for the mean and the Sharpe Ratio(*). The basic idea is to see if there is an overall correlation beyond just a differential between the top decile and the bottom. In this case we would expect a negative correlation simply because of the arbitrary ordering of the deciles by Dr. Pennington. The following R code gives us our answer:

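# Test the Pennington-Greenblatt data using robust Spearman rank correlation
n   <- 1:10                                                  # decile number
avg <- c(2.5, 2.2, 2.7, 2.7, 3.2, 2.5, 2.0, 3.1, 2.1, 1.1)   # Avg row, deciles 1 to 10
sr  <- c(85, 80, 101, 104, 106, 89, 52, 120, 48, 17)         # Avg/SD row (%), deciles 1 to 10
cor.test(avg, n, method = "spearman")
cor.test(sr, n, method = "spearman")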

With respect to the average we get:

Spearman's rank correlation rho

data:  avg and n
S = 226.3731, p-value = 0.2899
alternative hypothesis: true rho is not equal to 0
sample estimates:
       rho
-0.3719581

Here the rho is -37% and has an insignificant p-value of 29%.

With respect to the Sharpe Ratio(*) we get:

Spearman's rank correlation rho

data:  sr and n
S = 218, p-value = 0.3677
alternative hypothesis: true rho is not equal to 0
sample estimates:
       rho
-0.3212121

Here rho is -32% and the p-value is 37%, also non-significant.
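For readers who want to see what the statistic measures, here is a Python sketch of Spearman's rho using only the standard library. It assumes the inputs are the rounded Avg and Avg/SD rows of the post-2004 table, reordered to run from decile 1 to decile 10; with those inputs it reproduces the two rho values reported above. Tied values receive their average rank, and rho is the ordinary correlation of the two rank vectors. The function names are hypothetical, and the p-values are not computed here.

```python
import math

def average_ranks(values):
    """Rank values ascending from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of positions i..j, counted from 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Ordinary product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

# Avg and Avg/SD rows of the post-2004 table, reordered to deciles 1 through 10
n = list(range(1, 11))
avg = [2.5, 2.2, 2.7, 2.7, 3.2, 2.5, 2.0, 3.1, 2.1, 1.1]
sr = [85, 80, 101, 104, 106, 89, 52, 120, 48, 17]

print(round(spearman_rho(avg, n), 4))
print(round(spearman_rho(sr, n), 4))
```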

(*) Minor quibble on the Sharpe Ratio: The usual formula for the Sharpe Ratio is:

SR = (average - tBillRate) / stdev

The idea is that it purports to measure excess return over and above the riskless T-bill rate. It is thus the excess return one received for taking on risk. However, in the present case making this adjustment would not change the ranking of the decile averages at all, since each average is reduced by the same amount. The Avg/SD ranking could shift somewhat, because the standard deviations differ across deciles, but the Spearman rank correlation test should be fairly robust to this factor.
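A quick Python check of the adjustment, using the Avg and SD values for deciles 9 and 7 from the post-2004 table. The quarterly T-bill rates are hypothetical values chosen only for illustration, and the function name is an assumption. Subtracting the same rate from every average cannot reorder the averages, but because the standard deviations differ, two deciles with close ratios can trade places once the rate is large enough.

```python
def sharpe(avg, sd, tbill=0.0):
    """(average - tBillRate) / stdev, all in percent per quarter."""
    return (avg - tbill) / sd

# Avg and SD for deciles 9 and 7 from the post-2004 table
decile_9 = (2.1, 4.3)
decile_7 = (2.0, 3.9)

# Hypothetical quarterly T-bill rates, chosen only for illustration
for tbill in (0.0, 0.5, 1.2):
    s9 = sharpe(decile_9[0], decile_9[1], tbill)
    s7 = sharpe(decile_7[0], decile_7[1], tbill)
    leader = "decile 9" if s9 > s7 else "decile 7"
    print(f"tbill={tbill:.1f}%  decile 9: {s9:.3f}  decile 7: {s7:.3f}  higher: {leader}")
```

For this pair the order flips at a rate a little above 1% per quarter; a swap between two neighboring ranks moves a ten-point Spearman rho only slightly.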

Victor and Laurel rejoin:

We suspect, as does Russell Sears, who ran a four minute mile and is always on target, that Greenblatt isn’t as careful with his data as he would lead us to believe, and that a student did it for him, and that there are millions of multiple comparisons involved in his original work. It doesn’t make sense that you could make a profit without a forward earnings estimate, and that you would be paid just for assuming things so close to cost, with little risk.

Robert Pinchuk adds:

I concur with the essence of your doubts (What!, no expectations!?) even with Prof. Pennington’s detailed validation. Haugen also re-did Greenblatt’s work verbatim on his (cleaner? better?) database (written up in Barron’s some time ago) and derived some different numbers — but not wildly different. But then Haugen is touting advice-for-profit of nearly the same kind, so there are caveats. But, Haugen is not dishonest, and the advice he sells also does carry expectational measures that help him squeeze more alpha with less variance (so he says), as we would both expect.

I am hesitant to disagree with you that the market rarely offers “freebies” for naively assuming risk, but I cannot help but ruminate upon the question: “Do the results make sense?” Bogus data, future information, dredging and questionable strategy heuristics aside, “loss-aversion” and “disposition effects” are powerful anomaly creators, especially in combination with feedback trading. I will grant you that “The Price Is (rather often) Right”, especially when conflicted with sparse non-price time-series data. Maybe elevated short-interest levels will soon make these disappear too, or at least delay gratification for a sufficiently demoralizing period of time.

One nagging thought: Is there really, as you suggest, such “little risk” in the undertaking? I think one might be surprised by the qualitative “risk”, when anecdotally assessed over time. Someone like Lakonishok might answer: “How can they be riskier if they produce more return?” But this seems insufficient. Risk, like HIV, can hide or remain dormant for extended periods (e.g. inflation in the 90s). I posit that there is risk being shouldered, but perhaps it’s different (i.e., a different array of factor risks) in each epoch, so it’s hard if not impossible to systematically isolate, let alone forecast. How can one measure the risk of buying a Chapter 11 candidate concurrent with potential deflation? It’s binary. Perhaps it’s just this embedded tail risk which, as for a reinsurance company, is good business to write if properly priced (The Reversion Trader?). And perhaps one day the inherent risk will manifest itself and thereafter dissuade anyone from naively pursuing The Magic Formula. Then again, maybe there is just a preponderance of traders with differing forms of myopia.

By the way, Prof. Pennington’s high/low return spread numbers for RoC seem elevated. The E/P spreads look about right, but it remains the inferior value proxy. “Quality” in general seems more efficiently priced.

