Inferring the Statistics of Buffett's Alpha

First, we need 10,000 investors that have survived.  We mean this in both the metaphorical sense and the literal, morbid sense.  In the metaphorical sense, we would need 10,000 investors who all stuck with their process to target that idiosyncratic volatility and never gave up along the way.  Those running money professionally would have to stay in business through all the ups and downs.[4]  In the more morbid sense, we would also need 10,000 investors who literally survived the entire period.  Warren Buffett was 49 at the beginning of 1980.  At that age, the average remaining life expectancy at the time was 30 years, and there was a less than 25% chance of surviving the full 38 years.[5]

The second condition of staying in business might be the biggest catch.  Based upon how we conducted our test, we need 10,000 investors who all generate a consistent 19.26% idiosyncratic volatility and whose idiosyncratic returns are independent of one another.  Investors may be able to generate that level of idiosyncratic volatility through highly concentrated portfolios, but it might be quite difficult to make the portfolios all generate truly independent returns, especially if we are constrained to long-only implementations.
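To see why the independence requirement matters, here is a minimal simulation sketch.  It is not the test used in this commentary.  It assumes the 19.26% figure is an annualized volatility, that annual idiosyncratic returns are normally distributed with zero mean (a simplification), and that any dependence among investors comes from a single shared shock with a hypothetical pairwise correlation `rho`.

```python
import numpy as np

N_INVESTORS = 10_000   # number of hypothetical investors (from the text)
N_YEARS = 38           # length of the period in years (from the text)
IDIO_VOL = 0.1926      # idiosyncratic volatility, ASSUMED to be annualized


def simulate_mean_idio_returns(rho, seed=0):
    """Average annual idiosyncratic return for each simulated investor.

    rho is a HYPOTHETICAL pairwise correlation between investors'
    idiosyncratic returns, induced by one shared shock per year.
    """
    rng = np.random.default_rng(seed)
    common = rng.standard_normal((1, N_YEARS))               # shared shock
    specific = rng.standard_normal((N_INVESTORS, N_YEARS))   # investor-specific shock
    # Each investor's shock has unit variance and pairwise correlation rho.
    shocks = np.sqrt(rho) * common + np.sqrt(1.0 - rho) * specific
    return (IDIO_VOL * shocks).mean(axis=1)


independent = simulate_mean_idio_returns(rho=0.0)
correlated = simulate_mean_idio_returns(rho=0.5)

# Cross-sectional dispersion of long-run average idiosyncratic returns.
spread_independent = independent.std()
spread_correlated = correlated.std()

print(f"dispersion with independent investors: {spread_independent:.4f}")
print(f"dispersion with rho = 0.5:             {spread_correlated:.4f}")
```

Under these assumptions, correlated investors end up bunched more tightly around whatever the shared shock delivered, so fewer of them land in the extreme tail purely by luck.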

Now, it is worth pointing out that the returns of BRK-A are not those of your typical long-only mutual fund.  Private holdings, leverage via re-insurance float, dabbling in special situations, and the use of derivatives mean that we would expect a very different profile than that of your normal stock-picking mutual fund.

Nevertheless, given that the Russell 3000, with its roughly 3,000 constituents, represents about 98% of the investable U.S. public equity market, it would be impossible to get 10,000 independent bets.  Even if each portfolio held a single stock and the excess returns of all those stocks were truly independent of one another (hint: they are not), we’re still 7,000 portfolios short.
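A rough way to quantify the "hint" above is the standard approximation for the effective number of independent bets among equally correlated positions, N / (1 + (N - 1) * rho).  The sketch below is illustrative only: the function name and the average correlation values are hypothetical and are not estimated from any data in this commentary.

```python
def effective_bets(n_assets, avg_correlation):
    """Effective number of independent bets among equally correlated assets.

    Follows from the variance of an equal-weighted average of n_assets
    variables that share one common pairwise correlation.
    """
    if not 0.0 <= avg_correlation <= 1.0:
        raise ValueError("avg_correlation must be between 0 and 1")
    return n_assets / (1.0 + (n_assets - 1) * avg_correlation)


N_STOCKS = 3_000  # approximate size of the Russell 3000 universe

# HYPOTHETICAL average pairwise correlations of excess returns.
for rho in (0.0, 0.01, 0.05, 0.10):
    print(f"rho = {rho:.2f}: about {effective_bets(N_STOCKS, rho):,.1f} effective bets")
```

Even a small positive average correlation shrinks 3,000 single-stock portfolios to a far smaller number of effectively independent bets, which makes the shortfall against 10,000 even larger.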

So where does this leave us?


Confused. That’s where we’re left.

But that’s rather the point of this entire commentary.  Market and performance analyses are not black-and-white.  When evaluated as a probability based on the final results, Buffett’s performance looks exceptional.  When evaluated in the context of a large number of investors, it looks inevitable.

Yet, when we add a bit more realistic texture to our assumptions, we realize that what made the results seem inevitable is infeasible.
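The gap between "exceptional" and "inevitable" comes down to simple arithmetic: if a single investor has probability p of matching a result by luck, the chance that at least one of N independent investors does so is 1 - (1 - p)^N.  The sketch below uses a hypothetical p rather than the p-value from our test, and it leans entirely on the independence assumption questioned above.

```python
def prob_at_least_one(p_single, n_independent):
    """Chance that at least one of n independent investors gets lucky."""
    return 1.0 - (1.0 - p_single) ** n_independent


P_SINGLE = 0.001  # HYPOTHETICAL single-investor probability, not our p-value

# Fewer truly independent investors means the outcome is less "inevitable".
for n in (1, 10, 100, 3_000, 10_000):
    chance = prob_at_least_one(P_SINGLE, n)
    print(f"{n:>6,} independent investors: {chance:.4f}")
```

The same p that looks remarkable for one investor becomes close to a sure thing across 10,000 independent ones, and far less certain once the number of truly independent bets shrinks.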

Numbers such as p-values and long-term average returns can seem incredibly precise, but precision and accuracy are two distinct concepts.  We should be careful not to mistake the former for the latter. We can quote decimal points out to ridiculous precision and still be completely off target.

Exact numbers have more meaning when they refer to realized outcomes or deterministic (i.e. non-random) future events. Any statistical prediction where randomness is involved is best interpreted in the context of uncertainty.

As in our Buffett example, sometimes the qualitative aspects can be more informative than the seemingly accurate quantitative statistics.

For instance, even the two sample investors that beat Buffett over the long run underperformed him over various 1-, 3-, and 5-year periods.  If you were one of those investors, would you have abandoned your method in those years?  If you were a manager, would investors have stuck with you?

Our intention in this commentary is not to come down on either side of the debate about Buffett’s alpha.  Rather, the point is only to serve as a reminder that statistical models inherently rely on a large number of assumptions.  Without understanding what those assumptions are, why they are made, and what they imply, the results should be taken with a grain of salt.

By relying on a poorly understood model or a statistic taken out of context, a decision could be made that might prevent the next Warren Buffett from emerging in the investment community.

Corey Hoffstein is the Co-founder & CIO at Newfound Research, a participant in the ETF Strategist Channel.

  1. Our data goes back to 1980 and we use data from the Kenneth French data library to run our regression.
  2. We are using a simple CAPM regression instead of a less parsimonious Fama-French 3-Factor, Fama-French 5-Factor, or q-factor model simply because these models did not exist in 1980, and therefore an active investor could not have distinguished between an active investment choice and a factor exposure.
  3. Assuming returns are log-normally distributed.  We are aware of the interpretability issues surrounding p-values.  In this case, however, we are specifically saying under the assumption of the null hypothesis being true.  We think it is worth bearing in mind, however, that p-values often lose their value as a measurement of false positive risk when we do not know whether the null hypothesis is true or not.  Which, we should point out, is almost always the case: otherwise, why would we bother running the test?  We do not believe this meaningfully affects the point of this commentary, but a more nuanced discussion can be read at
  4. A private capital structure where investors cannot redeem may be one of Buffett’s greatest tricks allowing him to compound wealth.  We’d speculate that were he running a public fund, his assets would have dwindled greatly on many an occasion.