By Corey Hoffstein, Newfound Research

When an event with the smallest odds meets a long enough horizon, or large enough sample size, the occurrence becomes a near certainty.

Flipping 50 heads in a row with a fair coin has the infinitesimally small probability of 8.9e-16 (smaller than a millionth of a millionth of a percent).  Try it 5,184,960,683,398,419 times, however, and there is a large degree of certainty you’ll see the magic 50 heads (<1% chance you will not).  Whether you see it on the first trial, the tenth, the 900 billionth – or get unlucky and do not see it at all – has nothing to do with skill, just randomness.
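The arithmetic behind those two figures can be checked in a few lines. This is a minimal sketch, assuming independent attempts; the variable names are ours, not part of the original analysis.

```python
import math

# Probability of 50 heads in a row on a single attempt with a fair coin.
p_streak = 0.5 ** 50

# Smallest number of independent attempts n such that the chance of never
# seeing the streak, (1 - p)^n, falls to 1%. log1p is used because p is only
# a few multiples of machine epsilon, so precision matters here.
n_attempts = math.log(0.01) / math.log1p(-p_streak)

print(f"p = {p_streak:.2e}")
print(f"attempts needed = {n_attempts:,.0f}")
```

The same formula, n = ln(0.01) / ln(1 - p), applies to any rare event repeated across independent trials.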

With this in mind, we ask a simple question: Is Warren Buffett’s track record evidence of inefficient markets or was it eventually guaranteed by randomness?

Let’s put some numbers behind it. Specifically, we will take the monthly returns of BRK-A[1] and perform a standard CAPM regression to identify Buffett’s alpha (i.e. his excess return), his beta (i.e. sensitivity to the market), and his idiosyncratic risk (i.e. the random stuff left over).[2]

With a long-term beta of just 0.65, Buffett has generated an annualized alpha of 9.86% with idiosyncratic volatility of 19.26%.
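For readers who want to see the mechanics, a CAPM regression of this kind might be sketched as below. The function name, the synthetic inputs, and the market parameters (0.6% monthly mean, 4.5% monthly volatility) are purely illustrative assumptions; this is not the BRK-A data or the code behind the figures above. Annualizing alpha by 12 and volatility by the square root of 12 is also a simplifying convention.

```python
import math
import random

def capm_regression(asset_excess, market_excess):
    """Ordinary least squares of monthly asset excess returns on market excess returns."""
    n = len(asset_excess)
    mean_x = sum(market_excess) / n
    mean_y = sum(asset_excess) / n
    sxx = sum((x - mean_x) ** 2 for x in market_excess)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(market_excess, asset_excess))
    beta = sxy / sxx                      # sensitivity to the market
    alpha_monthly = mean_y - beta * mean_x  # intercept, i.e. excess return
    residuals = [y - alpha_monthly - beta * x
                 for x, y in zip(market_excess, asset_excess)]
    resid_var = sum(e * e for e in residuals) / (n - 2)
    return {
        "alpha_annual": 12 * alpha_monthly,
        "beta": beta,
        "idio_vol_annual": math.sqrt(12 * resid_var),  # the random stuff left over
    }

# Synthetic example only: 456 months of made-up returns built from the
# article's headline numbers plus hypothetical market parameters.
rng = random.Random(0)
months = 456
market = [rng.gauss(0.006, 0.045) for _ in range(months)]
asset = [0.0986 / 12 + 0.65 * m + rng.gauss(0.0, 0.1926 / math.sqrt(12))
         for m in market]

fit = capm_regression(asset, market)
print(fit)
```

Even with 38 years of monthly data, the estimated alpha in a sketch like this carries a standard error of several percentage points per year, which is part of why the luck-versus-skill question is hard to settle.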

If we assume a null hypothesis that his actual alpha is 0% – implying that his realized 9.86% was the result of luck and not skill – the probability of seeing a result this good, or better, is a mere 0.07%.[3]
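A rough way to reproduce the order of magnitude of that probability is a one-sided z-test on the annualized alpha. The sketch below uses a plain normal approximation and assumes a 38-year sample (1980 through roughly 2018); the original calculation assumes log-normal returns, so its exact figure may differ somewhat from what this prints.

```python
import math

alpha = 0.0986      # annualized alpha quoted in the article
idio_vol = 0.1926   # annualized idiosyncratic volatility quoted in the article
years = 38          # assumption: sample length in years

# Under the null of zero true alpha, the realized average alpha over `years`
# years has standard error idio_vol / sqrt(years).
z = alpha * math.sqrt(years) / idio_vol

# One-sided upper-tail probability of a standard normal at z.
p_value = 0.5 * math.erfc(z / math.sqrt(2))

print(f"z = {z:.2f}, p = {p_value:.4%}")
```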

Which seems exceedingly low.  By conventional statistical standards, we would reject the null hypothesis that Buffett’s alpha is equal to zero with 99.93% confidence.  Note that we did not say that we accept that Buffett has a positive alpha.  In statistics we only reject or fail to reject.  This is a nuance, but we believe the language holds important meaning: there is always a probability our interpretation of the data is wrong.

Another way of looking at this data, however, is to say that if another investor had started on the same date in 1980 and invested randomly with a constant 19.26% idiosyncratic volatility, there is a 99.93% chance they would have underperformed Buffett’s track record through today.

Tiny probabilities compound rapidly in large groups when the samples are independent, however.  So how many investors would need to invest in such a manner before there is a <1% chance that they all underperform Buffett?

Just 5,875.
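The headcount follows from the same logic as the coin-flip example: if each independent investor beats the record with probability p, the chance that all n fall short is (1 - p)^n. A minimal sketch, with hypothetical helper names:

```python
import math

def investors_needed(p_beat, max_all_fail=0.01):
    """Smallest n with (1 - p_beat)^n <= max_all_fail, assuming independence."""
    return math.ceil(math.log(max_all_fail) / math.log1p(-p_beat))

def implied_p(n_investors, max_all_fail=0.01):
    """Per-investor probability implied by a given headcount."""
    return 1.0 - max_all_fail ** (1.0 / n_investors)

n_rounded = investors_needed(0.0007)   # using the rounded 0.07% figure
p_implied = implied_p(5875)            # backing out p from the quoted 5,875

print(n_rounded, f"{p_implied:.4%}")
```

Plugging in the rounded 0.07% gives a somewhat larger headcount (roughly 6,600), while 5,875 corresponds to a per-investor probability of about 0.078%. That is consistent with the quoted 0.07% being a rounded version of the underlying p-value, though this is our inference rather than something the text states.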

All of a sudden, Buffett does not seem quite as special.

Below we simulate exactly this concept. Starting in 1980, we simulate out the excess returns of 10,000 investors who have zero expected alpha but an idiosyncratic volatility of 19.26%.  Those investors that outperformed Buffett’s track record (blue) are highlighted in orange.
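A simulation along these lines might look like the following sketch. It is not the code used to produce the chart: it assumes additive (log) excess returns, independent normally distributed monthly shocks, and a 38-year horizon, and the function name and seed are our own.

```python
import math
import random

def simulate_investors(n_investors=10_000, years=38, idio_vol=0.1926,
                       hurdle_alpha=0.0986, seed=42):
    """Count zero-alpha investors whose cumulative excess return beats the hurdle."""
    rng = random.Random(seed)
    months = years * 12
    monthly_vol = idio_vol / math.sqrt(12)
    hurdle = hurdle_alpha * years   # cumulative excess return to beat
    winners = 0
    for _investor in range(n_investors):
        # Sum of independent zero-mean monthly shocks: pure luck, no skill.
        total = sum(rng.gauss(0.0, monthly_vol) for _month in range(months))
        if total > hurdle:
            winners += 1
    return winners

winners = simulate_investors()
print(f"{winners} of 10,000 simulated investors beat the hurdle")
```

Because the count of winners is itself random, different seeds will give different answers; the point is only that a handful of purely lucky investors out of 10,000 is unsurprising under these assumptions.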

Surely there were more than 10,000 investors in 1980. So where are the other lucky multi-billionaires?

First, we need 10,000 investors that have survived. We mean this both in the metaphorical sense and the literal, morbid sense.  In the metaphorical sense, we would need 10,000 investors that all stuck with their process to target that idiosyncratic volatility and never gave up along the way.  For those running money professionally, they would have to stay in business through all the ups and downs.[4]  In the more morbid sense, we would also need 10,000 investors who literally survived the entire period. Warren Buffett was 49 at the beginning of 1980. At that age, the average remaining life expectancy at the time was 30 years, and there is a less than 25% chance of making it the full 38 years.[5]

The second condition of staying in business might be the biggest catch.  Based upon how we conducted our test, we need 10,000 investors all generating a consistent 19.26% idiosyncratic volatility and whose idiosyncratic returns are independent from one another.  Investors may be able to generate that level of idiosyncratic volatility through highly concentrated portfolios, but it might be quite difficult to make the portfolios all generate truly independent returns.  Especially if we are constrained to long-only implementations.

Now, it is worth pointing out that the returns of BRK-A are not those of your typical long-only mutual fund.  Private holdings, leverage via re-insurance float, dabbling in special situations, and derivatives use mean that we would expect a very different profile than that of your normal stock-picking mutual fund.

Nevertheless, given that the Russell 3000 represents 98% of all U.S. public equity securities, it would be impossible to get 10,000 independent bets.  Even if each portfolio held a single stock and the excess returns of all those stocks were truly independent from one another (hint: they are not), we’re still 7,000 portfolios short.

So where does this leave us?

Conclusion

Confused. That’s where we’re left.

But that’s rather the point of this entire commentary.  Market and performance analyses are not black-and-white.  When evaluated as a probability based on the final results, Buffett’s performance looks exceptional.  When evaluated in the context of a large number of investors, it looks inevitable.

Yet, when we add a bit more realistic texture to our assumptions, we realize that what made the results seem inevitable is infeasible.

Numbers such as p-values and long-term average returns can seem incredibly precise, but precision and accuracy are two distinct concepts.  We should be careful not to mistake the former for the latter. We can quote decimal points out to ridiculous precision and still be completely off target.

Exact numbers have more meaning when they refer to realized outcomes or deterministic (i.e. non-random) future events. Any statistical prediction where randomness is involved is best interpreted in the context of uncertainty.

As in our Buffett example, sometimes the qualitative aspects can be more informative than the seemingly accurate quantitative statistics.

For instance, even the two sample investors that beat Buffett over the long run underperformed him over different 1-, 3-, and 5-year periods. If you were one of those investors, would you have abandoned your method in those years?  If you were a manager, would investors have stuck with you?

Our intention in this commentary is not to come down on either side of the debate about Buffett’s alpha.  Rather, the point is only to serve as a reminder that statistical models inherently rely on a large number of assumptions.  Without understanding what those assumptions are, why they are made, and what the implications mean, the results should be taken with a grain of salt.

By relying on a poorly understood model or a statistic taken out of context, a decision could be made that might prevent the next Warren Buffett from emerging in the investment community.

Corey Hoffstein is the Co-founder & CIO at Newfound Research, a participant in the ETF Strategist Channel.

  1. Our data goes back to 1980 and we use data from the Kenneth French data library to run our regression.
  2. We are using a simple CAPM regression instead of a less parsimonious Fama-French 3-Factor, Fama-French 5-Factor, or q-factor model simply because these models did not exist in 1980, and therefore an active investor could not have distinguished between an active investment choice and a factor exposure.
  3. Assuming returns are log-normally distributed.  We are aware of the interpretability issues surrounding p-values.  In this case, however, we are specifically making a statement under the assumption that the null hypothesis is true.  We think it is worth bearing in mind, however, that p-values often lose their value as a measurement of false positive risk when we do not know whether the null hypothesis is true or not.  Which, we should point out, is almost always the case: otherwise, why would we bother running the test?  We do not believe this meaningfully affects the point of this commentary, but a more nuanced discussion can be read at https://lucklab.ucdavis.edu/blog/2018/4/19/why-i-lost-faith-in-p-values.
  4. A private capital structure where investors cannot redeem may be one of Buffett’s greatest tricks allowing him to compound wealth.  We’d speculate that were he running a public fund, his assets would have dwindled greatly on many an occasion.
  5. https://www.cdc.gov/nchs/data/lifetables/life80_2acc.pdf