Factor Investors Strongly Believe in History

If you’re at all interested in investing, you’ve probably heard something about “factor investing”.  It’s not a term I’ve used much in Mindfully Investing posts, but many of the investing questions I’ve addressed could be re-framed as questions about investing “factors”.  In this two-part series, I’ll describe factor investing and discuss whether the mindful investor should pursue this strategy.

What’s a Factor?

A factor is any characteristic of a company or its stock that might have a predictable relationship to that stock’s returns over time.  Although “factor” is a relatively recent term of art in the investing world, the roots of factor investing go back to at least the early 20th century, when John Burr Williams created a method to estimate stock prices based on a company’s intrinsic value.  Famous investors like Benjamin Graham and Warren Buffett used characteristics like hidden cash on the books and predictable cash flows to help select stocks.

The name “factors” came from several foundational studies starting in the 1970s that identified multiple characteristics with an apparent linkage to superior stock returns.  Funds that focus on stocks with similar factors now account for 10% of the U.S. stock market capitalization.  Most of these funds attempt to achieve returns or risk-adjusted returns that are superior to the broader market.  That excess return is often loosely referred to as a factor “premium”.

Five of the most commonly cited factor premiums are shown in this table.

| Factor Name | Hypothesis | Type | Common Metrics |
|---|---|---|---|
| Size | Small cap stocks perform better than large cap stocks | Fundamental | Market Capitalization |
| Value | Stocks with low price per value metrics perform better than those with high metrics | Fundamental | Price/Book Value, Price/Earnings, Price/Free Cash Flow |
| Quality | Stocks with healthy accounting metrics perform better than those with unhealthy metrics | Fundamental | Profitability, Margins, Leverage, Financial Constraints and Distress, Earnings Stability, Accounting Quality |
| Momentum | Stocks with recent price increases will perform better than those with recent price decreases | Price Movement | Stock Prices Over Time |
| Volatility | Stocks with lower price volatility perform better than those with higher volatility | Price Movement | Stock Prices Over Time |

If factors like these provide an exciting premium above the boring broad-market return, why doesn’t everyone invest in factor funds?  One clue is that factor premiums are also generally referred to as “risk premiums”.  That is, increasing your portfolio exposure to any of these factors to obtain increased returns will also likely increase your risks, particularly in the form of increased volatility.  But mindful investors aren’t particularly worried about small increases in volatility.  So, wouldn’t that make factor investing ideal for the mindful investor?  To answer that question, we need to consider the historical evidence supporting the existence of factor premiums.

Limited Historical Data

I’ve written in the past about the difficulty of using historical data to predict the future.  For good reason, every investing brochure has the disclaimer, “past results do not guarantee future returns”.  The total reliable stock market history in the U.S. is about 147 years.  Nowadays, a person’s retirement investing time frame can easily exceed 50 years (for example, starting at age 30 and living to 90).  So, the total U.S. stock history represents only about three non-overlapping lifetime investing periods.  That’s a pretty small sample size to make predictions from, but that doesn’t stop people from trying.

Further, much of the historical data has been cobbled together in ways that introduce more variables to the analysis.  Consider Edward McQuarrie’s stunning example of the history of the small cap data from the Center for Research in Security Prices (CRSP) database, which is one of the most commonly used data sets for factor analyses.  Rarely mentioned facts about the CRSP small cap data include:

  • Prior to 1972, 95% of small cap stocks are not included in the data set!
  • Most of the 5% that are included were actually former large-cap companies that had fallen on hard times.
  • Much of the legendary long-term superior performance of small cap stocks before 1945 was due to a few of these once-large companies recovering from the depths of the Great Depression.

In other words, the data on “small caps” before 1972 is of questionable value for predicting small cap performance in the future, and for this reason, many of the size factor analyses you’ll see go back no further than 45 years.  This represents just a single lifetime investment period, or a sample size of exactly one.  Not much to work with.

Although I didn’t attempt to drill into the data history of every factor, I tend to agree with Jack Bogle’s view that there are inherent difficulties with assembling consistent historical factor data, particularly the further back you go in history.

Data Mining at the Factor Zoo

Faced with these data limitations and strong business incentives to “improve” investing methods, questionable factor studies have flourished in the last decade.  Wesley Gray recently pointed out some of the more problematic methods that are particularly prevalent in finance and factor research:

  • Researchers and publishers are most excited by positive results and rarely publish negative results.
  • P-hacking is a broad term for various statistical manipulations that can cause factor premiums to appear more robust and pervasive.
  • Data mining is another broadly used term that’s sometimes used interchangeably with p-hacking.  In my view, data mining is more about torturing the data to confess any possible positive relationship, without considering whether such a relationship is reasonable or not.
  • Although these problems exist in many scientific fields, economics, business, and social science research are among the worst offenders, and most factor research falls into one or more of these categories.

A recent paper explicitly engaged in large-scale data mining, in part to illustrate how absurd the results can be.  The researchers evaluated two million stock factors and produced over 20,000 “significant” factors using standard statistical thresholds.  Within that group they identified 17 factors (many of which had never been described before) that showed much stronger statistical relationships, but none of which had any theoretical or common-sense underpinning.  One example is a premium based on sorting stocks into common versus ordinary stocks, subtracting out retained earnings and other adjustments, and then dividing by advertising expense.  If that makes sense to you as a cause of excess stock returns, please leave a comment and explain it to me.  In other words, these “new factors” were just lucky high correlations among two million random combinations of historical stock metrics.  There’s no underlying reason to believe these premiums really exist now or will persist into the future.
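To see why sheer volume produces “significant” factors, here is a minimal simulation sketch.  It is not the paper’s method: the returns and the candidate “factors” are all synthetic random noise, the function names are hypothetical, and the cutoff of |t| > 2 is my assumption for a “standard statistical threshold”.  Because nothing here has any real relationship to anything else, every hit is pure luck, yet roughly one candidate in twenty still clears the bar.

```python
import math
import random

def t_stat_of_correlation(xs, ys):
    """t-statistic for the Pearson correlation between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    r = cov / math.sqrt(var_x * var_y)
    return r * math.sqrt((n - 2) / (1 - r * r))

def count_spurious_factors(n_factors=2000, n_years=45, threshold=2.0, seed=0):
    """Count random 'factors' that look significant against random 'returns'."""
    rng = random.Random(seed)
    # Synthetic yearly returns: pure noise, NOT real market data.
    returns = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
    hits = 0
    for _ in range(n_factors):
        # Each candidate factor is also pure noise, unrelated to the returns.
        factor = [rng.gauss(0.0, 1.0) for _ in range(n_years)]
        if abs(t_stat_of_correlation(factor, returns)) > threshold:
            hits += 1
    return hits

hits = count_spurious_factors()
print(f"{hits} of 2000 random 'factors' passed the significance cutoff")
```

Scaling the number of candidates up simply scales the number of lucky hits with it, which is why a search over millions of combinations will always find something.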

All this p-hacking and data mining is adding to an ever-growing “factor zoo”.  A recent study reviewed 447 factor premiums in the research literature and found that:

  • 286 (64%) of the 447 factors don’t meet standard statistical thresholds for validity
  • Of the 161 remaining statistically significant factors, the magnitudes of the premiums are often much less than originally reported.
  • Using a different factor model, which the researchers claim is more robust, causes 115 of the 161 factor premiums to fail tests of statistical significance.
  • That leaves 46 factors (only 10%) that are replicable.

Campbell Harvey is one researcher sounding the alarm at the factor zoo.  He wrote in 2014 that:

  • “Most of the empirical research in finance is likely false.  This implies that half the financial products (promising outperformance) that companies are selling to clients are false.”

Other researchers have made similar broad criticisms that much of the recent finance research and the resulting investing products are suspect.

Factor Persistence

Even when the historical data are relatively consistent and the statistical methods used are replicable, I’ve noted before that factor premiums are erratic and have waxed and waned over their short history.  This table summarizes estimates from a few researchers of the percentage of time that positive factor premiums occurred during various time frames.

| Study | Factor | 1-Year Periods | 10-Year Periods | Period Type | Span | Years |
|---|---|---|---|---|---|---|
| Morningstar | Value-Large Cap | 50% | 65% | Rolling | 1990-2015 | 25 |
| Morningstar | Value-Small Cap | 54% | 82% | Rolling | 1990-2015 | 25 |
| AAII | Size (Small) | 56% | 66% | Rolling | 1926-1996 | 70 |
| The BAM Alliance | Size (Small) | 59% | 77% | Discrete | 1927-2017 | 90 |
| The BAM Alliance | Value | 63% | 86% | Discrete | 1927-2017 | 90 |
| The BAM Alliance | Momentum | 72% | 97% | Discrete | 1927-2017 | 90 |

One implication from these studies is that the longer you invest, the greater your chances of realizing the desired factor premium.  However, Charlie Bilello points out that small cap outperformed large cap in about 78% of the years from 1979 to 2015, yet small caps had a lower annualized return over that same period (11.4% versus 11.7% for large caps).  So, the percentage of years with positive premiums is no guarantee of better overall returns.
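A toy calculation shows how this can happen.  The numbers below are made up for illustration (they are not the 1979 to 2015 data), and the helper names are hypothetical.  The small cap series wins four years out of five by a single percentage point, then loses badly once, and compounding lets that one bad year outweigh all the small wins.

```python
def cagr(annual_returns):
    """Compound annual growth rate for a list of yearly returns (0.10 = 10%)."""
    growth = 1.0
    for r in annual_returns:
        growth *= 1.0 + r
    return growth ** (1.0 / len(annual_returns)) - 1.0

def win_rate(asset, benchmark):
    """Fraction of years in which the asset beat the benchmark."""
    wins = sum(1 for a, b in zip(asset, benchmark) if a > b)
    return wins / len(asset)

# Hypothetical yearly returns, chosen only to make the point.
large_cap = [0.10, 0.10, 0.10, 0.10, 0.10]
small_cap = [0.11, 0.11, 0.11, 0.11, -0.05]

print(f"Small cap won {win_rate(small_cap, large_cap):.0%} of years")
print(f"Small cap CAGR: {cagr(small_cap):.2%}")
print(f"Large cap CAGR: {cagr(large_cap):.2%}")
```

The win rate counts every year equally, while the annualized return weights each year by its size, so the two measures can point in opposite directions.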

Chances of a Premium – Using annual returns data from Portfolio Visualizer, I calculated the annualized returns (Compound Annual Growth Rate or CAGR) for all possible investing start and stop dates from 1972 to 2018 for large cap, small cap, and large cap value stocks.  The resulting investing periods vary in length from 1 year to the entire 46-year span of this data set, with every possible combination of time spans in between.  This resulted in annualized returns for a total of 1128 possible investing periods.  I like this method of looking across many time spans because it gives a more realistic assessment of the probabilities of success.  It’s easy to plan to invest for 10 or 20 years, but life often intervenes with emergencies, new business opportunities, and surprise expenses that cause well-intentioned investors to sell before they planned.
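For readers who want to try this themselves, here is a rough sketch of the calculation, assuming one return per calendar year.  The return series are synthetic placeholders rather than the Portfolio Visualizer data, and the function names are my own.  Note that a count of 1128 periods corresponds to 47 annual returns (1972 through 2018 inclusive), because 47 × 48 / 2 = 1128.

```python
import random
from itertools import combinations

def cagr(annual_returns):
    """Compound annual growth rate for a list of yearly returns (0.10 = 10%)."""
    growth = 1.0
    for r in annual_returns:
        growth *= 1.0 + r
    return growth ** (1.0 / len(annual_returns)) - 1.0

def all_period_premiums(factor_returns, benchmark_returns):
    """Annualized premium for every contiguous holding period.

    Keys are (start, stop) index pairs; the period covers years start..stop-1.
    """
    n = len(factor_returns)
    premiums = {}
    # Choose two of the n + 1 year boundaries: one to buy at, one to sell at.
    for start, stop in combinations(range(n + 1), 2):
        premiums[(start, stop)] = (
            cagr(factor_returns[start:stop]) - cagr(benchmark_returns[start:stop])
        )
    return premiums

# Synthetic placeholder returns for 47 calendar years (NOT real market data).
rng = random.Random(42)
large_cap = [rng.uniform(-0.30, 0.40) for _ in range(47)]
small_cap = [rng.uniform(-0.30, 0.40) for _ in range(47)]

premiums = all_period_premiums(small_cap, large_cap)
share_negative = sum(1 for p in premiums.values() if p < 0) / len(premiums)
print(len(premiums))  # 1128 periods for 47 annual returns
print(f"{share_negative:.0%} of periods had a negative premium")
```

Swapping in real annual returns for the two placeholder lists is all that’s needed to reproduce this kind of analysis.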

Here are histograms of the premiums (factor annualized returns minus large cap annualized returns) for the size factor (small caps) and the value factor.  In both cases, the most likely historical result was a factor premium of between 0 and 1%.  (Note these are annualized premiums over the entire investing period, not annual premiums received each year.)  For small cap, 28% of the outcomes were a negative premium, and for value, 25% of the outcomes were negative.

Put another way, a negative annualized premium between 0 and -1% was the third most likely historical outcome for value and the fourth most likely outcome for small cap.
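The histogram binning itself is simple.  This is a small sketch, assuming bins one percentage point wide; the premiums in the example are invented values for illustration only, and the function name is hypothetical.

```python
import math
from collections import Counter

def bin_premiums(premiums, bin_width=0.01):
    """Count premiums per bin; bin k covers [k * bin_width, (k + 1) * bin_width)."""
    counts = Counter(math.floor(p / bin_width) for p in premiums)
    return dict(sorted(counts.items()))

# Invented annualized premiums (0.004 = 0.4 percentage points).
example_premiums = [0.004, 0.007, -0.003, 0.015, -0.012]

bins = bin_premiums(example_premiums)
for k, count in bins.items():
    print(f"{k}% to {k + 1}%: {count}")

most_likely_bin = max(bins, key=bins.get)
print(f"Most common outcome: {most_likely_bin}% to {most_likely_bin + 1}%")
```

Bin 0 is the “between 0 and 1%” bucket and bin -1 is the “between 0 and -1%” bucket described above.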

Negative Premium Durations – Another way to look at the historical persistence of premiums is to see how long past investors had to endure negative premiums.  Using the same data set of annualized returns from above, I looked at how long the returns of the small cap and value factors were less than the annualized returns for large cap stocks (negative premiums) as shown in these two graphs.  Each date on the horizontal axis of the graphs represents the results for someone buying into the market on that date and holding through 2018.  So, the investing periods get shorter as you move toward the later dates on the graphs.
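Here is a sketch of that buy-and-hold-to-the-end calculation, assuming one return per calendar year.  The three-year return series are invented for illustration, and the function names are hypothetical.

```python
def cagr(annual_returns):
    """Compound annual growth rate for a list of yearly returns (0.10 = 10%)."""
    growth = 1.0
    for r in annual_returns:
        growth *= 1.0 + r
    return growth ** (1.0 / len(annual_returns)) - 1.0

def premium_by_start_year(first_year, factor_returns, benchmark_returns):
    """Annualized premium for buying at each start year and holding to the end."""
    result = {}
    for offset in range(len(factor_returns)):
        result[first_year + offset] = (
            cagr(factor_returns[offset:]) - cagr(benchmark_returns[offset:])
        )
    return result

# Invented returns for 2016, 2017, and 2018 (NOT real market data).
factor = [-0.10, 0.20, 0.10]
benchmark = [0.10, 0.10, 0.10]

by_start = premium_by_start_year(2016, factor, benchmark)
for year, premium in by_start.items():
    print(f"Bought in {year}, held through 2018: premium {premium:+.2%}")

negative_starts = [year for year, p in by_start.items() if p < 0]
print(f"Start years with a negative premium: {negative_starts}")
```

In this toy example, only the investor who bought in 2016 ends up with a negative premium, because the later buyers missed the factor’s bad first year.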