By Corey Hoffstein, Newfound Research

We like to think that, as quants, our beliefs are ultimately swayed by the weight of the evidence. But there are a few artifacts that arise from the data that are, quite simply, just hard for us to get behind.

Seasonality is one of those effects.

Candidly, we had not given seasonality the time of day until recently. Earlier this year, I had the pleasure of speaking at the Democratize Quant conference, hosted by our friends at Alpha Architect. My presentation was on the effects of timing luck, an idea that quite simply says, “when you form your portfolio can have a significant impact on your results.” Our research has largely focused on the randomness of this impact, with the assumption that “when” is not a source of edge, but rather a source of noise that should be diversified away.

After presenting, Alpha Architect’s co-CIOs Wes Gray and Jack Vogel shared with me some of the literature on seasonality, which argues that “when” may actually be a significant source of edge. Many of these papers highlighted the more well-established anomalies – like the turn-of-month, turn-of-quarter, and January effects – which many argue are tied to investor behavior (window dressing and tax-loss harvesting).

These are not insignificant effects. Consider the following image and table from the recently published Fact, Fiction, and the Size Effect by Alquist, Israel, and Moskowitz (2018).

We can see that the size effect is subject to a large seasonality effect.  In fact, the entire premium comes from the month of January.

Another popular anomaly is the Halloween indicator, commonly known by the rhyme “sell in May and go away.”  Jacobsen and Zhang (2014) fail to reject the existence of the anomaly in 65 different countries and replicate a 2002 study to demonstrate that in the 37 countries studied, the effect remains present out-of-sample. Despite significant effort to explain the effect, no explanation survives analysis other than the fact that everybody is on vacation.

Perhaps the most curious effect, however, is from a paper by Keloharju, Linnainmaa, and Nyberg (2015), in which they find that a strategy which selects stocks based upon their historical same-calendar-month returns earns a significant excess premium.

As an example, at the beginning of January, the strategy would look at the current universe of stocks and compare how they performed, on average, in prior Januaries.  The strategy would then buy those stocks that had performed well and short-sell those that had performed poorly.
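To make the signal concrete, here is a minimal Python sketch of a same-calendar-month average. It is illustrative only and is not KLN's code: the function name `same_month_average`, the `{(year, month): return}` layout, and the sample returns are all assumptions made for this example.

```python
def same_month_average(monthly_returns, year, month):
    """Average return earned in `month` across all years before `year`.

    `monthly_returns` maps (year, month) tuples to simple returns; this
    layout is a hypothetical choice for illustration.
    """
    prior = [r for (y, m), r in monthly_returns.items()
             if m == month and y < year]
    if not prior:
        return None  # no same-month history is available yet
    return sum(prior) / len(prior)


# Made-up returns for a single stock, for illustration only.
history = {
    (2015, 1): 0.04, (2015, 2): -0.01,
    (2016, 1): 0.02, (2016, 2): 0.03,
    (2017, 1): -0.03,
}

# Signal used at the start of January 2018: the mean of prior Januaries.
january_signal = same_month_average(history, 2018, 1)
```

Ranking every stock in the universe on this number, then buying the highest and shorting the lowest, gives the long/short portfolio described above.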

They also document similar seasonality effects in anomalies (e.g. accruals, equity issuances, et cetera), commodities, and international stock market indices.  Further, seasonality effects found in different assets are weakly correlated with one another, indicating the potential to diversify across seasonality strategies that employ the same process.

Indeed, this anomaly proves to be economically significant, incredibly robust, and rather pervasive.  So we’ll call it the anomaly that broke the camel’s back, because it has convinced us that seasonality warrants further investigation.

But does it work for sectors?

In this commentary, we will explore whether the anomaly discovered by KLN (2015) proves successful in the context of U.S. sectors.  Using data from the Kenneth French data library, we explore three approaches.

  • Expanding Window: Select sectors each month based upon the full set of prior available data.
  • Rolling Window: Select sectors each month based upon the prior 30 years of data.
  • EWM: Select sectors each month based upon the full set of prior available data, but weight the influence of the data exponentially (with a center-of-mass at 15 years).
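The three approaches differ only in how prior same-month returns are weighted. The sketch below shows one way each average could be computed for a single sector and calendar month. It is an assumption-laden illustration rather than the code behind this commentary: the function names are hypothetical, and the exponential weighting assumes the common convention in which a center-of-mass c maps to a decay of alpha = 1 / (1 + c).

```python
def expanding_mean(history):
    """Equal-weight average of every prior same-month return."""
    return sum(history) / len(history)


def rolling_mean(history, window=30):
    """Equal-weight average of the most recent `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def ewm_mean(history, center_of_mass=15.0):
    """Exponentially weighted average; `history` is ordered oldest first.

    Assumes alpha = 1 / (1 + center_of_mass), so an observation that is
    i years old receives a weight of (1 - alpha) ** i.
    """
    alpha = 1.0 / (1.0 + center_of_mass)
    n = len(history)
    weights = [(1.0 - alpha) ** (n - 1 - i) for i in range(n)]
    weighted_sum = sum(w * x for w, x in zip(weights, history))
    return weighted_sum / sum(weights)


# Made-up July returns for one sector, oldest first.
julys = [0.01, 0.03, -0.02, 0.04]
estimates = {
    "expanding": expanding_mean(julys),
    "rolling": rolling_mean(julys, window=2),
    "ewm": ewm_mean(julys, center_of_mass=15.0),
}
```

With a 15-year center-of-mass, alpha is 0.0625 under this convention, so older observations fade gradually rather than dropping out abruptly as they do at the edge of a rolling window.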

We employ three approaches in an effort to determine (1) whether the effect is robust to specification, and (2) how stable the sector recommendations are over time.

At the beginning of each month, we construct a long/short portfolio that goes long the three sectors with the highest historical average during that month and short the three sectors with the lowest historical average during that month.
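A minimal sketch of that selection step follows. Equal weighting within each leg is an assumption made here (the text above does not specify the weighting), and the function name, sector labels, and averages are made up for illustration.

```python
def long_short_weights(avg_by_sector, n=3):
    """Go long the top-n and short the bottom-n sectors by historical average.

    Assumes equal weights within each leg, so the portfolio is dollar-neutral.
    """
    if len(avg_by_sector) < 2 * n:
        raise ValueError("need at least 2 * n sectors")
    # Sort sector names from highest to lowest historical average.
    ranked = sorted(avg_by_sector, key=avg_by_sector.get, reverse=True)
    weights = {sector: 0.0 for sector in ranked}
    for sector in ranked[:n]:
        weights[sector] = 1.0 / n
    for sector in ranked[-n:]:
        weights[sector] = -1.0 / n
    return weights


# Hypothetical same-month averages for eight sectors.
averages = {
    "Energy": 0.021, "Durables": -0.004, "Utilities": 0.008,
    "Health Care": 0.015, "Manufacturing": -0.010, "Technology": 0.018,
    "Telecom": 0.002, "Shops": -0.001,
}
weights = long_short_weights(averages)
```

In this made-up example the portfolio is long Energy, Technology, and Health Care and short Shops, Durables, and Manufacturing, with the two middle sectors left unheld.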

We employ an overlapping portfolio implementation that allows us to account for the potentially arbitrary nature of month definitions.  While the traditional Gregorian calendar is one way of dividing the year into 12 months, there is no reason we could not arbitrarily shift the definition forward or backward several days.  For example, instead of defining a month using the traditional calendar approach, we could define a month as the 4th to the 4th, or the 21st to the 21st.

By normalizing each month to assume 21 trading days, we can use this approach to create 21 variations of the strategy and determine whether this effect is truly a calendar-based effect or whether there is a more nuanced seasonality. For instance, there could be an effect that hinges on the period from April 15th to May 15th because of tax day.
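One way to picture the shifted month definitions is to compound a daily return series into 21-day blocks that start at different offsets. The sketch below is an illustration under simplifying assumptions, not the implementation used here: `block_returns` is a hypothetical helper, and days before the offset or after the last full block are simply dropped.

```python
def block_returns(daily_returns, offset, days_per_month=21):
    """Compound daily returns into consecutive blocks starting at `offset`.

    Days before `offset` and any trailing partial block are dropped.
    """
    blocks = []
    last_start = len(daily_returns) - days_per_month
    for start in range(offset, last_start + 1, days_per_month):
        growth = 1.0
        for r in daily_returns[start:start + days_per_month]:
            growth *= 1.0 + r
        blocks.append(growth - 1.0)
    return blocks


# Made-up daily returns covering two normalized "months".
daily = [0.01] * 42

# Each offset from 0 to 20 yields a different definition of a month.
variations = {k: block_returns(daily, k) for k in range(21)}
```

Offset 0 corresponds to the conventional month boundary in this stylized setup, while every other offset shifts the boundary forward by that many trading days.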

Below we plot the 21 variations for the rolling window implementation.  While we can see significant variation between the different definitions, all appear to be economically significant over the long run.

The overlapping portfolio implementation can be thought of as, quite simply, the average result of the above variations.  We create an overlapping portfolio implementation for each of the three approaches (expanding, rolling, and exponentially weighted) and plot the results below.
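As a rough sketch of that averaging, the snippet below takes the per-period returns of several variations and averages them period by period. It assumes the variations are aligned on the same periods and that capital is split equally among them each period; the function name and sample numbers are hypothetical.

```python
def overlapping_portfolio(variation_returns):
    """Equal-weight average of several variations' returns, period by period.

    `variation_returns` is a list of equal-length return series, one per
    variation, aligned on the same periods.
    """
    n_periods = len(variation_returns[0])
    if any(len(series) != n_periods for series in variation_returns):
        raise ValueError("all variations must cover the same periods")
    n_variations = len(variation_returns)
    return [
        sum(series[t] for series in variation_returns) / n_variations
        for t in range(n_periods)
    ]


# Three made-up variations over two periods.
combined = overlapping_portfolio([
    [0.010, -0.020],
    [0.030, 0.000],
    [-0.010, 0.005],
])
```

Averaging in this way dilutes the influence of any single month definition, which is the same logic used to diversify away timing luck.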

We can see that while all three variations appear to be economically significant (e.g. the rolling variation generates an annualized return of 4.4% with an annualized volatility of 8.4%), both the rolling and exponentially-weighted approaches significantly outperform the expanding window approach post-1995, indicating that the sectors selected in a given month may not be stable over time.
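For readers who want to reproduce statistics of this kind, the sketch below annualizes a monthly return series. The conventions are assumptions, since the exact ones used for the figures above are not stated: return is annualized geometrically, and volatility is the sample standard deviation scaled by the square root of 12. The function names and sample returns are hypothetical.

```python
import math
import statistics


def annualized_return(monthly_returns):
    """Geometric annualized return from a list of monthly simple returns."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    return growth ** (12.0 / len(monthly_returns)) - 1.0


def annualized_volatility(monthly_returns):
    """Sample standard deviation of monthly returns, scaled by sqrt(12)."""
    return statistics.stdev(monthly_returns) * math.sqrt(12.0)


# Made-up monthly long/short returns covering one year.
sample = [0.02, -0.01] * 6
ann_ret = annualized_return(sample)
ann_vol = annualized_volatility(sample)

# Return-to-volatility ratio implied by the rolling-window figures quoted above.
quoted_ratio = 0.044 / 0.084
```

Taking the quoted figures at face value, 4.4% of return on 8.4% of volatility works out to a return-to-volatility ratio of roughly 0.52.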

Our choice of using a 30-year lookback window is somewhat arbitrary, informed only by prior literature and an attempt to have a meaningful number of observations for deriving rankings.  That said, there is no reason we could not explore the stability of this seasonality effect and its sensitivity to perturbations in the lookback parameter.  Below we plot the rolling variation using a range of lookbacks from 5 to 50 years.

We can see that all seem to offer an economically large edge.  Because the available history for each lookback differs, we can only compare performance across their shared history.  In this case, the 25-year lookback maximizes both annualized return and Sharpe ratio over the period.  The declines in return and Sharpe ratio are not purely monotonic as the lookback moves away from 25 years in either direction, but the 25-year horizon does appear to be a fairly stable maximum.

Recommendation changes over time

To explore the stability of sector selection over time, we look at the identified ranks of each sector for the rolling window implementation in both 1947 and 2017.

We can see that in several cases, the ranks have taken a complete turn.  For example, while in 1947 the Energy sector would have been a short in July, in 2017 it was a long.  Similarly, in 1947 Energy was a long in September and October, while in 2017 it was a short.  We see similar changes for Durables and Utilities in July and August and Health Care in May and June.

One hypothesis as to why the ranks are not stable is that the composition of the sectors has changed dramatically over time.  Consider, for example, that a technology company in 1947 would be meaningfully different than a technology company in 2017.  In many ways, these sector definitions are somewhat arbitrary (what is Amazon classified as, again?).  Therefore, as new sectors develop over time and compositions change, we might not expect seasonality-based rank order to remain consistent.

Is it significant?

We can begin our analysis of significance by first exploring the different quintile rankings of the strategy (all analysis henceforth is performed using the 30-year rolling window implementation).

The purpose is two-fold.  First, we want to determine how the quintiles do relative to each other.  Ideally, we would see a monotonic decrease in both annualized performance and Sharpe ratio from the first quintile to the fifth.  We do not plot the Sharpe ratio above, but we can see the steady decrease in return with a fairly consistent volatility profile, indicating a decreasing Sharpe ratio as well.

Second, we want to determine how much “alpha” is available for a long-only implementation.  The first quintile outperformed the market by 430 basis points (“bps”) while the fifth quintile underperformed by 230 bps.  The interpretation here is that while a long-only variation would give up potential alpha (from the Q1-to-Q5 spread), it may still be able to outperform the broad U.S. equity market.

While the seasonality strategy appears economically significant, it is important to ask whether it is simply a variation of other well-known and previously discovered anomalies.  We can do this by regressing the strategy’s returns against the returns of existing, documented factors.

In the table below, we report the results of regressing against the traditional Fama-French 3-Factor model.

We see that while the HML factor (i.e. value) is statistically significant (with a small negative loading), the intercept remains economically large and significant, indicating that seasonality remains unexplained.
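The mechanics of such a factor regression can be sketched with ordinary least squares. The snippet below is a hypothetical, standard-library-only illustration: the `ols` helper and all of the data are made up, and it returns point estimates only, so judging significance would additionally require standard errors (typically from a statistics package).

```python
def ols(y, X):
    """Least-squares coefficients [intercept, b1, ..., bk] for y on X.

    `X` is a list of rows, each holding one observation of the k factors.
    Solves the normal equations by Gaussian elimination with pivoting.
    """
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    k = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("factors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for j in range(col, k):
                A[r][j] -= f * A[col][j]
            c[r] -= f * c[col]
    beta = [0.0] * k
    for i in range(k - 1, -1, -1):  # back-substitution
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (c[i] - tail) / A[i][i]
    return beta


# Made-up monthly factor returns: (market excess, SMB, HML).
factors = [
    (0.01, 0.02, -0.01), (0.03, -0.01, 0.02), (-0.02, 0.00, 0.01),
    (0.00, 0.01, 0.03), (0.02, -0.02, -0.02),
]
# Synthetic strategy returns built with a known intercept and loadings.
strategy = [0.004 + 0.1 * m - 0.2 * h for m, s, h in factors]
alpha, b_mkt, b_smb, b_hml = ols(strategy, factors)
```

Because the synthetic returns are constructed from a 0.4% monthly intercept and a negative HML loading, the regression recovers those values; with real data, the intercept is the portion of return the factors leave unexplained.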

We also wanted to explore whether the anomaly was subsumed by sector-specific implementations of different factor strategies. Specifically, we construct a value long/short, a momentum long/short, and a volatility long/short.

While the volatility anomaly appears significant, the intercept remains large and the anomaly remains unexplained.

Evaluating correlations, we can see that for sector-based traders, seasonality may be a potentially valuable diversifier from traditional value, momentum, and volatility anomalies.

Finally, we wanted to explore whether the seasonality effect simply data-mined an advantageous average sector weighting over time.  To answer this question, we regress the monthly returns of the long/short against the market’s excess returns as well as sector returns in excess of the market. This allows us to extract broad beta effects as well as industry-specific effects.

We see that the anomaly remains both large and significant, and only the Manufacturing and Technology sectors offer any explanatory significance, with a negative loading on the former and a positive loading on the latter.  This indicates that at least part of the strategy’s edge may have come from being net short Manufacturing and net long Technology, on average.

Conclusion

We have demonstrated that seasonality in U.S. sectors has been economically significant when measured using historical same-calendar-month returns over the past 75 years. The effect is robust to the lookback window length and the lookback window type. It is also significant when accounting for other known anomalies (such as value, momentum, and size), sector-specific factors, and industry weights.

Despite this empirical evidence, we still do not know why seasonality is significant. Why should seasonality work in the future? What specifically has caused it to work in the past?

Without knowing the theoretical reason for an anomaly, be it behavioral, structural, or otherwise, it is difficult to rely on it going forward.

This is not to say that it won’t work – if it does, it may even diversify other investing styles. But until we can see more tests done with out-of-sample data, any seasonality strategy should likely garner only a small allocation within a well-diversified factor portfolio.

Corey Hoffstein is the Co-founder & CIO at Newfound Research, a participant in the ETF Strategist Channel.

  1. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3177539
  2. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2154873
  3. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2224246
  4. Long/short strategies go long the top 3 ranked sectors and short the bottom 3 ranked sectors. Value ranks are based upon 5-year z-scores of trailing twelve-month yields. Momentum ranks are based upon 12-1 month returns. Volatility ranks are based upon 63-day exponentially weighted realized volatility. Value and momentum are implemented in a dollar-neutral fashion, while volatility is implemented such that both long and short legs are of equal volatility. Each strategy is implemented using an overlapping portfolio approach, with value assuming 36 overlapping portfolios (rebalanced monthly), momentum assuming 4 overlapping portfolios (rebalanced weekly) and volatility assuming 26 overlapping portfolios (rebalanced weekly).