For starters, linear regression is a machine learning algorithm. When I took my first ML class at MIT, we spent a week studying it. So, from my point of view, linear regression is one of many different machine learning techniques, each of which has its own strengths and weaknesses.
The amount of data needed is one dimension along which ML techniques differ. Linear regression typically needs about 10 times as many examples as features in the model, so it often works well with low to moderate amounts of data. Deep learning seems to really excel when you have enormous amounts of data. And at the other extreme, techniques like support vector machines and lasso regression work well when you have very small amounts of data. So, while the best technique may differ by problem, machine learning methods are useful no matter how much data you have.
Like any other machine learning technique, there will be problems where linear regression works well and others where it does not. If you want to be able to tackle a wide range of problems, you’ll want to include linear regression in your tool belt and not be predisposed toward or against it.
Jack: Two of the common criticisms I hear of machine learning from people who use more traditional methods are that it is effectively data mining and that it has no way of identifying whether there is an economic reason for the relationship between two variables. On the first point, there is a saying that if you torture the data enough, it will eventually confess, and some argue that machine learning is essentially just running as many tests as it takes to produce the desired result. On the second, if you fed a machine learning system something like the results of every NFL game in a season as a feature and stock returns as an output, it could potentially find a relationship, but that relationship would be spurious. What do you think of these criticisms?
Kevin: I disagree with the first point.
Applying machine learning is a constant battle against data mining (or, as we call it, “over-fitting”), so users of machine learning are quite aware of it, and the basic techniques of machine learning focus on how to prevent that.
In fact, users of traditional methods may be more susceptible to data mining, due to a false sense of security. (Results on the non-replicability of past research on hundreds of different alleged “factors” seem to indicate this.) Users of machine learning, in contrast, are always on guard against data mining.
On the second point, I must admit that I’m suspicious of the importance of humans finding a “true economic reason” for a statistical relationship.
Human beings, even smart ones, are perfectly capable of finding explanations for relationships that don’t exist. The belief that vaccines cause autism is a recent example. And on the other side, the fact that we can’t think of an explanation for something doesn’t mean it isn’t true. For example, over a decade before Pasteur uncovered the link between bacteria and diseases, a Hungarian doctor had already noticed that doctors washing their hands before delivering babies led to fewer deaths in childbirth, even if he didn’t know the true reason why.
I’d love to see a blind study that tested how good human analysts are at separating true economic relationships from false ones. Perhaps we’d see some alpha from that. Or perhaps we’d see the same thing that Joel Greenblatt saw when he let his clients choose from the stocks that his quantitative model liked: that filtering through their reasoning reduced performance. (Of course, in that case, you really should be using information from the human analysts: by taking the ones they don’t pick!)
Finally, it’s worth pointing out that finding a true economic relationship is not a prerequisite for making money. Jim Simons, from Renaissance Technologies, has said that they use many signals that do not make sense economically, and it appears they are making a lot of money.
Stepping back, I think the reason people prefer to stick to statistical relationships that they believe are based on true economic relationships is that they believe those statistical relationships are less likely to change in the future. However, I haven’t seen evidence that this is true. (Again, someone should do a study!) Alternatively, you can just let the data tell you that the relationship has changed. In fact, you must do that anyway because even “true economic relationships” can disappear in the future.
Jack: One of the biggest issues faced by those of us who follow value models is the issue of value traps. Whenever you buy cheap stocks, some of them are going to be cheap for a reason. Adding filters for quality can help with this, but also can have side effects of eliminating potentially successful investments along with the bad ones. I am wondering if machine learning may be able to help with this problem by analyzing historical winners and losers to find characteristics that are common among value traps, but are not present with successful investments. Do you think this is a promising use of the technology and how would you go about using it for that purpose?
Kevin: Absolutely, machine learning is well suited to that problem.
I would start by deciding how you want to define value traps.
Let’s say you are only worried about companies that end up going bankrupt. In that case, it’s probably only a small fraction of the companies in your data set that fall into that category, so you may end up in a situation as in fantasy football where there is a small set of examples to learn from. In that case, I’d try out support vector machines or regularized logistic regression, methods that work well with limited data. Try those out and see if they are accurate on out-of-sample data. If they are, then you can feel confident applying them to new companies.
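To make the "fit, then check out-of-sample" step concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the two features (`leverage`, `coverage`), the synthetic data, the hand-rolled gradient-descent fitter `fit_logistic`, and the hyperparameters. It is not the method used in any real value-trap model, and in practice you would likely reach for an established library instead.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, x):
    """Estimated probability of bankruptcy for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_logistic(X, y, l2=0.01, lr=0.5, epochs=300):
    """L2-regularized logistic regression fit by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        # The penalty shrinks the weights (not the intercept) toward zero.
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Synthetic data (assumption): two made-up, standardized features per company.
rng = random.Random(0)
X, y = [], []
for _ in range(400):
    leverage, coverage = rng.gauss(0, 1), rng.gauss(0, 1)
    p_true = sigmoid(1.5 * leverage - 1.5 * coverage - 2.0)
    X.append([leverage, coverage])
    y.append(1 if rng.random() < p_true else 0)

# Hold out the last 100 companies; the model never sees them while fitting.
X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]

w, b = fit_logistic(X_train, y_train)
test_probs = [predict_proba(w, b, x) for x in X_test]
test_preds = [1 if p >= 0.5 else 0 for p in test_probs]

accuracy = sum(p == t for p, t in zip(test_preds, y_test)) / len(y_test)
# Baseline: always predict "no bankruptcy". When bankruptcies are rare this is
# already high, so out-of-sample accuracy should be judged against it.
baseline = 1 - sum(y_test) / len(y_test)
print(f"out-of-sample accuracy: {accuracy:.2f} (baseline {baseline:.2f})")
```

Because bankruptcies are a small fraction of the data, raw accuracy can look good even for a useless model, which is why the sketch prints the always-"no" baseline next to it.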
Next, you need to figure out how to incorporate that model into your process. The model will give you an estimated probability of bankruptcy, so you could do something like exclude any of the companies whose estimate is above some threshold or in the top decile by probability of bankruptcy. However, you may find that those filters give worse returns because, as you suggested, they might throw out too many good investments with the bad ones.
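The two filters described above can be sketched as follows. The tickers and probability estimates are invented for illustration, and the function names `exclude_above_threshold` and `exclude_riskiest_bucket` are hypothetical helpers, not part of any library.

```python
def exclude_above_threshold(bankruptcy_prob, threshold):
    """Keep companies whose estimated probability is at or below threshold."""
    return {t: p for t, p in bankruptcy_prob.items() if p <= threshold}

def exclude_riskiest_bucket(bankruptcy_prob, n_buckets=10):
    """Drop the riskiest 1/n_buckets of companies (top decile by default)."""
    ranked = sorted(bankruptcy_prob, key=bankruptcy_prob.get, reverse=True)
    n_drop = -(-len(ranked) // n_buckets)  # ceiling division
    dropped = set(ranked[:n_drop])
    return {t: p for t, p in bankruptcy_prob.items() if t not in dropped}

# Hypothetical model output: ticker -> estimated probability of bankruptcy.
estimates = {
    "AAA": 0.02, "BBB": 0.41, "CCC": 0.07, "DDD": 0.15, "EEE": 0.33,
    "FFF": 0.01, "GGG": 0.09, "HHH": 0.26, "III": 0.04, "JJJ": 0.12,
}

kept_by_threshold = exclude_above_threshold(estimates, threshold=0.25)
kept_by_decile = exclude_riskiest_bucket(estimates)

print(sorted(kept_by_threshold))  # BBB, EEE and HHH are excluded
print(sorted(kept_by_decile))     # only BBB, the single riskiest name, is excluded
```

The threshold version removes a variable number of companies depending on how the estimates are distributed, while the decile version always removes a fixed share, so the two can behave quite differently in a backtest.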
In that case, there are more knobs that you can try. The models I mentioned will let you set weights on the examples. You could add weight to the big winners, essentially telling the model to work harder to not misclassify those ones. That might solve your problem, or it might create some other problem. Perhaps standard deviation of returns becomes too high. Well, you can look for knobs to adjust for that.
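As a rough illustration of the example-weighting knob, the sketch below fits the same regularized logistic regression twice on synthetic data: once with equal weights, and once with extra weight on surviving companies that turned out to be big winners. The single `leverage` feature, the simulated returns, the 0.5 return cutoff, and the weight of 5 are all arbitrary assumptions, and `fit_weighted_logistic` is a hand-rolled stand-in for whatever weighted fitter you actually use.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, b, x):
    """Estimated probability of bankruptcy for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_weighted_logistic(X, y, weights, l2=0.01, lr=0.5, epochs=300):
    """L2-regularized logistic regression where each example has a weight."""
    d, total = len(X[0]), sum(weights)
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi, si in zip(X, y, weights):
            # A heavier example contributes proportionally more to the gradient.
            err = si * (predict_proba(w, b, xi) - yi)
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * (gj / total + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / total
    return w, b

# Synthetic data (assumption): one standardized feature and a simulated return.
rng = random.Random(1)
X, y, returns = [], [], []
for _ in range(300):
    leverage = rng.gauss(0, 1)
    bankrupt = 1 if rng.random() < sigmoid(1.5 * leverage - 1.5) else 0
    X.append([leverage])
    y.append(bankrupt)
    returns.append(-1.0 if bankrupt else rng.gauss(0.1, 0.4))

# "Big winners": survivors whose simulated return exceeds an arbitrary cutoff.
winners = [i for i in range(len(X)) if y[i] == 0 and returns[i] > 0.5]
weights = [1.0] * len(X)
for i in winners:
    weights[i] = 5.0

unweighted_model = fit_weighted_logistic(X, y, [1.0] * len(X))
weighted_model = fit_weighted_logistic(X, y, weights)

def mean_winner_prob(model):
    """Average estimated bankruptcy probability across the big winners."""
    w, b = model
    return sum(predict_proba(w, b, X[i]) for i in winners) / len(winners)

mean_unweighted = mean_winner_prob(unweighted_model)
mean_weighted = mean_winner_prob(weighted_model)
print(f"mean estimate on winners, equal weights: {mean_unweighted:.3f}")
print(f"mean estimate on winners, winners x5:    {mean_weighted:.3f}")
```

Upweighting the winners pulls their estimated bankruptcy probabilities down, so fewer of them get filtered out, but it does so by making the model more lenient overall. That is the kind of side effect that would then need its own check.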
And on and on it goes… There’s a reason why these ML projects at Google and other places take 10 PhDs three years to complete. It’s hard work!
Jack: Thank you again for taking the time to talk to us today. If investors want to find out more about you and your research, where are the best places to go?
Kevin: I’m on Twitter as @kczat. I’m always happy to chat there.
If the reader is also interested in learning more about ML, I’m planning to have more articles published at OSAM Research in the future that will hopefully be informative.