Forget The Hype — Focus on Lower Costs and Better Results
If you’ve been listening to the Big Data hype, you may have run across some of the following terms: Decision Support, Predictive Analytics and Machine Learning. These are all much clearer with an example, so let’s start with one and return to the definitions at the end. Let’s use a credit card application.
About 15 to 20 years ago, when you applied for a credit card, the bank would pull your credit report and decide how much credit, if any, to give you. This is an example of decision support. The computer ran a report and gave it to a person and the person made the decision.
Today, when you apply, a computer looks at that credit report and, using an algorithm developed over many years, generates a credit score which predicts your future repayment likelihood. If the score is high enough, the computer approves your credit application and decides how much credit to give you. This is an example of predictive analytics. Importantly, no human was involved in deciding to grant or deny the credit application (even though humans were needed to build the algorithm).
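To make the predictive analytics step concrete, here is a minimal sketch of an automated decision made from a score. The function name, the 680 cutoff, and the credit limit formula are all assumptions chosen for illustration, not how any real bank sets its rules.

```python
def decide_credit(score: int, approve_at: int = 680) -> tuple[bool, int]:
    """Return (approved, credit_limit) for an already-computed credit score.

    The cutoff and the limit formula are hypothetical, for illustration only.
    """
    if score < approve_at:
        return (False, 0)
    # Assumed rule: $1,000 base limit plus $50 per point above the cutoff.
    return (True, 1000 + 50 * (score - approve_at))


print(decide_credit(679))
print(decide_credit(700))
```

The point is that the rule itself was written by people; the computer only applies it to each new application.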
The next step is to automate not just the credit decision but also the building of the credit score algorithm itself. This is machine learning, and in simplified form it looks something like this:
- Feed back into the computer the actual payment histories of the people it granted credit to over the last several years.
- Tell the computer: for each credit score (say, 680, then 681, then 682, and so on), look at all the people who paid as predicted and those who didn’t, and if you find a pattern that payers consistently have and nonpayers don’t, remember that pattern.
- The next time an application arrives with a credit score of 680 (or 681, or 682, and so on), check whether the new applicant’s credit history more closely matches the payers or the nonpayers, and grant or deny credit (or set the amount of credit) based on that similarity.
In a nutshell, that’s it. In reality, it’s more complicated, but the logic is exactly the same.
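The three steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a production method: each credit history is reduced to two made-up features (late payments in the last year, share of available credit in use), the "pattern" is just the average of each group, and similarity is plain straight-line distance. The function names `learn_patterns` and `approve` are hypothetical.

```python
from collections import defaultdict
from math import dist


def learn_patterns(history):
    """Steps 1 and 2: for each score, average the features of payers and nonpayers.

    history is an iterable of (score, features, paid) tuples.
    """
    groups = defaultdict(lambda: {True: [], False: []})
    for score, features, paid in history:
        groups[score][paid].append(features)
    patterns = {}
    for score, by_outcome in groups.items():
        patterns[score] = {
            paid: tuple(sum(col) / len(rows) for col in zip(*rows))
            for paid, rows in by_outcome.items()
            if rows  # skip an outcome with no examples at this score
        }
    return patterns


def approve(patterns, score, features):
    """Step 3: approve if the applicant looks more like payers than nonpayers."""
    seen = patterns.get(score, {})
    if True not in seen or False not in seen:
        return None  # not enough history at this score to compare
    return dist(features, seen[True]) <= dist(features, seen[False])


# Made-up payment histories: (score, (late_payments, utilization), paid)
history = [
    (680, (0, 0.2), True),
    (680, (1, 0.3), True),
    (680, (4, 0.9), False),
    (680, (5, 0.8), False),
]
patterns = learn_patterns(history)
print(approve(patterns, 680, (1, 0.25)))  # resembles the payers
print(approve(patterns, 680, (4, 0.7)))   # resembles the nonpayers
```

Unlike the fixed score cutoff, nothing here was hand-tuned: rerunning `learn_patterns` on newer payment histories changes the decisions without anyone rewriting the rule.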
Ignore The Hype, Stick To The Results
Back to those technical definitions I mentioned up top. From a technical point of view, the above technologies each have their own origins, and those origins have nothing to do with each other.
- Decision support was invented at Harvard in 1967 by Michael S. Scott Morton, then a PhD student and now an MIT professor, as part of his doctoral thesis. Call it what you will (it’s been called many things over the years: OLAP, Business Intelligence, Data Viz, etc.), almost all companies are already using decision support.
- Predictive analytics comes from the early days of the insurance industry (England in the 1600s), when predicting the aggregate risk of something was essential to setting premiums. But it took until the 1950s for such analytics to be done on computers.
- And lastly we have machine learning, which began in 1952 when Arthur Samuel at IBM wrote a program that taught itself how to play checkers.
But don’t be fooled. Just because machine learning came about 15 years before decision support systems doesn’t mean it comes first for you. The right order to adopt these technologies is based on cost and skill. Start with the one that is least costly and requires the least technical and statistical knowledge, which is decision support. Then graduate to predictive analytics, and finally, when you have the Big Data talent and resources, to machine learning.