5 Reasons Why Machine Learning Predictions Are Screwed
Artificial Intelligence, Machine Learning, Data Science
Don’t be fooled by fancy predictions
Machine learning has improved our lives.
In general, we no longer need to decide what to watch, listen to, or even eat.
But these are only probabilistic recommendations, and they are not nearly as personalized as you think.
After thinking about machine learning for a while, I realized that machine learning predictions are just recommendations, like the one you get from McDonald’s counter staff: “Do you want fries with that?”
Sometimes yes. Sometimes not.
On another note, I would go so far as to say that our confidence in machine learning predictions is like how King David went to bed with Bathsheba. It seemed like a good idea at the time because Bathsheba was beautiful, but David didn’t really know who she was, nor the consequences of his act.
Machine learning results seem like magic, but we don’t know much about the training data or the algorithms that lead to the result, and we certainly don’t know the consequences of blindly following a machine learning recommendation.
The ability of a machine learning algorithm to make a recommendation is based on its training set. Basically, an algorithm will spit out whatever is put into it.
For example, let’s say you want to predict the next election winner. If a certain party has won a particular seat for three consecutive terms and your data is based on previous votes dating back 12 years, the algorithm will naturally predict that same party as the next winner of the seat.
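The naive approach in this example can be sketched in a few lines: a “model” that simply returns the most frequent past winner. The party names and the seat history below are made up for illustration.

```python
from collections import Counter

# Hypothetical seat history: the winning party in each past election.
past_winners = ["Party A", "Party A", "Party A", "Party B"]

def predict_next_winner(history):
    """Predict the next winner as the most frequent past winner."""
    return Counter(history).most_common(1)[0][0]

print(predict_next_winner(past_winners))  # prints "Party A"
```

A real model would weigh many more signals, but the core behavior is the same: the output is dominated by whatever pattern dominates the training data.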
But, let’s say 5 years of public development has changed the demographics of the public voting in that particular seat. How would you predict the next winning party?
You really can’t.
A demographic change does not necessarily mean a change of allegiance from one party to another for a region. You also can’t take a poll and guarantee that if someone wants to vote right when you ask them, they won’t suddenly vote left on Election Day.
Ultimately, machine learning can predict the past using past data, but it can’t really predict the future with past data.
I once heard on a podcast that someone had built a machine learning system to help judges determine sentences for convicted criminals. But what the system actually did was recommend incarceration for African Americans at a higher rate than for white Americans.
Why is that?
How can a model be biased?
Mathematical models should not take race into account.
And, no, they don’t, unless there is already a bias in the training set.
And that’s what happened.
The historical training set contained biases from the past: African Americans had been incarcerated at a higher rate than white Americans. In other words, the results of a machine learning algorithm will most likely reproduce the most common scenario in the dataset.
If the history was biased, the machine learning algorithm will continue to be biased, because it doesn’t know any better.
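You can see this “garbage in, garbage out” effect with a toy model that learns per-group rates straight from historical labels. The records below are invented; the point is that the disparity lives in the labels, not in the math.

```python
from collections import defaultdict

# Made-up historical records: (group, was_incarcerated).
history = [("group_a", True)] * 70 + [("group_a", False)] * 30 \
        + [("group_b", True)] * 40 + [("group_b", False)] * 60

def learn_rates(records):
    """Learn P(incarcerated | group) directly from historical labels."""
    counts = defaultdict(lambda: [0, 0])  # group -> [incarcerated, total]
    for group, incarcerated in records:
        counts[group][0] += int(incarcerated)
        counts[group][1] += 1
    return {g: inc / total for g, (inc, total) in counts.items()}

rates = learn_rates(history)
print(rates)  # {'group_a': 0.7, 'group_b': 0.4}
```

The model contains no explicit group preference, yet it faithfully reproduces the 70% vs. 40% disparity baked into its training data.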
Personally, I don’t think machine learning predicts anything, rather it automates what is already known.
I should probably spell out a definition here to explain my point.
A prediction is a forecast – a guess of the future.
However, what machine learning does well is that it takes past data and gives you expected results from the past.
So what machine learning does really well is automate past knowledge. By that I mean: if the dataset is even a few seconds old, does it reflect the present, or does it represent the past?
Take, for example, house price forecasts. What is the prediction really based on? You would say interest rates, land size, location, etc.
But, really, that prediction is just how much someone might pay for that house, based on how others have paid for houses in the past. You can’t really assume that all buyers are rational in their decision-making. For example, some people refuse to buy certain houses because of their street numbers, while some buy a house because they believe they can predict its future cash flow, and others because the house is close to family.
Also, remember that supervised machine learning is nothing more than weighting variables to get an estimate of the target output.
So really, a house price prediction is a computer weighing input variables, guessing how much each variable contributes to a target output, and automatically making that same guess from the learned weights whenever new data is fed into the model.
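The “weighting variables” idea can be made concrete with a minimal sketch: a one-feature linear regression fit by gradient descent. The numbers are invented (real house-price models use many features), but the mechanics are the same: the model learns a weight and a bias, then reuses them for every new input.

```python
# Made-up training data: (land size in hundreds of m^2, price in $100k).
# The points happen to lie on price = 2 * size + 1.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

w, b = 0.0, 0.0   # the "weights" the model learns
lr = 0.05         # learning rate
for _ in range(5000):
    grad_w = grad_b = 0.0
    for x, y in data:
        err = (w * x + b) - y          # prediction error on this example
        grad_w += 2 * err * x
        grad_b += 2 * err
    w -= lr * grad_w / len(data)       # nudge the weights downhill
    b -= lr * grad_b / len(data)

print(round(w, 2), round(b, 2))  # close to 2.0 and 1.0
print(round(w * 4.0 + b, 1))     # "prediction" for an unseen size of 4.0
```

Notice that nothing here knows anything about houses: it is curve fitting on past sales, applied mechanically to new inputs.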
Also, machine learning models cannot predict anything when entirely new data that does not exist in the training set enters the model.
I read about this in the book Antifragile by Nassim Taleb. The author writes that radio waves are harder to pick up in complete silence than with some noise in the background. I’m not too sure of the reasoning, but I think it has something to do with the equipment being better able to differentiate when there is noise.
This brings to mind the incident where a computer vision model labeled a Black man as a gorilla. The machine learning model was unintentionally offensive, but in hindsight, the machine learning engineers should have known better.
This tells us that computer vision and machine learning models not only require examples of true positives of something, but also false positives, false negatives, and true negatives.
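Those four outcome types are exactly what a confusion matrix counts. Here is a minimal sketch over a hypothetical binary labeler; the label arrays are invented placeholders.

```python
# 1 = positive class present, 0 = absent (made-up labels for illustration).
actual    = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 1, 0, 0, 1]

tp = sum(a == 1 and p == 1 for a, p in zip(actual, predicted))  # true positives
fp = sum(a == 0 and p == 1 for a, p in zip(actual, predicted))  # false positives
fn = sum(a == 1 and p == 0 for a, p in zip(actual, predicted))  # false negatives
tn = sum(a == 0 and p == 0 for a, p in zip(actual, predicted))  # true negatives

print(tp, fp, fn, tn)                 # prints 3 1 1 3
print(tp / (tp + fp), tp / (tp + fn)) # precision and recall, both 0.75 here
```

If a dataset contains only positive examples, the false-positive column is invisible during training, which is how a model ends up confidently wrong on things it has never seen.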
Why is this important?
Before relying on machine learning predictions, you need to be sure that the underlying dataset reflects the real world. In other words, did the machine learning engineer cherry-pick only the good results, so that the model badly overfits and falls apart on unseen data?
For example, the discontinued Google Flu Trends failed because the machine learning algorithm ultimately correlated flu outbreaks with the upcoming winter season. It couldn’t distinguish between false positive events and the real events that lead to flu outbreaks. “The Data Detective” by Tim Harford gives this example, which I will paraphrase: “A flu outbreak the previous year at a particular basketball game does not necessarily mean that a basketball game the following year in the same area at the same time will also produce a flu outbreak.”
Machine learning is not a dynamic prediction machine. It works best in a static environment. By this I mean that if you understand how machine learning works, you will realize that a model has to train on a training set before it can predict.
The problem is that the training set is static at any given time, and to keep up with the ever-changing landscape, the model must be constantly run with a new training set.
For larger companies, this may not be a big deal. They likely have the computing power for machine learning algorithms to persistently learn on new training sets and alternate “production” algorithms as training datasets become stale. * There are probably other ways to do machine learning online that could solve this problem, but that’s beyond the scope of this article.
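One simple way to picture “constantly retraining on a fresh training set” is a sliding window over the most recent data. The sketch below is deliberately trivial (the “model” is just a running average of recent sale prices, and the numbers are made up), but it shows how the prediction shifts every time the training window moves.

```python
from collections import deque

WINDOW = 3
recent_sales = deque(maxlen=WINDOW)  # the "training set": last N sale prices

def retrain_and_predict(new_sale):
    """Add the latest sale, then 'refit' on the current window."""
    recent_sales.append(new_sale)
    return sum(recent_sales) / len(recent_sales)

for price in [300_000, 310_000, 320_000, 400_000]:
    print(retrain_and_predict(price))
```

Each new sale silently changes the estimate, which is exactly the conundrum: the model is always a snapshot of whatever window it was last trained on.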
Instead, let’s say you want to create your own home value prediction system. Then your results are at the mercy of the dataset at the time you trained your model. As new home sales occur, your forecast will change each time, leading to the conundrum: what is the real fair value of the home?
You may or may not have noticed it, but home value predictor apps usually give you a wide range of what a home might be worth. Usually this is a range of $100,000. Trust me, you don’t need an algorithm to tell you that. You can easily research the history of homes sold in the area to make this kind of guess.
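To make the “you don’t need an algorithm” point concrete, here is the kind of range you can get straight from recent comparable sales in an area. The prices are invented.

```python
# Made-up recent comparable sales in the neighborhood.
comps = [410_000, 455_000, 480_000, 520_000, 435_000]

low, high = min(comps), max(comps)       # the "range" the apps show you
midpoint = sum(comps) / len(comps)       # a naive point estimate

print(f"Estimated range: ${low:,} - ${high:,} (midpoint ${midpoint:,.0f})")
```

A min/max over a handful of comps already spans roughly $100,000 here, which is about the precision those prediction apps deliver anyway.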
As you can see, I don’t believe in taking machine learning results too seriously.
Knowing the underlying training dataset, and sometimes even the machine learning algorithm used, is really important. *Maybe the type of machine learning algorithm doesn’t really matter if you subscribe to the no free lunch theorem.
I believe that machine learning has its place in society. For example, it’s much easier to get recommendations on what to watch next on YouTube, even if it’s a long shot. But, in reality, the machine learning algorithm is just automatically recommending similar types of videos for you to watch, rather than predicting what you would like to watch. I know, because I have many times been recommended videos that I have no interest in watching.
To end this article, take machine learning results with a grain of salt and always ask yourself how the results were obtained.