Source – insidebigdata.com
Shhh … No one else can hear this.
The secret of artificial intelligence is …
Yes, you read that correctly.
Organizations – vendors and companies alike – are getting lots of kudos, publicity, and, perhaps most importantly, investments, because they are offering AI solutions.
Scratch below the surface, though, and this is what you’ll find. Most of these companies are using public algorithms that have been out there for many years. They are generally tuning a few hyperparameters and very rarely adding any proprietary algorithms, complex formulas, or big data analytics of their own. They are then using those public algorithms to deliver one or all of these three items (anomalies, classifications, and predictions) for a specific use case or a few:
Anomalies are simply items that deviate from the norm, the outliers, no matter the category.
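As a minimal sketch of what "deviating from the norm" can mean in practice, the snippet below flags values that fall far outside the middle half of the data (Tukey's interquartile-range rule). This is one long-standing public technique among many, not the method of any particular vendor. The function name `find_anomalies`, the multiplier `k`, and the sample amounts are assumptions made up for illustration.

```python
from statistics import quantiles


def find_anomalies(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical transaction amounts; the last one is deliberately extreme.
amounts = [42, 38, 51, 47, 40, 45, 39, 44, 980]
print(find_anomalies(amounts))  # → [980]
```

The threshold `k=1.5` is a convention rather than a law. Raising it flags fewer items, and the right value depends on the use case.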
Classification is the distribution of data into predefined categories, so it can be used most effectively. As a simple example, you can create special algorithms to classify the elements into solids, liquids and gases – or you can look at the periodic table and accomplish the same result.
Admittedly, predictions are the cool factor in the artificial intelligence triad, especially when it comes to preventative medicine. We can analyze someone’s propensity for heart disease based on the histories of close family members and genetic markers.
In the business world, of course, artificial intelligence can help organizations make critical business decisions – What are the anomalies, so I can prevent fraud? How can I classify these customers, so I can make the appropriate loan offers? Based on previous behaviors of this classified cohort, how many of them do we think will default on a loan above $10,000?
No matter how good the formulas behind these anomalies, classifications, and predictions are, they ultimately depend on data. The issue here is the validity of that data – are you testing what you think you’re testing?
First, you must eliminate the outliers that could create a completely skewed response.
Classifying the elements into solid, liquid and gas at room temperature is easy. How do you manage the data when the elements are added in at different temperatures? What about the plasma state?
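To make the temperature question concrete, here is a small sketch of a rule-based classifier whose answer changes with the temperature at which the measurement was taken. The melting and boiling points are approximate values in kelvin at standard pressure. The names `PHASE_POINTS_K` and `classify_state` are hypothetical, and plasma is deliberately left out, which is exactly the kind of gap the paragraph above warns about.

```python
# Approximate (melting point, boiling point) in kelvin at standard pressure.
PHASE_POINTS_K = {
    "iron": (1811.0, 3134.0),
    "mercury": (234.3, 629.9),
    "oxygen": (54.4, 90.2),
}


def classify_state(element, temperature_k=293.15):
    """Classify an element as solid, liquid, or gas at the given temperature."""
    melting, boiling = PHASE_POINTS_K[element]
    if temperature_k < melting:
        return "solid"
    if temperature_k < boiling:
        return "liquid"
    return "gas"  # Assumption: no plasma category exists in this sketch.


for name in PHASE_POINTS_K:
    at_room = classify_state(name)
    at_80k = classify_state(name, temperature_k=80.0)
    print(f"{name}: {at_room} at room temperature, {at_80k} at 80 K")
```

The same element lands in a different category once the temperature changes, so a data set that mixes measurements taken under different conditions needs that condition recorded alongside each row.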
Predictions, too, are skewed when the data set isn’t clean. Suppose a bank offers car loans of $10,000 to everyone walking in the door. If the prediction is based on the general population, the bank may well determine that it isn’t profitable to offer those loans anymore. If the data set is changed to include only clients with a previous motor vehicle loan, then the predictions’ accuracy changes significantly.
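The loan example can be sketched with a few lines of arithmetic. The records below are invented purely for illustration and are not real lending data. The `Applicant` class, its fields, and `default_rate` are hypothetical names, and a historical default rate stands in for whatever predictive model a bank would actually use.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    had_vehicle_loan: bool
    defaulted: bool


def default_rate(applicants):
    """Share of applicants in the cohort who defaulted."""
    applicants = list(applicants)
    if not applicants:
        raise ValueError("cohort is empty")
    return sum(a.defaulted for a in applicants) / len(applicants)


# Invented history: 20 applicants with a prior vehicle loan, 20 without.
history = (
    [Applicant(True, False)] * 18 + [Applicant(True, True)] * 2
    + [Applicant(False, False)] * 12 + [Applicant(False, True)] * 8
)

overall = default_rate(history)
prior = default_rate(a for a in history if a.had_vehicle_loan)
print(f"all walk-ins: {overall:.0%}, prior vehicle loan: {prior:.0%}")
# → all walk-ins: 25%, prior vehicle loan: 10%
```

Nothing about the calculation changed between the two numbers. Only the choice of which rows to include did.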
Modern computing power and higher-level mathematics have opened enormous opportunities for discovery. Even though most of these algorithms have existed for 30 and sometimes even 100 years, four things make it all possible now:
- Processing power and availability
- Easy-to-use AI frameworks and platforms
- Most critically: the availability of enormous amounts of data, as well as the budget and technology to store it easily
- The easiest way to combine all the above points: Just one amazing word – Cloud
Finding anomalies, making classifications, and generating predictions are critical to ensuring the future of most organizations – with an enormous caveat.
What does the data look like? With a clean and optimized data set, your AI can achieve wonders. With the same data sets you’ve been using all along, those anomalies, classifications, and predictions are just another way to make bad decisions. You might as well trust your gut. It will be right as often and with a much lower investment.
All in all, AI is as fine as any tool in your business operations. But before you play with it, you need to ensure your target is set with a clear, well-defined challenge. Be aware, though, that for any “AI” path you choose, its accuracy and value are entirely dependent on how clean and organized your data is. That is what determines whether your AI results deliver wisdom or leave you making critical business decisions on flawed results.