With deep learning, the data-rich get richer
The greatest potential for deep learning is in adding business-relevant structure to less-structured, sense-like data — such as images, audio and other sensor data.
How quickly do the tone and affect of a support call from a frustrated customer change, broken down by support rep? It’s that time-to-mollification that matters to your business, not the raw sound data.
Generally, when training machine learning algorithms (and deep nets are an extreme example of this), the more data the better. There’s a persistent danger of “overfitting” your data: performing very well on the training set, but poorly on new data. If the algorithm has overfit, it has failed to generalize and is thus not that useful.
Practitioners guard against this by holding out some of the data set before model training and then confirming that the model performs about as well on this hold-out set as it did on the rest of the samples. The more data you have available for training, and the more diverse it is, the less likely you are to overfit.
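The hold-out check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production recipe: the data are synthetic, the noise rate and split fraction are arbitrary, and the function names (make_noisy_data, split_holdout, predict_1nn, accuracy) are hypothetical. A 1-nearest-neighbor classifier stands in for a model that memorizes its training set, which makes the gap between training and hold-out accuracy easy to see.

```python
import random


def make_noisy_data(n, noise=0.2, seed=0):
    """Synthetic 1-D samples: the label is 1 when x > 0.5, flipped with probability `noise`."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label  # simulate label noise
        data.append((x, label))
    return data


def split_holdout(data, holdout_fraction=0.25, seed=0):
    """Shuffle a copy of the data and set aside a fraction the model never sees in training."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def predict_1nn(train, x):
    """Predict the label of the closest training sample (pure memorization)."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def accuracy(train, samples):
    """Fraction of `samples` whose label the 1-NN model predicts correctly."""
    correct = sum(predict_1nn(train, x) == y for x, y in samples)
    return correct / len(samples)


data = make_noisy_data(400)
train, holdout = split_holdout(data)

train_acc = accuracy(train, train)      # each point is its own nearest neighbor
holdout_acc = accuracy(train, holdout)  # unseen points expose the memorized noise

print(f"training accuracy: {train_acc:.2f}")
print(f"hold-out accuracy: {holdout_acc:.2f}")
```

A large gap between the two numbers is the signal that the model has memorized its training data rather than generalized from it.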
So if you’re sitting on a massive pile of sense-like data, you’re in a good position to build quality deep learning models.
When it comes to the most broadly applicable deep learning problems — object recognition in images, identification of people and their activities in video, natural language processing — companies like Google and Facebook already sit atop a tremendous amount of relevant image, video, audio and text data. The average Fortune 500 company has nowhere near this data scale.
Thus, I expect artificial intelligence as a service (AIaaS) to be the dominant delivery vehicle for these high-value, broadly applicable use cases for deep learning.
Homegrown algorithms will generally be expensive to build and perform more poorly than their cloud-borne peers. If you just need to know how many people entered your store, which areas they visited and how long they spent in each area, you’re much better off piping your video feeds to an AIaaS offering than rolling that analysis yourself.
It’s not just Google and Facebook that get to have all the fun, however. Any company that naturally aggregates sense-like data has the opportunity to dominate the A.I. use cases trained on those data. Have a vast trove of medical imaging data, or decades of sensor readings from tunnel-boring machines? Then you’re in an advantaged position for building A.I. services in those domains.
This is why Tesla, with tens of thousands of vehicles on the road phoning data home, has a significant leg up in the autonomous vehicle race over competitors like Google and Apple.
Typically, though, companies store much more structured data — financial transactions, purchase histories, customer lists. And here, classic machine learning techniques for classification, regression or clustering are more appropriate.
The already data-rich are well positioned to get richer in the emerging field of AIaaS. For most companies, though, the interesting A.I. work starts once business-relevant information has been extracted from raw sensor streams; building the deep learning component in-house is not particularly valuable or efficient.
And that business-relevant information is potentially powerful: A single image will soon be sufficient to determine a vending machine restocking order, eliminating the need for more complicated sensors or manual stock checks. As the difficulty of extracting these data approaches zero with AIaaS, all businesses stand to reap significant benefit from deep learning.