The Wizard, and a Look Behind the Curtain of Language & Machine Learning

Researchers and large corporations (IBM, Google, Facebook) are all searching for the holy grail of machine learning: understanding and responding to the basics of language. A deep learning algorithm created by Google has already learned from the company's extensive technical support logs and can now generate responses to basic support inquiries. This is a large step in language learning by computers, giving companies a mechanism for responding more effectively to their customers’ needs while lowering operational costs. This basic understanding of language, and this application of machine learning, is The Wizard.

As the tale of the Wizard of Oz teaches, there is a curtain that often obscures what is relevant. The industry is already seeing benefit from machine learning and language analysis; however, a hurdle remains to be overcome in the artificial intelligence realm: relevance and the abstract capture of meaning. As Will Knight, author of “AI’s Language Problem,” highlights when discussing a neural network language algorithm, the system can identify that “certain combinations of symbols go together, but it has no appreciation of the real world.” This difference between the structure of language and the meaning of language is the great challenge that remains to be solved. The endeavor to solve it will benefit corporations, researchers and the Big Data world. It will require not only structural analysis and part-of-speech analysis but also the ability to “…mimic human learning, mental model building, and psychology.”

The ongoing advances in machine learning, deep learning, and language analysis will benefit all users of Big Data. Of critical importance is understanding the domain in which these technologies are being applied. Responding to a technical support request is far different from capturing meaning, intent and relevance from millions of words a day and using that meaning to inform larger processes. In the financial markets, the analysis of language is not about predicting the next word or phrase, or capturing the sentence structure of hundreds of thousands of regulatory filings each year. Domain-specific meaning, focused sentiment analysis and the transformation of qualitative content into quantitative values are what will power success and drive alpha in the capital markets.
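
To make the idea of turning qualitative content into a quantitative value concrete, here is a minimal sketch in Python. It is not how any production system works; it only assumes a tiny, hand-built lexicon (the name `FINANCE_LEXICON`, its words and its weights are all invented for illustration) and averages the weights of the words it finds. Note that a word like "default" is neutral in everyday language but negative in a credit context, which is exactly the kind of domain-specific meaning at stake.

```python
import re

# Hypothetical domain lexicon: the words and weights below are illustrative
# assumptions, not values taken from any real financial sentiment dictionary.
FINANCE_LEXICON = {
    "impairment": -1.0,
    "restatement": -1.0,
    "default": -0.8,
    "litigation": -0.6,
    "growth": 0.6,
    "exceeded": 0.7,
    "record": 0.5,
}


def sentiment_score(text, lexicon=FINANCE_LEXICON):
    """Return the mean lexicon weight of matched words, or 0.0 if none match."""
    # Lowercase the text and split it into alphabetic tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Keep only the weights of tokens that appear in the domain lexicon.
    weights = [lexicon[token] for token in tokens if token in lexicon]
    return sum(weights) / len(weights) if weights else 0.0


if __name__ == "__main__":
    for sentence in (
        "Revenue growth exceeded expectations",
        "The company disclosed an impairment and ongoing litigation",
    ):
        print(sentence, "->", sentiment_score(sentence))
```

A word-counting approach like this captures only the structure side of the problem described above; it has no appreciation of context, negation or intent, which is why it should be read as a starting point rather than a solution.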