How Kingland uses Natural Language Processing to Give you Comprehension, Context, and Control of your Data

Posted by Brian Noyama on 10/23/17 6:42 AM

There is a lot of hype around Cognitive Computing. In the past month, a Google search discovered nearly 16,000 articles devoted to the topic. For perspective on the hype, Forbes contributor Bernard Marr wrote, "Today another revolution is underway with potentially even further reaching consequences... Cognitive computing, machine learning, natural language processing - different terms have emerged as development of the technology has progressed in recent years. But they all encapsulated the idea that machines could one day be taught to learn how to adapt by themselves..." 

Marr and many others believe that Cognitive Computing will forever change how we work. Here at Kingland, we believe this too, but how exactly does Kingland use cognitive computing? Sometimes it's best to learn by example. We use many different kinds of machine learning techniques, including Naive Bayes classifiers and recurrent neural networks. One of our more recent projects has also required an extensive amount of Natural Language Processing (NLP). To gain some insight into cognitive computing, let's focus on an NLP use case as an example of how we use our cognitive capabilities to address our clients' needs. We can look at other cognitive tools in future blog posts.

Natural Language Processing encompasses many different techniques, such as Parsing, Part-of-Speech Tagging, Lemmatization, and Named Entity Recognition. A practitioner, or in this case the Kingland Platform, uses these techniques to help computers read unstructured (albeit human-readable) text. Our clients, particularly those in operational, risk, compliance, or data roles, often have to read complicated legal documents that contain hundreds of pages of dense, data-filled text. For example, let's say that we needed to read the following text from a legal prospectus for a mutual fund:
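To make these four techniques concrete, here is a toy Python sketch. It is not our production code: the tiny lookup tables stand in for the trained models a real NLP library would provide, and the sentence fragment and entity names are illustrative.

```python
# Toy illustration of four NLP techniques on one sentence fragment.
# The lookup tables below are hypothetical stand-ins for trained models.
import re

SENTENCE = "You redeem Investor A Shares of another BlackRock Fund"

# Tokenization (a precursor to parsing): split the text into words.
tokens = re.findall(r"[A-Za-z]+", SENTENCE)

# Part-of-Speech Tagging: label each token with its grammatical role.
POS = {"You": "PRP", "redeem": "VB", "of": "IN", "another": "DT"}
tags = [(t, POS.get(t, "NNP")) for t in tokens]

# Lemmatization: reduce inflected forms to a base form.
LEMMAS = {"Shares": "share", "Funds": "fund"}
lemmas = [LEMMAS.get(t, t.lower()) for t in tokens]

# Named Entity Recognition: spot known entity names (gazetteer style).
ENTITIES = ["Investor A Shares", "BlackRock Fund"]
found = [e for e in ENTITIES if e in SENTENCE]

print(tags[1])   # the verb, tagged
print(found)     # the entities recognized in the sentence
```

In practice each of these steps is handled by a statistical model rather than a lookup table, but the inputs and outputs look much the same.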

If you redeem Investor A or Institutional Shares, and within 60 days buy new Investor A Shares of the same or another BlackRock Fund (equal to all or a portion of the redemption amount), you will not pay a sales charge on the new purchase amount.

Now, even for a human, that sentence is a mouthful. It is one sentence out of 7,161 in a 193-page document. The sentence tells the reader that they can repurchase another mutual fund at Net Asset Value (NAV). Unfortunately, it would take several weeks to train a human to read and comprehend such sentences, and one would have to train hundreds of humans to match the speed at which a single computer could read them using NLP. To understand how a computer does this, let's examine how the sentence looks after the computer uses NLP to parse it into a tree (shown in the figure below).
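To see what a parse tree looks like to a program, here is a small Python sketch. The nested-tuple tree below is hand-built for a fragment of the example sentence, purely for illustration; in a real system the tree comes from a statistical parser.

```python
# A parse tree represented as nested tuples: (label, child, child, ...).
# Hand-built fragment covering "you redeem Investor A Shares".
tree = ("S",
        ("NP", ("PRP", "you")),
        ("VP",
         ("VB", "redeem"),
         ("NP", ("NNP", "Investor"), ("NNP", "A"), ("NNS", "Shares"))))

def leaves(node):
    """Recursively collect the words at the leaves of a parse tree."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words

print(" ".join(leaves(tree)))
```

Walking the tree top-down recovers the original words; walking it by label, as we'll see next, recovers the grammatical structure.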

[Figure: the example sentence parsed into a tree, with the processing steps numbered]

First, the computer uses the parse tree along with part-of-speech tagging to extract noun phrases (Step 1). After attaining noun phrases, the computer uses a matching algorithm to tie them to the names of shares kept within our structured database (Step 2). Finally, using a custom query, the computer scans the topology of the parse tree to see if it fits our general pattern for a sentence about Net Asset Value repurchases (Steps 3 and 4).
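Steps 1 and 2 can be sketched in a few lines of Python. The hand-built tree fragment, the share names, and the choice of `difflib` for fuzzy matching are all illustrative assumptions here, not our production implementation.

```python
import difflib

# Hand-built parse-tree fragment for "you redeem Investor A Shares".
tree = ("S",
        ("NP", ("PRP", "you")),
        ("VP",
         ("VB", "redeem"),
         ("NP", ("NNP", "Investor"), ("NNP", "A"), ("NNS", "Shares"))))

def leaves(node):
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words

def noun_phrases(node):
    """Step 1: collect the text of every NP subtree."""
    if isinstance(node, str):
        return []
    phrases = []
    if node[0] == "NP":
        phrases.append(" ".join(leaves(node)))
    for child in node[1:]:
        phrases.extend(noun_phrases(child))
    return phrases

# Hypothetical structured database of known share classes.
KNOWN_SHARES = ["Investor A Shares", "Institutional Shares", "Investor C Shares"]

# Step 2: tie each noun phrase to a known share name via fuzzy matching.
linked = {}
for phrase in noun_phrases(tree):
    match = difflib.get_close_matches(phrase, KNOWN_SHARES, n=1, cutoff=0.8)
    if match:
        linked[phrase] = match[0]

print(linked)
```

Phrases like "you" fall below the similarity cutoff and are discarded, while share-class phrases link back to their database records.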

Let's use our prior NAV example to complete steps 3 and 4. The computer finds that Institutional Shares and Investor A Shares can both be "sold." It actually finds the word "redeem," but by using lemmatization and word vectors it recognizes that "redeem" and "sell" are effectively the same. It also finds that we can "repurchase" or "buy" Investor A Shares without paying a sales charge. Having found this information, the computer can now convert the unstructured semantics of the sentence into information in our database: we persist that Institutional Shares can repurchase a different class at NAV, while Investor A Shares can repurchase the same class only.
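The end result of this process can be sketched as a small structured record. The synonym table and the field names below are illustrative assumptions about what gets persisted, not our actual schema.

```python
from dataclasses import dataclass

# Lemmatization plus a small synonym table maps surface verbs to one
# canonical action, so "redeem", "redeems", and "sell" all line up.
CANONICAL_ACTION = {"redeem": "sell", "redeems": "sell", "sell": "sell",
                    "buy": "repurchase", "repurchase": "repurchase"}

@dataclass
class RepurchaseRule:
    sold_share_class: str         # class the investor redeems ("sells")
    repurchased_share_class: str  # class that can be bought at NAV
    sales_charge_waived: bool

# The structured fact extracted from the prospectus sentence.
rule = RepurchaseRule(
    sold_share_class="Institutional Shares",
    repurchased_share_class="Investor A Shares",
    sales_charge_waived=True,
)

print(CANONICAL_ACTION["redeem"])
print(rule)
```

Once facts are stored in this form, they can be queried, validated, and compared across thousands of documents without anyone rereading the original text.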

Without the tools that make up NLP, it is unclear how one could parse unstructured text efficiently. Most of our clients deal with unstructured data in one form or another, so expertise in NLP is important for accelerating the development or configuration of tools for our clients' needs. NLP also helps us reach untapped "treasure troves" of information in human-readable data that we could not reach before.

NLP represents only one of the cognitive tools in our war chest that we use to make sense of your data. If you currently have problems with your data, feel free to ask us questions in the comments below. Otherwise, stay tuned for more posts on our other cognitive tools in the future!

Topics: Cognitive, Text Analytics
