One of the greatest movies about invention over the past 35 years has to be “Back to the Future.” Who could ever forget time travel via the flux-capacitor-equipped DeLorean in this pop culture classic? The thrill of discovery and persistence through initial failures are what drove Doc Brown’s pursuit of his invention throughout the entire movie.
The Inventor, our third text analytics persona in this series, was inspired by this same thrill of discovery evident within Doc Brown’s passion for invention, with even more relevant Kingland inspiration coming from our passion for data. The Inventor’s role within our Text Analytics Platform is to discover literally anything of interest from a “mountain” of data – trends, relationships, quality issues, and more.
Compared to the other Kingland text analytics personas and components, the Inventor is a bit of an outlier within the Text Analytics Platform, in the sense that it can often be an out-of-process data discovery mechanism separate from the traditional Collector->Scholar->CEO content processing pipeline. It can also be considered an outlier from the perspective of Malcolm Gladwell’s book “Outliers,” and more specifically the 10,000-hour rule of practice for becoming world-class in a field. In our scenario of data discovery with the Inventor, the 10,000 hours equate to data: the more data the Inventor has to “practice” over, the better its results will be, and the more opportunity it will have to turn its data discoveries into yet another module within the Text Analytics Platform’s hardened processing pipeline.
How Does The Inventor Work?
Pairing Kingland’s best data analysts, scientists and engineers with the proper techniques and technology gives us the best Inventor results to leverage within our Text Analytics Platform. In the past, the “technology” usually meant complex SQL executed against a relational database. Today and going forward, it means applying the AI and ML techniques and Python libraries best suited to the discovery needs at hand. In some cases, third-party data science and ML platforms, such as KNIME and the new AWS SageMaker platform, can accelerate discovery.
As an example, let’s take a look at a client data management use case near and dear to Kingland’s heart – Data Quality Remediation:
In another example, more closely related to our Text Analytics Platform’s processing of unstructured content, we paired engineers with NLP techniques such as topic modeling via Latent Dirichlet Allocation (LDA) to find the similarities between thousands of university research papers. We were able to highlight the topics the papers had in common, group the papers by those shared topics, and open up the opportunity to find common reference citations. This type of topic modeling has also proven valuable in use cases such as news monitoring or risk surveillance, where the modeling can provide a quick breakdown of the topics within any given document or article, driving efficiencies into the monitoring and surveillance processes. In our Text Analytics Platform’s pipeline of unstructured content processing, this type of content similarity processing via topic modeling can be a valuable module for classifying content.
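To make the topic modeling idea a little more concrete, here is a minimal sketch of LDA in plain Python. It is not Kingland's implementation: the post does not say which library or inference method was used, so this sketch assumes collapsed Gibbs sampling, and the function name `fit_lda`, the hyperparameters, and the tiny tokenized "papers" are all made up for illustration. In practice a library such as scikit-learn or gensim would likely do this work.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA with collapsed Gibbs sampling (illustrative, hypothetical helper).

    docs is a list of token lists. Returns (theta, topic_words): theta[d][k]
    is the estimated share of topic k in document d, and topic_words[k] lists
    the vocabulary ordered by how often each word was assigned to topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                              # tokens per topic
    assignments = []                                            # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    # Repeatedly resample each token's topic given all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(count + alpha) / (len(doc) + num_topics * alpha) for count in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    topic_words = [
        [vocab[v] for v in sorted(range(vocab_size), key=lambda v: -topic_word[k][v])]
        for k in range(num_topics)
    ]
    return theta, topic_words


# Toy, made-up "papers" (already tokenized); real input would be far larger.
docs = [
    "gene protein cell dna protein gene".split(),
    "cell dna gene sequencing protein".split(),
    "market risk credit bank risk market".split(),
    "bank credit loan market risk".split(),
]
theta, topic_words = fit_lda(docs, num_topics=2)

for d, proportions in enumerate(theta):
    print(f"paper {d}: dominant topic {proportions.index(max(proportions))}")
for k, words in enumerate(topic_words):
    print(f"topic {k}: {words[:3]}")
```

Papers whose dominant topics match can then be grouped together, which is the kind of similarity grouping described above. On a corpus this small the result depends heavily on the random seed, which echoes the earlier point that the Inventor does better the more data it has to practice over.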
In both examples, we see a theme of pairing Kingland team members with the best techniques and technology suitable for a given data discovery task. Our best Inventor results, like the discoveries from these examples, have the opportunity to become hardened modules within our overall Text Analytics Platform processing pipeline, as long as they have the time and data available to realize the greatest benefit, just as the greatest inventions need time and data to become production-ready. That said, I’m still waiting for a reliable hoverboard, let alone a time-traveling DeLorean! A key theme that resonates throughout our Text Analytics Platform’s personas is how they work together and complement each other to form a complete solution.
Join me next time as we dive into the final component persona of our Text Analytics Platform, the one responsible for making the big decisions – The CEO!