Data Scientist at the Social Research Centre
Real-time text analytics using R Shiny
Recent years have seen considerable advances in statistical and computational approaches to natural language processing (NLP). However, how to explore and present results from textual data effectively remains a vexing problem. Static visualisations are poorly suited to exploring the high dimensionality of textual data. We usually want to probe trends and relationships between terms, perhaps comparing across population cohorts, and term frequencies on their own can be underwhelming.
Text analytics often requires some level of subjective interpretation: ultimately, someone has to interpret the semantics of the results. In other words, there is a necessary step between training a model and drawing conclusions.
At the Social Research Centre (SRC) we have been trialling interactive dashboards to support this middle step in the NLP analysis workflow. Interactive dashboards let users explore the dimensionality of the data, tease out trends and draw their own insights, which makes them well suited to presenting text analytics. Beyond presenting the NLP results in a meaningful way, the challenge has been to build something flexible enough that users can cut the data whichever way they wish, yet constrained enough to guide their analysis. This paper demonstrates an interactive online application that we have built in R Shiny for exploring text analytics.