Conclusion

This whitepaper has outlined the methods, business applications and expected outputs of a range of currently available text summarization techniques. Ongoing research into further summarization methods at UK institutions is robust, productive and collaborative, particularly the work coming out of Manchester, Cardiff, Sheffield and the Alan Turing Institute.

Extractive techniques have been explored, along with their relevance to the related areas of named entity recognition (NER), topic extraction and sentiment analysis. Abstractive techniques are perhaps slightly behind their extractive counterparts, but offer the greatest potential for producing summaries of the same linguistic quality as those written by human authors. Recent advances in machine learning have opened further avenues for exploration in text analytics.

On-premises commercial tools are currently limited in number and scope, while cloud solutions from Google, Microsoft and IBM offer the most options for effective summarization, with frequently updated toolsets and models. Open-source machine learning libraries in Python and R have considerable potential and underpin much of the academic research into text analytics, with many bespoke solutions rivalling academic or commercial implementations. Many approaches are highly effective on one style of text but falter on others, and the availability of training and testing data for any implementation may ultimately decide its success. Certain algorithms are known to be more versatile than others but, as with most fields of textual analysis, the best solution is likely to be the one most closely tailored to your domain.
