Summarization Models


This page discusses the actual models and algorithms used to summarize text. If you want to compare the summarization methods as they are used in the End-To-End process, please visit Combination and Summarization. If you are interested in the code behind the models, want to reproduce results, or plan to build upon the low-level summarization components, you're in the right place.

Neural: The extractive summarizers are provided by HHousen/TransformerSum and the abstractive ones by huggingface/transformers (abstractive summarization was originally accomplished with HHousen/DocSum). Please see those repositories for the exact implementation details of the models. Some of the architectures are HHousen's, some are partly HHousen's, and many are from other research projects.

Non-Neural Algorithms: The sumy (Sumy GitHub) package provides several non-neural summarization algorithms, mainly the methods available through generic_extractive_sumy(): lsa, luhn, lex_rank, text_rank, edmundson, and random.
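To illustrate the general idea behind frequency-based extractive methods such as luhn, here is a minimal, stdlib-only sketch. It is not sumy's implementation (sumy's versions are considerably more sophisticated, with stemming, stop-word lists per language, and algorithm-specific scoring); it only shows the core concept of scoring sentences by the frequency of their significant terms. The function name and stop-word list are illustrative, not taken from any library.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; real implementations use full
# per-language lists (e.g. those shipped with sumy).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
             "are", "was", "were", "it", "that", "this", "for", "on",
             "with", "as", "from", "by"}

def frequency_summarize(text, num_sentences=2):
    """Score each sentence by the average frequency of its non-stopword
    terms, then return the top sentences in their original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [w for w in re.findall(r"[a-z']+", text.lower())
             if w not in STOPWORDS]
    freq = Counter(words)
    scored = []
    for i, sent in enumerate(sentences):
        terms = [w for w in re.findall(r"[a-z']+", sent.lower())
                 if w not in STOPWORDS]
        score = sum(freq[t] for t in terms) / (len(terms) or 1)
        scored.append((score, i))
    # Pick the highest-scoring sentences, then restore document order.
    top = sorted(sorted(scored, reverse=True)[:num_sentences],
                 key=lambda pair: pair[1])
    return " ".join(sentences[i] for _, i in top)
```

Methods like lsa and lex_rank replace the frequency score with, respectively, singular value decomposition of a term-sentence matrix and eigenvector centrality over a sentence-similarity graph, but the select-top-sentences structure is the same.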

Note: The summa (Summa GitHub) package is used to extract keywords using the TextRank algorithm in keyword_based_ext().
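For intuition about what TextRank does for keyword extraction, the following is a minimal, stdlib-only sketch; it is not summa's implementation (summa adds part-of-speech filtering, lemmatization, and keyword merging). It builds a co-occurrence graph over words and runs a few PageRank iterations on it. The function name and parameter defaults are illustrative.

```python
import re
from collections import defaultdict

def textrank_keywords(text, top_n=5, window=2, damping=0.85, iters=30):
    """Rank words with PageRank over a co-occurrence graph: two words
    share an (undirected) edge if they appear within `window` positions
    of each other."""
    words = re.findall(r"[a-z]{3,}", text.lower())
    graph = defaultdict(set)
    for i in range(len(words)):
        for j in range(i + 1, min(i + window + 1, len(words))):
            if words[i] != words[j]:
                graph[words[i]].add(words[j])
                graph[words[j]].add(words[i])
    # Power iteration of the PageRank update on the word graph.
    scores = {w: 1.0 for w in graph}
    for _ in range(iters):
        scores = {
            w: (1 - damping) + damping * sum(
                scores[n] / len(graph[n]) for n in graph[w])
            for w in graph
        }
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [w for w, _ in ranked[:top_n]]
```

In practice summa also filters candidates by part of speech (keeping mostly nouns and adjectives) before ranking, which this sketch omits for brevity.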


All other models/algorithms are, to the best of my knowledge, novel and are directly implemented as part of this project. See Combination and Summarization for details.