Automatic Extractive Single Document Summarization: A Systematic Mapping

Bibliographic Details
Main Authors: Yip-Herrera, Juan-David, Mendoza-Becerra, Martha-Eliana, Rodríguez, Francisco-Javier
Format: Online
Language: eng
Published: Universidad Pedagógica y Tecnológica de Colombia 2023
Online Access: https://revistas.uptc.edu.co/index.php/ingenieria/article/view/15232
Description
Summary: Automatic Extractive Single Document Summarization (AESDS) is a research area that aims to create a condensed version of a document containing its most relevant information. Its importance grows daily because users need to quickly obtain information from documents published on the Internet. In automatic document summarization, each element must be evaluated and ranked to generate a summary. Approaches fall into three groups according to the number of objectives they evaluate: single-objective, multi-objective, and many-objective. This systematic mapping aims to provide knowledge about the methods and techniques used for extractive AESDS, analyzing the number of objectives and the characteristics evaluated, which can be helpful for future research. The mapping followed a generic process for conducting systematic reviews, in which a search string was built from a set of research questions. Inclusion and exclusion criteria were then applied to select the primary studies on which the analysis was carried out. Additionally, these studies were sorted according to the relevance of their content. The process is summarized in three main steps: planning, execution, and result analysis. The mapping yielded the following observations: (i) there is a preference for machine learning methods and clustering techniques, (ii) it is important to use both types of characteristics (statistical and semantic), and (iii) the many-objective approach needs to be explored.
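
Illustrative sketch (not taken from the article): the summary states that each element of a document must be evaluated and ranked to generate a summary. The Python snippet below shows one possible single-objective reading of that idea, scoring each sentence with a single statistical characteristic (average word frequency) and keeping the top-ranked sentences. The function names, the naive sentence splitter, and the choice of scoring feature are all assumptions made for illustration. They do not represent any method surveyed in the mapping.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase alphabetic tokens only; no stemming or stop-word removal.
    return re.findall(r"[a-z]+", sentence.lower())


def summarize(text, k=2):
    """Return the k highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    # Document-level word frequencies: the single statistical feature used here.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, best first (the single objective).
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    # Restore document order among the selected sentences.
    return [sentences[i] for i in sorted(ranked[:k])]


if __name__ == "__main__":
    doc = "Cats chase mice. Cats sleep. Dogs bark loudly at night."
    print(summarize(doc, k=2))
```

A multi-objective or many-objective variant would replace the single `score` function with several criteria (for example coverage, redundancy, and position) that are optimized jointly rather than collapsed into one number.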