Unified extractive-abstractive summarization: a hybrid approach utilizing BERT and transformer models for enhanced document summarization
Document Type
Article
Publication Title
PeerJ Computer Science
Abstract
The rapid growth of digital documents has created a pressing need for automated document summarization (ADS). Summarization is a compression technique that condenses a source document into concise sentences capturing its salient information. A primary challenge lies in producing a dependable summary, which depends on both the extracted features and human-established parameters. This article introduces a methodology that integrates extractive and abstractive techniques to improve the relevance between the input document and its summary. First, input sentences are transformed into representations using BERT and then arranged into a symmetric matrix based on their similarity. Semantically similar sentences are then extracted from this matrix to construct an extractive summary. The transformer model integrates an objective function that is highly symmetric and invariant under unitary transformation for language generation. This model refines the extracted informative sentences and generates an abstractive summary akin to manually written summaries. We apply this hybrid summarization technique to the CNN/DailyMail and DUC2004 datasets and evaluate its efficacy using ROUGE metrics. Results demonstrate the superiority of the proposed technique over conventional summarization methods.
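The extractive stage described in the abstract (sentence representations, a symmetric similarity matrix, then selection of sentences from that matrix) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy vectors stand in for BERT sentence embeddings, cosine similarity is one common choice of similarity measure, and the centrality-based ranking in the hypothetical `select_sentences` helper is not taken from the article, whose abstract does not state the exact selection rule.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings):
    """Build a symmetric sentence-by-sentence similarity matrix."""
    n = len(embeddings)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = 1.0 if i == j else cosine(embeddings[i], embeddings[j])
            sim[i][j] = s
            sim[j][i] = s  # mirror the value so the matrix stays symmetric
    return sim


def select_sentences(sim, k):
    """Pick the k most central sentences (assumed scoring rule).

    Each sentence is scored by its total similarity to all other
    sentences; the indices of the top k are returned in document order.
    """
    n = len(sim)
    scores = [sum(sim[i][j] for j in range(n) if j != i) for i in range(n)]
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


# Toy 3-dimensional vectors standing in for BERT sentence embeddings.
embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.8, 0.2, 0.1],
]

sim = similarity_matrix(embeddings)
selected = select_sentences(sim, k=2)
print(selected)
```

In a fuller pipeline, the selected sentences would be passed to a transformer-based generator to produce the abstractive summary; that stage is omitted here because the abstract does not give enough detail to sketch it faithfully.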
First Page
1
Last Page
26
DOI
10.7717/peerj-cs.2424
Publication Date
1-1-2024
Recommended Citation
Divya, S.; Sripriya, N.; Andrew, J.; and Mazzara, Manuel, "Unified extractive-abstractive summarization: a hybrid approach utilizing BERT and transformer models for enhanced document summarization" (2024). Open Access archive. 10558.
https://impressions.manipal.edu/open-access-archive/10558