Data2Summ: review on enhancing summarization with transformers and attention models for speech and text compression
Document Type
Article
Publication Title
Physica Scripta
Abstract
Today, nearly 120 zettabytes of data are generated each year, with a significant portion coming from speech and text. Digesting such vast amounts of data is a daunting task for the average person. To ease this challenge, various speech and text summarization techniques are being developed. Summarization is a crucial task that aims to condense the key information in a text while preserving its overall meaning. This work focuses on developing Data2Summ, an efficient approach for summarizing speech and text by leveraging transformers and advanced attention models. These models capture intricate contextual relationships between words, enabling a nuanced understanding of the input data. The methodology incorporates models such as BERT, Pegasus, Llama-2, RoBERTa2RoBERTa, T5, and DistilBERT, which harness the power of transformers and have demonstrated exceptional performance across a wide range of natural language processing tasks. Additionally, we experiment with a modified version of DistilBERT, which shows improved performance compared to the other transformers. Evaluation includes benchmarking against existing summarization techniques and assessing the quality, coherence, and informativeness of the generated summaries using ROUGE scores.
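The abstract describes generating summaries with pretrained transformer models and scoring them with ROUGE. The sketch below is not the authors' implementation; it is a minimal illustration, assuming the Hugging Face transformers and rouge-score packages and an off-the-shelf T5 checkpoint, of how a transformer summarizer can be run and evaluated against a reference summary. The model name, sample text, and reference are placeholders.

```python
# Minimal sketch (not the paper's code): abstractive summarization with a
# pretrained transformer and ROUGE evaluation.
from transformers import pipeline
from rouge_score import rouge_scorer

# Load a pretrained summarization model; "t5-small" is an illustrative choice.
summarizer = pipeline("summarization", model="t5-small")

# Placeholder input document and human-written reference summary.
document = (
    "Transformers capture contextual relationships between words through "
    "self-attention, which makes them well suited to condensing long "
    "speech transcripts and text documents into short summaries."
)
reference = "Transformers use self-attention to summarize long text."

# Generate an abstractive summary of the document.
generated = summarizer(document, max_length=32, min_length=8, do_sample=False)[0]["summary_text"]

# Score the generated summary against the reference with ROUGE-1/2/L.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, generated)
for metric, result in scores.items():
    print(f"{metric}: precision={result.precision:.3f} "
          f"recall={result.recall:.3f} f1={result.fmeasure:.3f}")
```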
DOI
10.1088/1402-4896/adb24c
Publication Date
3-1-2025
Recommended Citation
Krishna, Dasari Siva; Srikanth, Panigrahi; Rao, Routhu Srinivasa; and Holla M, Raviraja, "Data2Summ: review on enhancing summarization with transformers and attention models for speech and text compression" (2025). Open Access archive. 13640.
https://impressions.manipal.edu/open-access-archive/13640