Data2Summ: review on enhancing summarization with transformers and attention models for speech and text compression

Document Type

Article

Publication Title

Physica Scripta

Abstract

Today, nearly 120 zettabytes of data are generated annually, with a significant portion coming from speech and text. Digesting such vast amounts of information is a daunting task for the average person, and various speech and text summarization techniques have been developed to ease this burden. Summarization is a crucial task that aims to condense the key information in a text while preserving its overall meaning. This work develops Data2Summ, an efficient approach to summarizing speech and text that leverages transformers and advanced attention models. These models capture intricate contextual relationships between words, enabling a nuanced understanding of the input data. The methodology incorporates several models, including BERT, Pegasus, Llama-2, RoBERTa2RoBERTa, T5, and DistilBERT, which harness transformer architectures that have demonstrated exceptional performance across a range of natural language processing tasks. Additionally, we experiment with a modified version of DistilBERT, which shows improved performance compared to the other transformers. Evaluation includes benchmarking against existing summarization techniques and assessing the quality, coherence, and informativeness of the generated summaries using ROUGE scores.
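
As a rough illustration of the kind of pipeline the abstract describes, the sketch below runs a generic transformer summarizer and scores its output with ROUGE. It assumes the Hugging Face transformers and rouge_score packages are installed; the t5-small checkpoint, the sample text, and the reference summary are placeholders chosen for illustration and are not the Data2Summ models or data reported in the paper.

# Illustrative sketch only: a generic transformer summarization pipeline with
# ROUGE evaluation. The model, input text, and reference summary are
# placeholders, not the Data2Summ configuration described in the paper.
from transformers import pipeline
from rouge_score import rouge_scorer

# Any seq2seq summarization checkpoint could be substituted here
# (e.g. a Pegasus or T5 variant); "t5-small" is used purely as an example.
summarizer = pipeline("summarization", model="t5-small")

document = (
    "Transformers use self-attention to model contextual relationships "
    "between words, which makes them well suited to condensing long "
    "speech transcripts and text documents into short summaries."
)
reference = "Transformers use self-attention to summarize long speech and text."

# Generate an abstractive summary of the input document.
summary = summarizer(document, max_length=40, min_length=10, do_sample=False)[0]["summary_text"]

# Score the generated summary against the reference with ROUGE-1/2/L.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, summary)

print(summary)
for name, result in scores.items():
    print(f"{name}: F1 = {result.fmeasure:.3f}")

The same evaluation loop generalizes to any of the models named in the abstract by swapping the checkpoint passed to the pipeline.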

DOI

10.1088/1402-4896/adb24c

Publication Date

3-1-2025
