From Vision to Voice: A Multi-Modal Assistive Framework for the Physically Impaired

Document Type

Article

Publication Title

IEEE Access

Abstract

Providing people with visual and physical impairments access to textual content remains a difficult challenge. This work presents a desktop assistive system that automatically converts text found in images into audible speech. The application is developed in Python with a Tkinter-based interface, uses Tesseract OCR for optical character recognition, and captures images in real time through OpenCV. Through the googletrans library, the system supports multilingual operation, translating extracted text among the more than 100 languages accessible via Google Translate. Extracted or translated text is converted to speech with Google Text-to-Speech (gTTS) and played back as an .mp3 file through the system's default media player. The interface supports intuitive interaction through hover effects, accessible controls, and a dropdown menu for language selection. Its extensible design delivers multilingual text-to-speech capabilities that prove useful in assistive technology applications for accessibility needs.
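The pipeline the abstract describes (OCR an image, translate the text, synthesize speech) can be sketched as follows. This is not the authors' code: the function names and wiring are assumptions, and the three stages are injected as callables so the flow can be shown and exercised without a Tesseract install or network access. The comments note where the real libraries (pytesseract, googletrans, gTTS) would plug in.

```python
# Minimal sketch of the image-to-speech pipeline: OCR -> translate -> TTS.
# The stage functions are injected so the wiring is testable offline;
# in the described system they would be pytesseract, googletrans, and gTTS.
from typing import Callable

def image_to_speech(image_path: str,
                    ocr: Callable[[str], str],
                    translate: Callable[[str, str], str],
                    speak: Callable[[str], None],
                    dest_lang: str = "en") -> str:
    """Extract text from an image, translate it, speak it, and return it."""
    text = ocr(image_path).strip()           # e.g. pytesseract.image_to_string(path)
    if not text:
        return ""                            # nothing recognizable in the image
    translated = translate(text, dest_lang)  # e.g. Translator().translate(text, dest=...).text
    speak(translated)                        # e.g. gTTS(translated, lang=...).save("out.mp3")
    return translated

# Stub stages stand in for the real libraries in this demo.
spoken = []
result = image_to_speech(
    "page.png",
    ocr=lambda path: "Hola mundo",
    translate=lambda text, dest: "Hello world" if dest == "en" else text,
    speak=spoken.append,
)
```

With the real libraries, `ocr` would wrap `pytesseract.image_to_string` on an OpenCV frame, `translate` would call googletrans, and `speak` would save a gTTS .mp3 and launch the default media player.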

First Page

128106

Last Page

128121

DOI

10.1109/ACCESS.2025.3590237

Publication Date

1-1-2025

