Ontology-Guided Hypothesis Generation Using LLMs and Topic Modeling in mHealth Research

Document Type

Article

Publication Title

IEEE Access

Abstract

This study proposes a semantic pipeline designed to generate domain-oriented and contextually relevant hypotheses by analyzing existing literature on mHealth applications in India. Using a corpus of mHealth texts, the framework extracts hidden semantics through TF-IDF, topic modeling, and contextual mapping with domain ontologies. It then employs prompt-based interactions with large language models (LLMs) to systematically generate and validate hypotheses aligned with identified topic-concept relationships. The results demonstrate the framework’s effectiveness in producing high-quality, structured hypotheses, as validated by expert ratings ranging from 4.2 to 4.6. Most hypotheses were found to be plausible or highly plausible, with low semantic redundancy indicating diversity across topics, except in stakeholder-related areas, which showed moderate overlap. Although semantic augmentation increased processing time, it significantly enhanced interpretability and validity. The high lexical density observed (up to 0.90) further reflects the linguistic flexibility of the generated hypotheses. This approach underscores the potential of computational methods in automating hypothesis generation and enabling data-driven discoveries in the mHealth domain.

First Page

200725

Last Page

200736

DOI

10.1109/ACCESS.2025.3636466

Publication Date

1-1-2025

