Ontology-Guided Hypothesis Generation Using LLMs and Topic Modeling in mHealth Research
Document Type
Article
Publication Title
IEEE Access
Abstract
This study proposes a semantic pipeline designed to generate domain-oriented and contextually relevant hypotheses by analyzing existing literature on mHealth applications in India. Using a corpus of mHealth texts, the framework extracts hidden semantics through TF-IDF, topic modeling, and contextual mapping with domain ontologies. It then employs prompt-based interactions with large language models (LLMs) to systematically generate and validate hypotheses aligned with identified topic-concept relationships. The results demonstrate the framework’s effectiveness in producing high-quality, structured hypotheses, as validated by expert ratings ranging from 4.2 to 4.6. Most hypotheses were found to be plausible or highly plausible, with low semantic redundancy indicating diversity across topics, except in stakeholder-related areas, which showed moderate overlap. Although the inclusion of semantic augmentation increased processing time, it significantly enhanced interpretability and validity. The high lexical density observed (up to 0.90) further reflects the linguistic flexibility of the generated hypotheses. This approach underscores the potential of computational methods in automating hypothesis generation and enabling data-driven discoveries in the mHealth domain.
First Page
200725
Last Page
200736
DOI
10.1109/ACCESS.2025.3636466
Publication Date
1-1-2025
Recommended Citation
Vibha; Pai, Rajesh R.; and Sumith, N., "Ontology-Guided Hypothesis Generation Using LLMs and Topic Modeling in mHealth Research" (2025). Open Access archive. 13907.
https://impressions.manipal.edu/open-access-archive/13907