Open Access archive

Developing a Hybrid Morphological Analyzer for Low-Resource Languages

Musica Supriya, Manipal Institute of Technology
Dinesh Acharya Udupi, Manipal Institute of Technology
Ashalatha Nayak, Manipal Institute of Technology
Arjuna Srirangapatna Raghavendra, Manipal Academy of Higher Education

Document Type

Article

Publication Title

Applied Sciences (Switzerland)

Abstract

Morphological analysis is a fundamental preliminary task for Natural Language Processing (NLP) applications involving speech and language. Kannada is a low-resource language of the Dravidian family; it is highly agglutinative and morphologically rich, and dataset development for it is accelerating with the growing demand for NLP tools. This study presents a hybrid approach that integrates rule-based and Transformer-based techniques, aiming to maximize their strengths while minimizing their respective limitations. Analyzing inflections in Kannada has been challenging because of this morphological richness; to address this, 85 paradigms are created using Apertium's Lttoolbox. A Transformer model is then trained on the generated nominal data to produce morphological analyses for out-of-vocabulary inflections. The hybrid approach extends easily to new words as they are added to the dictionary. On a test set of Kannada inflections, the approach achieves a precision of 0.924, a recall of 0.925, and an F1 score of 0.925. The main contributions are rule extraction for paradigm design at the word level; morphological analysis of nouns, verbs, adjectives, pronouns, and indeclinables on a benchmark dataset; and generation of morphological analyses using the Transformer architecture.
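
The control flow implied by the abstract (dictionary lookup first, a learned model for out-of-vocabulary forms) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the lexicon entries, the tag names, and the functions `analyze`, `neural_fallback`, and `add_entry` are all hypothetical, and the fallback is a stub standing in for a trained Transformer.

```python
from typing import Callable, Dict, Tuple

# Toy lexicon standing in for a compiled paradigm dictionary.
# Entries and tag names are hypothetical, chosen only for illustration.
LEXICON: Dict[str, str] = {
    "mane": "mane<n><sg><nom>",
    "maneyalli": "mane<n><sg><loc>",
}


def neural_fallback(surface: str) -> str:
    """Stub for a trained sequence-to-sequence model (assumption).

    A real system would decode an analysis string from the surface form;
    here the form is only marked as unanalysed.
    """
    return "*" + surface


def analyze(
    surface: str,
    lexicon: Dict[str, str],
    fallback: Callable[[str], str] = neural_fallback,
) -> Tuple[str, str]:
    """Return (analysis, source), where source is 'rule' or 'model'."""
    analysis = lexicon.get(surface)
    if analysis is not None:
        # Known inflection: answer comes from the rule-based dictionary.
        return analysis, "rule"
    # Out-of-vocabulary inflection: defer to the learned model.
    return fallback(surface), "model"


def add_entry(lexicon: Dict[str, str], surface: str, analysis: str) -> None:
    """Extend the dictionary; later lookups no longer reach the fallback."""
    lexicon[surface] = analysis
```

Under these assumptions, a form missing from the lexicon is routed to the model until `add_entry` registers it, after which the rule-based path answers directly. That mirrors the abstract's point that the approach extends as words are added to the dictionary.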

DOI

10.3390/app15105682

Publication Date

5-1-2025

Recommended Citation

Supriya, Musica; Acharya Udupi, Dinesh; Nayak, Ashalatha; and Srirangapatna Raghavendra, Arjuna, "Developing a Hybrid Morphological Analyzer for Low-Resource Languages" (2025). Open Access archive. 13261.
https://impressions.manipal.edu/open-access-archive/13261

This document is currently not available here.
