Assamese Dialect Identification System using Convolution Neural Networks

Hem Chandra Das; Kshirod Sarmah; Deepak Hajoary; Raju Narzary; Rinku Basumatary

doi:10.15379/ijmst.v10i2.3519

Authors

Hem Chandra Das, Department of Computer Science and Technology, Bodoland University, Kokrajhar, 783370, Assam, India
Kshirod Sarmah, Department of Computer Science, Pandit Deendayal Upadhyaya Adarsha Mahavidyalaya (A Govt. Model College), Goalpara, 783124, Assam, India
Deepak Hajoary, Department of Management Studies, Bodoland University, Kokrajhar, 783370, Assam, India
Raju Narzary, Department of Computer Science and Technology, Bodoland University, Kokrajhar, 783370, Assam, India
Rinku Basumatary, Department of Computer Science and Technology, Bodoland University, Kokrajhar, 783370, Assam, India

Keywords:

ADID, Mel Spectrogram, Classification, Assamese Dialect, Machine Learning

Abstract

The aim of a dialect identification system is to label the speech in an audio file with the appropriate dialect. This paper presents a method that uses a convolutional neural network (CNN) to identify four Assamese dialects: the Goalporia, Kamrupi, Eastern Assamese, and Central Assamese dialects. The study uses speech from these four major regional dialects: the Central dialect, spoken in and around the district of Nagaon; the Eastern Assamese dialect, spoken in the district of Sibsagar and its neighboring areas; the Kamrupi dialect, spoken in the districts of Kamrup, Nalbari, Barpeta, Kokrajhar, and some areas of Bongaigaon; and the Goalporia dialect, spoken in the Goalpara and Dhubri districts and a portion of Bongaigaon district. Two hours of audio samples from each of the four dialects were used to train the classifier. The CNN takes as input Mel spectrogram images produced from two- to four-second segments of raw audio of varying quality. The system's performance is also analyzed in relation to the lengths of the training and test audio samples. The proposed CNN model achieves an accuracy of 90.82 percent, which may be the best result when compared with other machine learning models.
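
The abstract describes cutting raw audio into two- to four-second segments and converting each segment into a Mel spectrogram image for the CNN. The following is a minimal NumPy sketch of that preprocessing step, not the authors' implementation. The 16 kHz sampling rate, the three-second segment length, the FFT size, the hop size, and the number of Mel bands are all assumptions chosen for illustration, and the function names are hypothetical; the input here is a synthetic 1 kHz tone rather than real speech.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style Hz-to-mel conversion.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose centres are equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def split_segments(signal, sr, seconds):
    # Cut the signal into non-overlapping fixed-length segments,
    # dropping any incomplete segment at the end.
    seg_len = int(sr * seconds)
    n = len(signal) // seg_len
    return signal[: n * seg_len].reshape(n, seg_len)

def log_mel_spectrogram(segment, sr, n_fft=512, hop=256, n_mels=64):
    # Frame the segment, apply a Hann window, take the power spectrum,
    # project onto the mel filterbank, and convert to decibels.
    n_frames = 1 + (len(segment) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = segment[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2    # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T   # (n_mels, n_frames)
    return 10.0 * np.log10(mel + 1e-10)

# Synthetic stand-in for a recording: 7 s of a 1 kHz tone at 16 kHz (assumed values).
sr = 16000
t = np.arange(7 * sr) / sr
signal = np.sin(2.0 * np.pi * 1000.0 * t)

segments = split_segments(signal, sr, seconds=3.0)
spec = log_mel_spectrogram(segments[0], sr)
print(segments.shape)  # (2, 48000)
print(spec.shape)      # (64, 186)
```

Each resulting array has one row per Mel band and one column per time frame, so it can be treated as a single-channel image and passed to a CNN classifier. In practice a library such as librosa would typically replace the hand-written filterbank.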
