Audio Signal Classification Using Deep Learning

Omaima Boudaia; Deepak Chakrasali; Suhas Chandra Thejasvi N; Manoj Kumar C; Srivathsa D Bharadwaj

doi:10.61927/igmin345

Engineering Group | Research Article | Article ID: igmin345

Artificial Intelligence | DOI: 10.61927/igmin345

Omaima Boudaia*, Deepak Chakrasali, Suhas Chandra Thejasvi N, Manoj Kumar C and Srivathsa D Bharadwaj

Affiliation

Department of CSE (AI&ML), ATME College of Engineering, Mysore, Karnataka, India

Abstract

Audio signal classification underpins real-world applications such as speech recognition, environmental sound analysis, and music genre identification. Traditional approaches often depend on manually extracted features, which may not capture the full complexity of audio data. This paper presents a deep learning-based method for automatic audio signal classification using a one-dimensional convolutional neural network (1D-CNN) and a recurrent neural network (RNN). The CNN extracts spatial features from spectrogram representations, while the RNN captures temporal dependencies within the audio sequences. Both models were trained and evaluated on a labelled dataset, and their performance was compared using accuracy, precision, probability of detection (POD), and F1-score. The experimental results show that the CNN achieved higher classification accuracy than the RNN, with the CNN excelling at spatial feature extraction and the RNN contributing temporal feature learning. These results indicate that deep learning models can improve the performance and reliability of audio signal classification systems.
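As an illustration of the evaluation metrics named in the abstract, the sketch below computes accuracy and per-class precision, POD, and F1-score from true and predicted labels. It is not the authors' code: it assumes POD is the standard hit rate TP / (TP + FN), which is the same quantity as recall, and the class names, label lists, and function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of clips whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, positive):
    """One-vs-rest precision, POD (recall) and F1-score for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    pod = tp / (tp + fn) if tp + fn else 0.0  # probability of detection
    denom = precision + pod
    f1 = 2 * precision * pod / denom if denom else 0.0
    return precision, pod, f1


# Hypothetical labels for six audio clips (not data from the paper).
y_true = ["speech", "music", "speech", "noise", "music", "speech"]
y_pred = ["speech", "speech", "speech", "noise", "music", "music"]

if __name__ == "__main__":
    print("accuracy:", round(accuracy(y_true, y_pred), 3))
    for label in ("speech", "music", "noise"):
        p, pod, f1 = per_class_metrics(y_true, y_pred, label)
        print(label, round(p, 3), round(pod, 3), round(f1, 3))
```

Reporting these values per class, rather than accuracy alone, shows whether a model such as the CNN or the RNN misses a particular class (low POD) or over-predicts it (low precision).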
