Implementation of Music Emotion Classification using Deep Learning
DOI:
https://doi.org/10.58915/ijaris.v1i1.2258

Keywords:
Music Emotion Classification, Deep Learning, CNN, CNN-LSTM, CNN-GRU, MFCC Extraction, Spectral Contrast

Abstract
Music plays a crucial role in shaping emotions and experiences, making its classification an important area of research with applications in therapy, recommendation systems, and affective computing. This study develops a deep learning-based system to classify music into three emotional categories: "Angry," "Happy," and "Sad." The dataset, consisting of 22 audio files collected from YouTube, was manually labelled, segmented into 30-second clips, and augmented with pitch shifting and time stretching to increase its diversity. Features were extracted using Mel-Frequency Cepstral Coefficients (MFCC) and spectral contrast to capture the harmonic and timbral characteristics of the audio. Three deep learning architectures were evaluated: CNN, CNN-LSTM, and CNN-GRU. The CNN-GRU model achieved the highest weighted accuracy of 99.10%, outperforming the CNN and CNN-LSTM models. Future work includes adding more emotion categories, diversifying the dataset, exploring advanced architectures such as transformers, optimising hyperparameters, implementing real-time applications, and conducting user studies to assess effectiveness. This research successfully developed and evaluated a music emotion classification system, contributing to advancements in the field.
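
As a rough illustration of the feature pipeline described above, the sketch below shows how a 30-second clip might be augmented with pitch shifting and time stretching and converted into MFCC and spectral contrast features using the librosa library. The file path, number of MFCCs, shift and stretch amounts, and helper names are illustrative assumptions, not values reported in the paper.

# Minimal sketch (assumed parameters, not the authors' exact settings):
# augment a 30-second clip and extract MFCC + spectral contrast features.
import numpy as np
import librosa

def augment(y, sr):
    """Return the original clip plus pitch-shifted and time-stretched versions."""
    return [
        y,
        librosa.effects.pitch_shift(y, sr=sr, n_steps=2),  # shift up 2 semitones (assumed)
        librosa.effects.time_stretch(y, rate=1.1),         # speed up by 10% (assumed)
    ]

def extract_features(y, sr, n_mfcc=13):
    """Stack frame-level MFCC and spectral contrast into one feature matrix."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)    # shape (n_mfcc, frames)
    contrast = librosa.feature.spectral_contrast(y=y, sr=sr)  # shape (7, frames)
    return np.vstack([mfcc, contrast]).T                      # shape (frames, n_mfcc + 7)

# Hypothetical usage on one labelled 30-second clip
y, sr = librosa.load("clips/happy_001.wav", duration=30.0)
features = [extract_features(clip, sr) for clip in augment(y, sr)]

A CNN-GRU classifier of the kind compared in the study could be sketched in Keras as follows; the layer sizes, input shape, and training configuration are assumptions for illustration, not the architecture reported by the authors.

# Illustrative CNN-GRU sketch (layer sizes and input shape are assumed).
from tensorflow.keras import layers, models

n_frames, n_features = 1292, 20  # assumed: ~30 s of frames x (13 MFCC + 7 contrast bands)

model = models.Sequential([
    layers.Input(shape=(n_frames, n_features)),
    layers.Conv1D(64, kernel_size=3, activation="relu"),  # local spectral patterns
    layers.MaxPooling1D(pool_size=2),
    layers.GRU(64),                                        # temporal dynamics across frames
    layers.Dense(3, activation="softmax"),                 # "Angry" / "Happy" / "Sad"
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

Swapping the GRU layer for an LSTM layer gives the CNN-LSTM variant compared in the study.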