CS 4641 Project
Project Proposal
Introduction
In today’s world, the average consumer has access to millions of songs at the click of a button, a catalog that spans eras, artistic expressions, and themes. While this gives music enthusiasts an exciting opportunity to engage more deeply with music, it also makes it challenging to organize, classify, and recommend songs to individuals with varied tastes.
Our team is impressed by the amount of research that has been done in this area and wants to add to this literature. We have located several similar studies that each test different models, such as Convolutional Neural Networks (CNNs) [1] and Support Vector Machines (SVMs) [2]. We hope to compare a broader range of models in order to build a better genre classifier.
We plan to accomplish this by analyzing a dataset of 1700 spectrograms derived from songs that are 270-300 seconds in length. The spectrograms are sorted into a three-level classification hierarchy covering 16 distinct genres. Due to limitations in the dataset, our research will mainly be limited to English songs.
Problem Definition & Motivation
This project aims to classify songs by genre from their spectrograms, converting the visual data into numeric data. We have three classification subtasks: classical vs. non-classical, sub-genres (classical: symphony, opera, etc.; non-classical: pop, indie, rock, etc.), and sub-sub-genres (pop: teen pop, adult contemporary, etc.). We are motivated by the possibility that this work can help identify trends within genres, which would provide valuable insights for artists, labels, and consumers.
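As a rough illustration of how the three subtasks relate, the sketch below derives a label for each level from a single genre name. The `HIERARCHY` mapping and the `labels_for` helper are hypothetical and contain only the genre names mentioned above, not the full set of 16 genres in the dataset.

```python
# Illustrative hierarchy built only from the genre names mentioned above.
# It is intentionally incomplete and is an assumption, not the dataset schema.
HIERARCHY = {
    "classical": {"symphony": [], "opera": []},
    "non-classical": {
        "pop": ["teen pop", "adult contemporary"],
        "indie": [],
        "rock": [],
    },
}


def labels_for(genre):
    """Return (level 1, level 2, level 3) labels for a genre name.

    Level 3 is None when the genre is itself a level-2 sub-genre.
    """
    for top, subgenres in HIERARCHY.items():
        for sub, leaves in subgenres.items():
            if genre == sub:
                return (top, sub, None)
            if genre in leaves:
                return (top, sub, genre)
    raise KeyError(f"unknown genre: {genre!r}")


print(labels_for("teen pop"))
print(labels_for("opera"))
```

Each subtask can then be trained on the corresponding element of the returned tuple, so one annotated genre yields targets for all three levels.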
Methods
The most promising method is likely the Convolutional Neural Network, a supervised learning method. CNNs perform well on image recognition tasks, and since spectrograms are images that represent songs, CNNs may achieve good results [3]. We also plan to try traditional Machine Learning algorithms such as SVMs and Random Forests. To use these methods, we would extract features from the spectrograms to serve as inputs to these supervised algorithms. Depending on the features we choose after some exploratory data analysis, we may also apply PCA to reduce the dimensionality of our data and improve performance. On the unsupervised learning front, we may gain some insights from trying various clustering algorithms to group similar spectrograms.
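The feature-extraction and PCA steps described above could look roughly like the following sketch. It assumes each spectrogram is available as a 2-D NumPy array (frequency by time). The band statistics, the 64 x 128 array size, and the helper names `band_features` and `pca_project` are illustrative choices, not decisions the team has made, and random arrays stand in for real spectrograms.

```python
import numpy as np


def band_features(spec, n_bands=8):
    """Summarize a 2-D spectrogram (frequency x time) as a 1-D feature vector.

    The frequency axis is split into n_bands groups; each group contributes
    its mean and standard deviation, giving 2 * n_bands features.
    """
    bands = np.array_split(spec, n_bands, axis=0)
    means = [band.mean() for band in bands]
    stds = [band.std() for band in bands]
    return np.array(means + stds)


def pca_project(X, n_components):
    """Project the rows of X onto their top principal components via SVD."""
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Random arrays stand in for real spectrograms (hypothetical 64 x 128 size).
rng = np.random.default_rng(0)
spectrograms = [rng.random((64, 128)) for _ in range(20)]

X = np.stack([band_features(s) for s in spectrograms])
X_reduced = pca_project(X, n_components=4)
print(X.shape, X_reduced.shape)
```

The reduced matrix `X_reduced` is the kind of input a traditional classifier such as an SVM or Random Forest would be trained on. A CNN would instead consume the spectrogram images directly.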
Potential Results
As noted in the previous section, we believe that Convolutional Neural Networks will be able to classify songs with a high degree of accuracy [3]. We will likely find that the success of traditional Machine Learning algorithms depends heavily on the features we select and create from the spectrograms, so we do not expect feature engineering to be entirely straightforward. To judge how good each method is, we will look at the following metrics:
- Accuracy
- F1 Score
- Area under the ROC curve
We will also visualize our results with a confusion matrix.
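For reference, the metrics listed above can be computed as in the minimal sketch below. The labels and scores are made up for illustration and are not project results. Macro averaging for the F1 score and the pairwise-ranking form of the binary AUC are assumptions about how the metrics would be applied.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def roc_auc(y_true, scores, positive):
    """Binary AUC: chance that a positive outranks a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up multi-class predictions, for illustration only.
labels = ["pop", "rock", "opera"]
y_true = ["pop", "pop", "rock", "rock", "opera", "opera"]
y_pred = ["pop", "rock", "rock", "rock", "opera", "pop"]
print(confusion_matrix(y_true, y_pred, labels))
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred, labels))

# Made-up binary scores for the classical vs. non-classical subtask.
y_bin = ["classical", "classical", "non-classical", "non-classical"]
p_classical = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(y_bin, p_classical, positive="classical"))
```

Because the AUC here is defined for a binary problem, it applies most directly to the classical vs. non-classical subtask. The multi-class subtasks would need a one-vs-rest or similar extension.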
References
- [1] N. M R and S. Mohan B S, “Music Genre Classification using Spectrograms,” 2020 International Conference on Power, Instrumentation, Control and Computing (PICC), Thrissur, India, 2020, pp. 1-5, doi: 10.1109/PICC51425.2020.9362364.
- [2] Y. Costa, L. Soares de Oliveira, A. L. Koerich, and F. Gouyon, “Music genre recognition using spectrograms,” Intl. Conf. on Systems, Signal and Image Processing, 2011, pp. 1-4.
- [3] M. Dong, “Convolutional Neural Network Achieves Human-level Accuracy in Music Genre Classification,” CoRR, vol. abs/1802.09697, 2018.
- [4] Z. Liu and Z. Li, “Music Data Sharing Platform for Computational Musicology Research (CCMUSIC DATASET),” Zenodo, Nov. 12, 2021, doi: 10.5281/ZENODO.5676893.
Dataset
Our dataset can be found here: https://huggingface.co/datasets/ccmusic-database/music_genre.
Proposed Timeline
This timeline is subject to change. A more detailed version can be found here.
TASK TITLE | TASK OWNER | START DATE | DUE DATE |
---|---|---|---|
Project Proposal | |||
Introduction & Background | James DiPrimo | 9/27/2023 | 10/6/2023 |
Problem Definition | Anirudh Ramesh | 9/27/2023 | 10/6/2023 |
Methods | Siddhant Dubey | 9/27/2023 | 10/6/2023 |
Timeline | Soongeol Kang | 9/27/2023 | 10/6/2023 |
Potential Results & Discussion | Joseph Campbell | 9/27/2023 | 10/6/2023 |
Video Recording | Siddhant Dubey | 9/27/2023 | 10/6/2023 |
GitHub Page | Siddhant Dubey | 9/27/2023 | 10/6/2023 |
Model 1 (GMM or K-means) | |||
Data Sourcing and Cleaning | James DiPrimo | 10/7/2023 | 10/13/2023 |
Model Selection | All | 10/13/2023 | 10/16/2023 |
Data Pre-Processing | Anirudh Ramesh | 10/16/2023 | 10/23/2023 |
Model Coding | Siddhant Dubey and Anirudh Ramesh | 10/23/2023 | 10/30/2023 |
Results Evaluation and Analysis | Joseph Campbell | 10/30/2023 | 11/2/2023 |
Midterm Report | Everyone | 10/31/2023 | 11/3/2023 |
Model 2 (CNN) | |||
Model Coding | Siddhant Dubey, Joseph Campbell | 10/28/2023 | 11/4/2023 |
Results Evaluation | Soongeol Kang | 11/5/2023 | 11/8/2023 |
Analysis | James DiPrimo | 11/6/2023 | 11/9/2023 |
Model 3 (SVMs) | |||
Midterm Report | Everyone | 11/3/2023 | 11/11/2023 |
Model Coding | Soongeol Kang, James DiPrimo | 11/11/2023 | 11/18/2023 |
Results Evaluation | Anirudh Ramesh | 11/18/2023 | 11/21/2023 |
Analysis | Joseph Campbell | 11/19/2023 | 11/22/2023 |
Model 4 (Random Forests) | |||
Model Coding | James DiPrimo, Anirudh Ramesh | 11/15/2023 | 11/22/2023 |
Results Evaluation | Siddhant Dubey | 11/20/2023 | 11/23/2023 |
Analysis | Soongeol Kang | 11/21/2023 | 11/24/2023 |
Evaluation | |||
Model Comparison | Everyone | 11/29/2023 | 12/4/2023 |
Presentation | Everyone | 12/1/2023 | 12/6/2023 |
Recording | Everyone | 12/6/2023 | 12/7/2023 |
Final Report | Everyone | 12/2/2023 | 12/8/2023 |
Contribution Table
Contribution | Person |
---|---|
Introduction | James |
Problem Statement | Anirudh |
Methods | Siddhant |
Potential Results | Joseph |
Proposed Timeline | Soongeol |
Finding Datasets | Everyone |
Finding Papers | Everyone |