We see the best clustering performance when dividing the data into two clusters (ideally separating classical from non-classical), but performance remains relatively poor even with eight clusters.
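To make that comparison concrete, below is a minimal sketch of how cluster counts can be compared with the silhouette score. The matrix `X` is a random placeholder for our real feature matrix, so the values are purely illustrative.

```python
# Minimal sketch: comparing cluster counts with the silhouette score.
# `X` is a random placeholder for our real (n_songs, n_features) matrix.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 700))  # placeholder feature matrix

for k in (2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(f"k={k}: silhouette = {silhouette_score(X, labels):.3f}")
```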
As can be seen in Figure 2 below (ground truth), there is some separation between the classical genres (symphony, opera, solo, and chamber) and the non-classical genres (pop, dance & house, soul/R&B, and rock), with the former mostly occupying the near bottom-right octant of the graph in the displayed orientation. The non-classical genres show a much larger spread, which is unsurprising given the wider range of musical techniques and styles they span compared to classical music. This visualization was generated using PCA, which reduced the feature count from over 700 down to three.
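For reference, a minimal sketch of the PCA projection behind this kind of visualization is shown below; `X` and `genres` are illustrative placeholders, not our actual data-loading code.

```python
# Minimal sketch of the 3-component PCA projection behind Figure 2.
# `X` and `genres` are placeholders standing in for our real data.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 720))                # placeholder features
genres = np.array(["classical", "pop"] * 100)  # placeholder labels

X3 = PCA(n_components=3).fit_transform(X)      # 700+ dims -> 3

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
for genre in np.unique(genres):
    pts = X3[genres == genre]
    ax.scatter(pts[:, 0], pts[:, 1], pts[:, 2], label=genre, s=10)
ax.legend()
plt.show()
```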
As discussed in the quantitative metrics section above, the clustering generally performed poorly, with low purity scores and a low silhouette score even for just two clusters. Unsurprisingly, we see no trends within the clusters: in Figure 3 there is no grouping that reflects the distinctions noted above. As we discuss in the next section, there is significant room for growth in our feature selection, which we hope will benefit both our supervised learning methods and our clustering.
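Since purity is not built into scikit-learn, here is a minimal sketch of how it can be computed from a contingency matrix; `y_true` and `y_pred` are illustrative placeholders.

```python
# Minimal sketch of the purity metric referenced above: each cluster is
# credited with its majority genre. `y_true` holds integer-encoded genres,
# `y_pred` holds cluster assignments; both are illustrative values.
import numpy as np
from sklearn.metrics.cluster import contingency_matrix

def purity(y_true, y_pred):
    m = contingency_matrix(y_true, y_pred)
    return m.max(axis=0).sum() / m.sum()

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 0, 1, 1, 1])
print(f"purity = {purity(y_true, y_pred):.3f}")  # 4/6 ~ 0.667
```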
There is still significant room for improvement before this project is a viable genre classifier. Going forward, we plan to focus our work in two main areas: improving our feature selection process and improving our models.
Right now, our feature selection is quite limited and does not do a good job of separating the data into distinct genres. Based on our current research, we have decided to experiment with histograms of oriented gradients (HoG) [6] for our classification. HoG is a feature extraction technique commonly used in image recognition, which we hope will translate well to the spectrogram images included in the dataset. However, it will also greatly increase dimensionality compared to our current features, so we may need to use PCA [7] to reduce the HoG features further; a sketch of this pipeline follows.
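Below is a minimal sketch of that HoG-then-PCA pipeline using scikit-image and scikit-learn. The spectrogram array and the HoG parameters are illustrative assumptions, not tuned choices.

```python
# Minimal sketch of the planned HoG -> PCA feature pipeline.
# The spectrograms are random placeholders; HoG parameters are untuned.
import numpy as np
from skimage.feature import hog
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
spectrograms = rng.random((50, 128, 128))  # placeholder spectrogram images

hog_feats = np.array([
    hog(img, orientations=9, pixels_per_cell=(16, 16), cells_per_block=(2, 2))
    for img in spectrograms
])
reduced = PCA(n_components=32).fit_transform(hog_feats)  # shrink HoG dims
print(hog_feats.shape, "->", reduced.shape)
```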
Furthermore, we are going to branch into supervised learning techniques for the second half of this project. These techniques take advantage of the fact that our data is already labeled, using the labels to learn how to classify the songs. So far, we have identified five models we would like to experiment with: logistic regression, support vector machines (SVMs), random forests, gradient boosting, and convolutional neural networks (CNNs). We expect each of these to perform substantially better than our current clustering techniques.
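As a rough starting point, four of these models can be compared with scikit-learn defaults. The sketch below uses placeholder data, and the CNN is omitted since it will require a separate deep learning framework; the results are illustrative only.

```python
# Minimal sketch of the planned supervised model comparison.
# `X` and `y` are random placeholders; hyperparameters are defaults.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))    # e.g., PCA-reduced HoG features
y = rng.integers(0, 8, size=200)  # eight genre labels

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "random forest": RandomForestClassifier(),
    "gradient boosting": GradientBoostingClassifier(),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```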
This timeline is subject to change. A more detailed version can be found here.
| TASK TITLE | TASK OWNER | START DATE | DUE DATE |
|---|---|---|---|
| Project Proposal | | | |
| Introduction & Background | James DiPrimo | 9/27/2023 | 10/6/2023 |
| Problem Definition | Anirudh Ramesh | 9/27/2023 | 10/6/2023 |
| Methods | Siddhant Dubey | 9/27/2023 | 10/6/2023 |
| Timeline | Soongeol Kang | 9/27/2023 | 10/6/2023 |
| Potential Results & Discussion | Joseph Campbell | 9/27/2023 | 10/6/2023 |
| Video Recording | Siddhant Dubey | 9/27/2023 | 10/6/2023 |
| GitHub Page | Siddhant Dubey | 9/27/2023 | 10/6/2023 |
| Model 1 (GMM or K-means) | | | |
| Data Sourcing and Cleaning | Siddhant Dubey and Anirudh Ramesh | 10/7/2023 | 10/13/2023 |
| Model Selection | Everyone | 10/13/2023 | 10/16/2023 |
| Data Pre-Processing | Siddhant Dubey and Anirudh Ramesh | 10/16/2023 | 10/23/2023 |
| Model Coding | Siddhant Dubey and Anirudh Ramesh | 10/23/2023 | 10/30/2023 |
| Visualizations | James DiPrimo | 10/30/2023 | 11/2/2023 |
| Quantitative Metrics, Analysis (Results Evaluation) | Joseph Campbell | 10/30/2023 | 11/2/2023 |
| Describe Data Set, Revise, Update Timeline/Contribution Table (Midterm Report) | Soongeol Kang | 10/31/2023 | 11/3/2023 |
| Midterm Report | Everyone | 10/31/2023 | 11/3/2023 |
| Model 2 (CNN) | | | |
| Model Coding | Siddhant Dubey and Anirudh Ramesh | 10/28/2023 | 11/4/2023 |
| Results Evaluation | Everyone | 11/5/2023 | 11/8/2023 |
| Visualizations | James DiPrimo | 11/5/2023 | 11/8/2023 |
| Quantitative Metrics, Analysis (Results Evaluation) | Joseph Campbell | 11/5/2023 | 11/8/2023 |
| Describe Data Set, Revise, Update Timeline/Contribution Table (Midterm Report) | Soongeol Kang | 11/6/2023 | 11/9/2023 |
| Analysis | Joseph Campbell | 11/6/2023 | 11/9/2023 |
| Model 3 (SVMs) | | | |
| Midterm Report | Everyone | 11/3/2023 | 11/11/2023 |
| Model Coding | Soongeol Kang and James DiPrimo | 11/11/2023 | 11/18/2023 |
| Results Evaluation | Anirudh Ramesh | 11/18/2023 | 11/21/2023 |
| Analysis | Joseph Campbell | 11/19/2023 | 11/22/2023 |
| Model 4 (Random Forests) | | | |
| Model Coding | James DiPrimo and Anirudh Ramesh | 11/15/2023 | 11/22/2023 |
| Results Evaluation | Siddhant Dubey | 11/20/2023 | 11/23/2023 |
| Analysis | Soongeol Kang | 11/21/2023 | 11/24/2023 |
| Evaluation | | | |
| Model Comparison | Everyone | 11/29/2023 | 12/4/2023 |
| Presentation | Everyone | 12/1/2023 | 12/6/2023 |
| Recording | Everyone | 12/6/2023 | 12/7/2023 |
| Final Report | Everyone | 12/2/2023 | 12/8/2023 |
| Contribution | Person |
|---|---|
| Introduction | James |
| Problem Statement | Anirudh |
| Methods | Siddhant |
| Potential Results | Joseph |
| Proposed Timeline | Soongeol |
| Finding Datasets | Everyone |
| Finding Papers | Everyone |
| Contribution 2 | Person |
|---|---|
| Pick data pre-processing method | Anirudh, Siddhant |
| Implement algorithm | Anirudh, Siddhant |
| Quantitative metrics | Joseph |
| Analysis of algorithm | Joseph |
| Visualizations | James |
| Next steps | James |
| Describe data set | Soongeol |
| Revise references, problem motivation and identification | Soongeol |
| Update timeline/contribution table | Soongeol |
| Results | Everyone |