An Evaluation of Methodologies For Melodic Similarity in Audio Recordings of Indian Art Music

This is a companion page for:

Article

Gulati, S., Serrà, J., & Serra, X. (2015). An evaluation of methodologies for melodic similarity in audio recordings of Indian art music. In Proceedings of the 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 678–682. Brisbane, Australia.

[Postprint manuscript (PDF@MTG)] [BibTeX] [Poster]

Abstract

We perform a comparative evaluation of methodologies for computing similarity between short-time melodic fragments of audio recordings of Indian art music. We experiment with 560 different combinations of procedures and parameter values. These include the choices made for the sampling rate of the melody representation, pitch quantization levels, normalization techniques and distance measures. The dataset used for evaluation consists of 157 and 340 annotated melodic fragments of Carnatic and Hindustani music recordings, respectively. Our results indicate that melodic fragment similarity is particularly sensitive to distance measures and normalization techniques. Sampling rates do not have a significant impact for Hindustani music, but can significantly degrade the performance for Carnatic music. Overall, the performed evaluation provides a better understanding of the processing steps and parameter settings for melodic similarity in Indian art music. Importantly, it paves the way for developing unsupervised melodic pattern discovery approaches, whose evaluation is a challenging and, many times, ill-defined task.

Code

The code used for computing melodic similarity can be obtained from here. Additional code for computing melodic similarity in other settings (variable-length queries, fixed-length queries, data stored as subsequences, data stored as a time series, etc.) can be found here.
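For readers who want a feel for the processing steps named in the abstract (sampling rate of the melody representation, pitch quantization, normalization, distance measure), the following is a minimal sketch in plain Python. It is not the authors' code: every function name and default value here is a hypothetical choice made for illustration, and mean normalization, Euclidean distance and an unconstrained dynamic time warping (DTW) distance are assumed stand-ins for the variants evaluated in the paper.

```python
import math


def hz_to_cents(pitch_hz, tonic_hz):
    """Convert a pitch contour in Hz to cents relative to a tonic frequency."""
    return [1200.0 * math.log2(f / tonic_hz) for f in pitch_hz]


def resample(contour, n_samples):
    """Linearly interpolate a contour to n_samples points (n_samples >= 2)."""
    if len(contour) == 1:
        return [contour[0]] * n_samples
    scale = (len(contour) - 1) / (n_samples - 1)
    out = []
    for k in range(n_samples):
        pos = k * scale
        i = min(int(pos), len(contour) - 2)
        frac = pos - i
        out.append(contour[i] * (1.0 - frac) + contour[i + 1] * frac)
    return out


def quantize(cents, levels_per_semitone):
    """Round each value to the nearest of levels_per_semitone steps per 100 cents."""
    step = 100.0 / levels_per_semitone
    return [round(c / step) * step for c in cents]


def mean_normalize(contour):
    """Subtract the mean so that transposed fragments become comparable."""
    mean = sum(contour) / len(contour)
    return [c - mean for c in contour]


def euclidean(a, b):
    """Euclidean distance between two equal-length contours."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Unconstrained DTW distance with squared-difference local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def fragment_distance(query_hz, target_hz, tonic_hz, n_samples=50,
                      levels_per_semitone=None, normalize=True, measure="dtw"):
    """Chain the steps: cents -> resample -> (quantize) -> (normalize) -> distance."""
    contours = []
    for pitch_hz in (query_hz, target_hz):
        c = resample(hz_to_cents(pitch_hz, tonic_hz), n_samples)
        if levels_per_semitone is not None:
            c = quantize(c, levels_per_semitone)
        if normalize:
            c = mean_normalize(c)
        contours.append(c)
    distance = dtw if measure == "dtw" else euclidean
    return distance(contours[0], contours[1])
```

With this sketch, a fragment and its octave-transposed copy have a distance close to zero when `normalize=True` and a large distance when `normalize=False`, which illustrates in a toy setting why the choice of normalization can matter as much as the choice of distance measure.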

Dataset

The dataset used in this study can be found here.

Results

Detailed results for all combinations of the parameters and procedures are summarized in the CSV files below.

Case 1: When ground-truth segmentation is used:

Carnatic Music Dataset

Hindustani Music Dataset

 

Case 2: When the target pattern candidates are of the same length as the query pattern:

Carnatic Music Dataset

Hindustani Music Dataset