Hindustani Music Rhythm Dataset

The CompMusic Hindustani Rhythm Dataset is a rhythm-annotated test corpus for automatic rhythm analysis tasks in Hindustani music. The collection consists of audio excerpts from the CompMusic Hindustani research corpus, manually annotated, time-aligned markers indicating the progression through the taal cycle, and the associated taal-related metadata. A brief description of the dataset is provided below.
Please cite the following publication if you use the dataset in your work:
Ajay Srinivasamurthy, Andre Holzapfel, Ali Taylan Cemgil, Xavier Serra, "A generalized Bayesian model for tracking long metrical cycles in acoustic music signals", in Proc. of the 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2016), Shanghai, China, March 2016 (to appear).

The Dataset

Description of the Dataset
Table 1: Structure of the dataset. HMDl, HMDs and HMDf give the number of excerpts in each subset; total duration is in minutes.

Taal        Matras in cycle  # Vibhaags  Vibhaag structure  HMDl  HMDs  HMDf  Total duration (min)  # Annotated matras  # Annotated sams
Teentaal    16               4           4,4,4,4            13    41    54    108                   17142               1081
Ektaal      12               6           2,2,2,2,2,2        32    26    58    116                   12999               1087
Jhaptaal    10               4           2,3,2,3            6     13    19    38                    3029                302
Rupak taal  7                3           3,2,2              8     12    20    40                    2841                406
Total       -                -           -                  59    92    151   302                   36011               2876
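
The taal structures and excerpt counts in the table can be written down as a small lookup, which makes the internal consistency easy to check: the vibhaag lengths of each taal sum to its matras per cycle, and the long and short cycle subsets add up to the full dataset. The sketch below is only an illustration; the dictionary layout and the names TAALS, vibhaag_starts and excerpt_count are assumptions made here and are not part of the dataset files.

```python
# Values copied from the table above; the layout of this dictionary is an
# illustrative assumption, not the format of any file shipped with the dataset.
TAALS = {
    "Teentaal":   {"matras": 16, "vibhaags": [4, 4, 4, 4],       "hmdl": 13, "hmds": 41},
    "Ektaal":     {"matras": 12, "vibhaags": [2, 2, 2, 2, 2, 2], "hmdl": 32, "hmds": 26},
    "Jhaptaal":   {"matras": 10, "vibhaags": [2, 3, 2, 3],       "hmdl": 6,  "hmds": 13},
    "Rupak taal": {"matras": 7,  "vibhaags": [3, 2, 2],          "hmdl": 8,  "hmds": 12},
}

EXCERPT_MINUTES = 2  # each excerpt is two minutes long


def vibhaag_starts(vibhaags):
    """Return the 1-based matra number at which each vibhaag begins."""
    starts = []
    position = 1
    for length in vibhaags:
        starts.append(position)
        position += length
    return starts


def excerpt_count(taal):
    """Number of excerpts in the full dataset (HMDf) for the given taal."""
    info = TAALS[taal]
    return info["hmdl"] + info["hmds"]


if __name__ == "__main__":
    for name, info in TAALS.items():
        # The vibhaag lengths must add up to the cycle length in matras.
        assert sum(info["vibhaags"]) == info["matras"]
        minutes = excerpt_count(name) * EXCERPT_MINUTES
        print(name, vibhaag_starts(info["vibhaags"]), minutes, "min")
```

For Jhaptaal, for example, the vibhaags begin at matras 1, 3, 6 and 8.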

Audio music content 

The pieces are chosen from the CompMusic Hindustani music collection. They are in four popular taals of Hindustani music (Table 1), which encompass a majority of Hindustani khyal music. The pieces include a mix of vocal and instrumental recordings and of new and old recordings, and they span three lays (tempo classes). For each taal, there are pieces in drut (fast), madhya (medium) and vilambit (slow) lay. All pieces have tabla as the percussion accompaniment. The excerpts are two minutes long. Each piece is uniquely identified using the MBID of the recording. The pieces are stereo, 160 kbps mp3 files sampled at 44.1 kHz. The audio is also available as wav files for experiments.


There are several annotations that accompany each excerpt in the dataset. 
Sam, vibhaag and matras: The primary annotations are audio-synchronized time-stamps indicating the different metrical positions in the taal cycle. The sam and the matras of the cycle are annotated. The annotations were created using Sonic Visualiser by tapping to the music and manually correcting the taps. Each annotation has a time-stamp and an associated numeric label that indicates the position of the beat marker in the taal cycle. The annotations and the associated metadata have been verified for correctness and completeness by a professional Hindustani musician and musicologist. The structure of the taals in the dataset is shown in Figure 1. The long thick lines show vibhaag boundaries. The numerals indicate the matra number in the cycle. In each case, the sam (the start of the cycle, analogous to the downbeat) is indicated using the numeral 1.
Figure 1: Structure of the taals in the dataset. Panels: 1. Teentaal; 2. Ektaal (vilambit lay); 3. Jhaptaal; 4. Rupak taal.
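
Since each annotation pairs a time-stamp with a numeric matra label, and the sam carries the label 1, cycle boundaries can be recovered by filtering on that label. The following is a minimal sketch under a loud assumption: the on-disk annotation format is not described here, so the code assumes one "time,label" pair per line in comma-separated text. The function names are hypothetical and should be adapted to the actual files.

```python
import csv
import io


def parse_annotations(text):
    """Parse 'time,label' lines into (seconds, matra_number) tuples.

    ASSUMPTION: the comma-separated two-column layout is a guess made for
    illustration; the real annotation files may be laid out differently.
    """
    annotations = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        annotations.append((float(record[0]), int(record[1])))
    return annotations


def sam_times(annotations):
    """Time-stamps of the sam, i.e. the markers labeled 1."""
    return [time for time, label in annotations if label == 1]


def cycle_durations(annotations):
    """Durations in seconds between consecutive sams (one per full cycle)."""
    sams = sam_times(annotations)
    return [later - earlier for earlier, later in zip(sams, sams[1:])]


if __name__ == "__main__":
    # Synthetic example: one Rupak taal cycle (7 matras) at 0.5 s per matra,
    # followed by the sam of the next cycle. Not real dataset values.
    example = "\n".join(
        f"{index * 0.5},{index % 7 + 1}" for index in range(8)
    )
    parsed = parse_annotations(example)
    print(sam_times(parsed))
    print(cycle_durations(parsed))
```

Only complete cycles, those bounded by two annotated sams, contribute a duration; matras before the first sam or after the last one in an excerpt are ignored by cycle_durations.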

Taal related metadata: For each excerpt, the taal and the lay of the piece are recorded. Each excerpt can be uniquely identified and located with the MBID of the recording, and the relative start and end times of the excerpt within the whole recording. A separate five-digit, taal-based unique ID is also provided for each excerpt as a double check. The artist, the release, the lead instrument, and the raag of the piece are additional editorial metadata obtained from the release. There are optional comments on audio quality and annotation specifics.

Data subsets

The dataset consists of excerpts with a wide tempo range, from 10 MPM (matras per minute) to 370 MPM. To study any effects of the tempo class, the full dataset (HMDf) is also divided into two subsets: the long cycle subset (HMDl), consisting of vilambit (slow) pieces with a median tempo of 10-60 MPM, and the short cycle subset (HMDs), consisting of pieces in madhya lay (medium, 60-150 MPM) and drut lay (fast, 150+ MPM).
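
The tempo ranges above can be turned into a small helper that estimates the median tempo of an excerpt from its matra time-stamps and maps it to a lay and a subset. This is a rough sketch, not the procedure used to build the dataset: the lay of each excerpt is recorded in the metadata rather than derived this way, the treatment of the boundary values (60 and 150 MPM) is an assumption because the stated ranges overlap at their edges, and all function names are hypothetical.

```python
from statistics import median


def median_tempo_mpm(matra_times):
    """Median tempo in matras per minute from matra time-stamps in seconds."""
    intervals = [later - earlier
                 for earlier, later in zip(matra_times, matra_times[1:])]
    if not intervals:
        raise ValueError("need at least two matra time-stamps")
    return 60.0 / median(intervals)


def lay_of(mpm):
    """Map a tempo in MPM to a lay using the ranges quoted in the text.

    ASSUMPTION: exactly 60 MPM is treated as madhya and exactly 150 MPM as
    drut; the source does not say which class owns the boundary values.
    """
    if mpm < 60:
        return "vilambit"
    if mpm < 150:
        return "madhya"
    return "drut"


def subset_of(mpm):
    """HMDl holds the vilambit pieces; HMDs holds madhya and drut pieces."""
    return "HMDl" if lay_of(mpm) == "vilambit" else "HMDs"


if __name__ == "__main__":
    # Synthetic matra times, half a second apart (not real dataset values).
    times = [0.0, 0.5, 1.0, 1.5, 2.0]
    tempo = median_tempo_mpm(times)
    print(tempo, lay_of(tempo), subset_of(tempo))
```

Using the median inter-matra interval keeps the estimate robust to a few irregular annotations; every excerpt belongs to HMDf regardless of its lay.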

Possible uses of the dataset

Possible tasks for which the dataset can be used include taal, sam and beat tracking, tempo estimation and tracking, taal recognition, rhythm-based segmentation of musical audio, audio to score/lyrics alignment, and rhythmic pattern discovery.

Dataset organization

The dataset consists of audio, annotations, an accompanying spreadsheet providing additional metadata, a MAT-file containing the same information as the spreadsheet, and a dataset description document.

Availability of the Dataset

The annotations are publicly shared and available to all. The audio is from commercially available releases. It cannot be publicly shared but can be made available on request for non-commercial research purposes. In the future, the dataset will be available for viewing and download through an interface in Dunya (http://dunya.compmusic.upf.edu). Please write to us if you wish to use the dataset and need the audio and annotations.


If you have any questions or comments about the dataset, please feel free to write to us. 
Ajay Srinivasamurthy
Music Technology Group
Universitat Pompeu Fabra, 
Barcelona, Spain
Kaustuv Kanti Ganguli
DAP lab, Dept. of Electrical Engineering,
Indian Institute of Technology Bombay
Mumbai, India