Tabla Solo dataset

The Tabla Solo dataset is a parallel corpus comprising time-aligned syllabic scores and audio recordings of 38 solo tabla compositions. The audio and scores for these recordings are from the instructional video DVD titled Shades of Tabla by Pt. Arvind Mulgaonkar. If you use the dataset in your work, please cite the following publication:


S. Gupta, A. Srinivasamurthy, M. Kumar, H. A. Murthy, X. Serra. Discovery of Syllabic Percussion Patterns in Tabla Solo Recordings. In Proc. of the 16th International Society for Music Information Retrieval Conference (ISMIR), 2015.


A companion page to the paper is here:




This dataset can be downloaded here.




In Hindustani music, the tabla is the main rhythmic accompaniment (examples of individual strokes of tabla can be obtained from here). To showcase the nuances of the tāl (the rhythmic framework of Hindustani music) as well as the skill of the percussionist, Hindustani music concerts feature a tabla solo. A tabla solo is intricate and elaborate, with a variety of pre-composed forms used as the basis for further elaborations, and specific principles govern these elaborations. Musical forms of tabla such as the thēkā, kāyadā, palaṭā, rēlā, pēśkār and gaṭ are part of the solo performance and have different functional and aesthetic roles in it. A harmonium or sarangi usually plays the role of timekeeper in tabla solo performances.


Percussion in Hindustani music is organized and orally transmitted with the use of onomatopoeic mnemonic syllables (called bōls) that represent the different strokes of the tabla. Further, tabla has different stylistic schools called gharānās. The repertoires of the major gharānās differ in aspects such as the use of specific bōls, the dynamics of strokes, ornamentation and rhythmic phrases. There are also many similarities, because the same forms and the same standard phrases reappear across these repertoires.


The Dataset


The syllabic representation of tabla solos provides a meaningful representation for analysis, and this dataset uses such a representation. The dataset comprises audio recordings, scores and time-aligned syllabic transcriptions for 38 tabla solo compositions of different forms in tīntāl (a metrical cycle of 16 time units). The compositions are from the instructional video DVD Shades Of Tabla by Pandit Arvind Mulgaonkar, who is among the most renowned contemporary tabla maestros. Out of the 120 compositions on the DVD, we chose 38 representative compositions spanning all the gharānās of tabla (Ajrada, Benaras, Dilli, Lucknow, Punjab, Farukhabad). The dataset contains about 17 minutes of audio with over 8200 syllables.



The audio is extracted from the DVD video, and the full audio recording is segmented at the level of compositions. The audio files are mono WAV files, sampled at 44.1 kHz with a bit depth of 16 bits. All recordings have a soft harmonium accompaniment.
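As a quick sanity check after downloading, the stated audio format can be verified with Python's standard `wave` module. This is only a minimal sketch: the function names are illustrative and not part of the dataset, and the path you pass in is up to you.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, bit_depth) for a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        channels = wav_file.getnchannels()
        sample_rate = wav_file.getframerate()
        bit_depth = wav_file.getsampwidth() * 8  # bytes per sample -> bits
    return channels, sample_rate, bit_depth


def matches_dataset_format(path):
    """True if the file is mono, 44.1 kHz, 16-bit, as described above."""
    return describe_wav(path) == (1, 44100, 16)
```

Calling `matches_dataset_format` on each WAV file of the download is a cheap way to catch files that were resampled or converted along the way.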




The booklet accompanying the DVD provides a syllabic transcription for each composition. We used Tesseract, an open source optical character recognition (OCR) engine, to convert the printed scores to a machine-readable format. The scores obtained from OCR were manually verified and corrected for errors, and the vibhāgs (sections) of the tāl were added to the syllabic transcription. A time-aligned syllabic transcription for each score and audio file pair was obtained using a spectral-flux-based onset detector followed by manual correction. The score for each composition has additional metadata describing its gharānā, composer and musical form.
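For readers unfamiliar with spectral flux, the idea can be sketched as follows. This is not the detector used to build the dataset, whose implementation and parameters are not described here; the frame size, hop, fixed threshold and the naive DFT (used only to keep the sketch dependency-free) are all assumptions made for illustration. In practice an FFT library and an adaptive threshold would be more typical.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency DFT bins (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_flux(samples, frame_size=64, hop=32):
    """Half-wave rectified spectral flux, one value per pair of adjacent frames."""
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_size - 1))
        for i in range(frame_size)
    ]  # Hann window
    previous = None
    flux = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + frame_size], window)]
        spectrum = magnitude_spectrum(frame)
        if previous is not None:
            # Keep only increases in magnitude: energy arriving, not decaying.
            flux.append(sum(max(c - p, 0.0) for c, p in zip(spectrum, previous)))
        previous = spectrum
    return flux


def pick_onsets(flux, hop, sample_rate, threshold):
    """Times (in seconds) of local flux maxima above a fixed threshold."""
    onsets = []
    for i in range(1, len(flux) - 1):
        if flux[i] > threshold and flux[i] >= flux[i - 1] and flux[i] > flux[i + 1]:
            # flux[i] compares frame i + 1 with frame i, so time it at frame i + 1.
            onsets.append((i + 1) * hop / sample_rate)
    return onsets
```

The time resolution of such a detector is limited by the hop size, which is one reason automatically detected onsets typically need a manual correction pass like the one described above.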


The scores in the booklet consist of 41 different mnemonic syllables, which are reduced and mapped to 18 syllables based on the timbral similarity between the syllables. The list of syllables along with their mapping can be found here: Syllable Mappings
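Applying such a many-to-one mapping to a score is a simple lookup. The sketch below uses placeholder entries only: the keys and class names are hypothetical and do not come from the dataset, so the real table from the Syllable Mappings file would need to be substituted.

```python
# Hypothetical placeholder entries; replace with the dataset's Syllable Mappings.
SYLLABLE_MAP = {
    "bol_a1": "CLASS_A",
    "bol_a2": "CLASS_A",  # two booklet syllables sharing one timbral class
    "bol_b1": "CLASS_B",
}


def map_syllables(syllables, mapping):
    """Replace each booklet syllable by its reduced class; fail loudly on unknowns."""
    unknown = sorted(set(syllables) - set(mapping))
    if unknown:
        raise KeyError(f"no mapping for: {unknown}")
    return [mapping[s] for s in syllables]


def mapping_sizes(mapping):
    """Return (number of source syllables, number of distinct target syllables)."""
    return len(mapping), len(set(mapping.values()))
```

With the full table loaded, `mapping_sizes` should return `(41, 18)` according to the description above, which makes it a convenient check that the table was read correctly.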


Dataset Organization


The dataset consists of a set of four files for each composition:

  • WAV audio file (*.wav)

  • The syllable scores as retrieved from the booklet with the metadata (*.txt)

  • Time-aligned non-mapped syllabic score with stroke onset times (*.csv)

  • Time-aligned mapped syllabic score with stroke onset times (*.csv)
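A time-aligned score can be read with Python's standard `csv` module. The column layout is not specified on this page, so the sketch below assumes each row holds an onset time in seconds followed by a syllable label; check one of the example files and adjust the column order if it differs.

```python
import csv
import io


def read_onsets(csv_text):
    """Parse rows assumed to be `onset_seconds,syllable` into (float, str) pairs."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        if not record:
            continue  # skip blank lines
        onset, syllable = record[0], record[1]
        rows.append((float(onset), syllable.strip()))
    return rows


def inter_onset_intervals(rows):
    """Time gaps between consecutive strokes, in seconds."""
    times = [t for t, _ in rows]
    return [later - earlier for earlier, later in zip(times, times[1:])]
```

To use it on a file, read the file's text first (for example with `open(path, newline="").read()`) and pass the result to `read_onsets`.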


Example Compositions


Composition 1 : wav, scores, non-mapped-syllable-onsets, mapped-syllable-onsets


Composition 2 : wav, scores, non-mapped-syllable-onsets, mapped-syllable-onsets


Possible Uses of the Dataset


The dataset can be used for a variety of MIR tasks such as onset detection, percussion transcription, rhythm and percussion pattern analysis, and tabla stroke modeling.
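For the onset detection use case, the annotated stroke onset times can serve as a reference against which detected onsets are scored. The sketch below is one common way to do this, not an evaluation protocol defined by the dataset: the 50 ms tolerance and the greedy one-to-one matching are assumptions.

```python
def onset_f_measure(reference, detected, tolerance=0.05):
    """Return (precision, recall, f_measure) for onset times given in seconds.

    Each reference onset may be matched by at most one detection that lies
    within `tolerance` seconds of it (greedy matching over sorted lists).
    """
    reference, detected = sorted(reference), sorted(detected)
    i = j = hits = 0
    while i < len(reference) and j < len(detected):
        gap = detected[j] - reference[i]
        if abs(gap) <= tolerance:
            hits += 1
            i += 1
            j += 1
        elif gap < 0:
            j += 1  # detection is too early for every remaining reference onset
        else:
            i += 1  # reference onset was missed
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(reference) if reference else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

Because tabla strokes can follow each other very quickly, the tolerance window is worth choosing with care: a window wider than the gap between consecutive strokes lets a detection be credited to the wrong stroke.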




The dataset (audio+scores+annotations) is available for research purposes.




Ajay Srinivasamurthy

PhD Student, Music Technology Group

Universitat Pompeu Fabra,

Barcelona, Spain


Swapnil Gupta

Masters Student, Sound and Music Computing

Universitat Pompeu Fabra,

Barcelona, Spain


Xavier Serra

Head, Music Technology Group

Universitat Pompeu Fabra,

Barcelona, Spain