Mridangam Tani-avarthanam dataset

The Mridangam Tani-avarthanam dataset is a transcribed collection of two tani-avarthanams played by the renowned Mridangam maestro Padmavibhushan Umayalpuram K. Sivaraman. The audio was recorded at IIT Madras, India and annotated by professional Carnatic percussionists. If you use the dataset in your work, please cite the following publication:
Jom Kuriakose, Chaitanya Kumar, Padi Sarala, Hema A Murthy, Umayalpuram Sivaraman, "Akshara Transcription of Mrudangam Strokes in Carnatic Music," in Proceedings of the 21st National Conference on Communication, Feb. 2015, Mumbai, India.


In Carnatic music, the tani-avarthanam is the solo performed by the percussion ensemble after the main piece of the concert. The solo stays within the framework of the tala but leaves much room for improvisation on the percussion patterns. The tani aims to showcase the tala through the variety of percussion and rhythmic patterns that can be played in it, as well as the skill and talent of the percussion artists. Each instrument takes solo turns that complement the others, and all instruments come together for a cadential ending. Because it is replete with varied percussion patterns, the tani is very useful for percussion pattern analysis. It is usually performed with a subset of Mridangam, Khanjira, Ghatam, Morsing and vocal percussion (called Konnakol). The Mridangam is always present, while the other instruments are optional.


This dataset can be downloaded here.

The Dataset

Percussion in Carnatic music is organized and transmitted orally using onomatopoeic syllables that represent the different strokes of the Mridangam. This syllabic representation of the tani and its patterns is musically meaningful for analysis, and the dataset uses it. The dataset consists of two tani-avarthanams played on a Mridangam tuned to tonic C#, one in vilambita Adi tala (a cycle of 16 beats) and the other in Rupaka tala (a cycle of 3 beats). Each tani is about 12 minutes long. Each tani has been segmented into short phrases, and each phrase has been transcribed into its constituent strokes, represented as syllables. The transcriptions also include pauses (denoted by , ) and changes in speed (denoted by { and } ). The combined duration of the two tanis is approximately 24 minutes, and together they consist of 8863 strokes.
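The notation described above can be read with a small tokenizer. The following is a minimal sketch, not the authors' tooling. It assumes that stroke syllables are separated by whitespace, and it interprets the nesting depth of { and } as a relative speed level; neither assumption is confirmed by the dataset description. The function names and the syllables in the usage example are hypothetical placeholders, so consult the Strokes List for the actual notation.

```python
import re

# Assumption: a token is a brace, a comma, or a run of characters
# that contains no whitespace, braces, or commas (a stroke syllable).
TOKEN_RE = re.compile(r"[{},]|[^\s{},]+")


def tokenize_phrase(text):
    """Return a list of (kind, value, speed_level) tuples for one phrase.

    kind is "stroke" or "pause". speed_level is the brace nesting depth
    at that token, which this sketch assumes reflects a speed change.
    Braces are not validated, since a speed change might span phrases.
    """
    tokens = []
    level = 0
    for tok in TOKEN_RE.findall(text):
        if tok == "{":
            level += 1
        elif tok == "}":
            level -= 1
        elif tok == ",":
            tokens.append(("pause", tok, level))
        else:
            tokens.append(("stroke", tok, level))
    return tokens


def count_strokes(tokens):
    """Count the stroke tokens, ignoring pauses."""
    return sum(1 for kind, _, _ in tokens if kind == "stroke")
```

For example, tokenize_phrase("tha , dhin { ki ta } thom") yields five strokes and one pause, with "ki" and "ta" at speed level 1 and all other tokens at level 0.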


Both tanis were recorded in studio-like conditions using a Zoom H4n recorder, with an SM 57 for the treble head (right) and an SM 58 for the bass head (left) of the Mridangam. The audio files are mono, sampled at 44.1 kHz, and stored in 16-bit .wav format.
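After downloading, it can be useful to confirm that a file matches the stated audio format before processing it. The sketch below uses only the wave module from the Python standard library. The function names and the dictionary keys are illustrative choices, not part of the dataset.

```python
import wave


def describe_wav(path):
    """Read basic properties from the header of a PCM .wav file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": sample_rate,
            "bit_depth": 8 * wav.getsampwidth(),  # sample width is in bytes
            "duration_s": wav.getnframes() / sample_rate,
        }


def matches_stated_format(info):
    """Check for mono, 44.1 kHz, 16-bit audio as stated in the description."""
    return (
        info["channels"] == 1
        and info["sample_rate_hz"] == 44100
        and info["bit_depth"] == 16
    )
```

Calling matches_stated_format(describe_wav(path)) on each file flags any audio that was resampled or converted after download.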


The audio of each tani has been segmented into short, musically relevant phrases by professional musicians. The syllabic transcription of each phrase was done by professional Carnatic percussionists. The transcription is not time-aligned; it gives only the sequence of strokes played in the phrase. The entire set of strokes and the notation used for them in the transcription files can be seen here: Strokes List

The list also specifies the number of occurrences of each stroke in the tanis.

A few example phrases with their transcription can be seen below. 

Example phrases

Tani 1 Phrase-1 (wav, label)

Tani 1 Phrase-2 (wav, label)

Tani 2 Phrase-1 (wav, label)

Tani 2 Phrase-2 (wav, label)

Dataset Organization

The dataset consists of pairs of files: an audio .wav file and its corresponding transcription as a .txt file. The two tanis are stored separately, and the full tani is also provided as a single file, without any annotations.
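The pairing described above can be collected with a short script. This is a sketch under one assumption that the description does not confirm: each phrase's .wav and .txt files share the same base name and sit in the same directory. The function name pair_phrases is hypothetical.

```python
from pathlib import Path


def pair_phrases(root):
    """Pair each .wav under root with a same-named .txt transcription.

    Returns (pairs, unpaired): pairs is a list of (wav_path, txt_path)
    tuples, and unpaired lists .wav files with no matching .txt, which
    is where an unannotated full-tani recording would end up.
    """
    pairs, unpaired = [], []
    for wav_path in sorted(Path(root).rglob("*.wav")):
        txt_path = wav_path.with_suffix(".txt")
        if txt_path.exists():
            pairs.append((wav_path, txt_path))
        else:
            unpaired.append(wav_path)
    return pairs, unpaired
```

If the actual archive uses a different naming scheme, only the line that derives txt_path needs to change.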

Possible uses of the dataset

The dataset can be used for several MIR tasks such as onset detection, percussion transcription, rhythm and percussion pattern analysis, and Mridangam stroke modeling. 


The dataset (audio+annotations) is available for research purposes.


Manoj Kumar

Prof. Hema Murthy

DON Lab, Dept. of CSE, 

Indian Institute of Technology Madras

Chennai, India.