Carnatic Kriti Dataset
Carnatic Kriti Dataset is a collection of recordings curated from the CompMusic collection according to set criteria on the number of recordings and works per raaga. Because the CompMusic collection evolves over time, the dataset is presented here as snapshots taken at different points in time. The changes between snapshots reflect new releases and recordings added to the collection, as well as improvements in data quality. Such improvements include adding missing metadata, correcting mislabelled data, and changing the data schema.
The Dataset
Version 2.0
This snapshot is dated June 2016. It was created with the following criterion: each raaga must have at least 20 performances spanning at least 5 different compositions. Like version 1.0, it does not include RTPs (Raagam-Taanam-Pallavi), but it does include Keertanams and Varnams. This resulted in a dataset of 42 raagas, 667 works and 2324 recordings.
Version 1.0
This snapshot is dated May 2015. It was created with the following criterion: each raaga must have at least 10 performances spanning at least 5 different compositions. Note that it does not include RTPs (Raagam-Taanam-Pallavi) and mainly features Kritis. The other forms in this dataset are Keertanas and Varnams. With the CompMusic collection as it stood at the time, this resulted in a dataset of 45 raagas, 545 works and 934 recordings.
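The selection criterion used for the snapshots can be illustrated with a minimal sketch. It assumes that a "performance" corresponds to one recording and a "composition" to one work, and that each recording is described by a record with hypothetical fields recording_id, work_id and raaga; none of these names, nor the function select_raagas, come from the dataset itself. The thresholds are passed as parameters, so (10, 5) would correspond to version 1.0 and (20, 5) to version 2.0.

```python
from collections import defaultdict


def select_raagas(recordings, min_recordings, min_works):
    """Return the raagas that meet both thresholds, sorted by name.

    `recordings` is an iterable of dicts with the hypothetical keys
    "recording_id", "work_id" and "raaga" (assumed, not the real schema).
    """
    recs_per_raaga = defaultdict(set)
    works_per_raaga = defaultdict(set)
    for rec in recordings:
        recs_per_raaga[rec["raaga"]].add(rec["recording_id"])
        works_per_raaga[rec["raaga"]].add(rec["work_id"])
    return sorted(
        raaga
        for raaga in recs_per_raaga
        if len(recs_per_raaga[raaga]) >= min_recordings
        and len(works_per_raaga[raaga]) >= min_works
    )


# Toy data with small thresholds; the real snapshots use (10, 5) and (20, 5).
toy = [
    {"recording_id": "r1", "work_id": "w1", "raaga": "Kalyani"},
    {"recording_id": "r2", "work_id": "w2", "raaga": "Kalyani"},
    {"recording_id": "r3", "work_id": "w2", "raaga": "Kalyani"},
    {"recording_id": "r4", "work_id": "w3", "raaga": "Todi"},
    {"recording_id": "r5", "work_id": "w3", "raaga": "Todi"},
    {"recording_id": "r6", "work_id": "w3", "raaga": "Todi"},
]
selected = select_raagas(toy, min_recordings=3, min_works=2)
```

In the toy data both raagas have three recordings, but only Kalyani spans two different works, so it is the only one retained.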
Notations
We looked up the notation for the works (in both version 1.0 and version 2.0) in different sources (published books and online resources such as personal blogs and forums). Those for which notation was available were manually converted to a machine-readable format (YAML). Each file is essentially a dictionary with the section names of the work/composition as keys. Each section is represented as a list of cycles. Each cycle in turn has a list of divisions.
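The nesting described above (sections, then cycles, then divisions) can be sketched as a plain Python structure, which is roughly what a YAML parser would return for such a file. This is only an illustration under assumptions: the section names, the content of each division (shown here as a list of placeholder symbols) and the helper functions summarize and flatten_section are hypothetical and not taken from the dataset files.

```python
# Hypothetical notation for one work, shaped like a parsed YAML file:
# section name -> list of cycles -> list of divisions.
# Section names and symbols are placeholders, not taken from the dataset.
notation = {
    "pallavi": [
        [["s", "r"], ["g", "m"], ["p", "d"], ["n", "S"]],  # cycle 1: 4 divisions
        [["S", "n"], ["d", "p"], ["m", "g"], ["r", "s"]],  # cycle 2: 4 divisions
    ],
    "anupallavi": [
        [["p", "m"], ["g", "r"], ["s", "r"], ["g", "m"]],  # cycle 1: 4 divisions
    ],
}


def summarize(notation):
    """Map each section name to (number of cycles, total number of divisions)."""
    return {
        section: (len(cycles), sum(len(cycle) for cycle in cycles))
        for section, cycles in notation.items()
    }


def flatten_section(notation, section):
    """Return every symbol of one section, cycle by cycle, division by division."""
    return [
        symbol
        for cycle in notation[section]
        for division in cycle
        for symbol in division
    ]


summary = summarize(notation)
anupallavi_symbols = flatten_section(notation, "anupallavi")
```

Walking the structure this way keeps the cycle and division boundaries available for analysis, while flatten_section gives a simple symbol sequence when those boundaries are not needed.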