Companion webpage for the PhD thesis of Gopala Krishna Koduri

This page is the companion webpage for the PhD thesis titled

Towards a multimodal knowledge base for Indian art music: A case study with melodic intonation

Gopala Krishna Koduri

(Actively updated at the moment. Last updated: 11 Feb 2017.)

Abstract: This thesis is a result of our research efforts in building a multimodal knowledge base for the specific case of Carnatic music. Besides making use of metadata and symbolic notations, we process natural language text and audio data to extract culturally relevant and musically meaningful information and structure it with formal knowledge representations. This process broadly consists of two parts. In the first part, we analyze the audio recordings for intonation description of pitches used in the performances. We conduct a thorough survey and evaluation of the previously proposed pitch distribution based approaches on a common dataset, outlining their merits and limitations. We propose a new data model to describe pitches to overcome the shortcomings identified. This expands the perspective of the note model in vogue to cater to the conceptualization of melodic space in Carnatic music. We put forward three different approaches to retrieve a compact description of pitches used in a given recording employing our data model. We qualitatively evaluate our approaches by comparing the representations of pitches obtained from them with those from a manually labeled dataset, showing that our data model and approaches result in representations that are very similar to the latter. Further, in a raaga classification task on the largest Carnatic music dataset so far, two of our approaches are shown to outperform the state-of-the-art by a statistically significant margin.

In the second part, we develop knowledge representations for various concepts in Carnatic music, with a particular emphasis on the melodic framework. We discuss the limitations of the current semantic web technologies in expressing the order in sequential data, which curtails the application of logical inference. We present our use of rule languages to overcome this limitation to a certain extent. We then use open information extraction systems to retrieve concepts, entities and their relationships from natural language text concerning Carnatic music. We evaluate these systems using the concepts and relations from the knowledge representations we have developed, and ground truth curated using Wikipedia data. Thematic domains like Carnatic music have a limited volume of data available online. Considering that these systems are built for web-scale data where repetitions are taken advantage of, we compare their performances qualitatively and quantitatively, emphasizing characteristics desired for cases such as this. The retrieved concepts and entities are mapped to those in the metadata. In the final step, using the knowledge representations developed, we publish and integrate the information obtained from different modalities into a knowledge base. On this resource, we demonstrate how linking information from different modalities allows us to draw conclusions which otherwise would not have been possible.

Thesis document (PDF)

Slides of presentation (PDF)

PhD defense presentation: https://youtu.be/7jgbZ1zDFDg

The main outcomes from the thesis are: i) Approaches to model and extract a musically meaningful svara intonation description from audio music recordings and ii) Ontologies for Indian art music traditions. These outcomes in their various forms can be accessed from the links below.

As part of the work during the course of this thesis, we have put together the Carnatic Varnam and Kriti datasets and contributed to a few others. All these datasets are publicly available for research purposes. The companion webpages corresponding to the publications list these associated datasets. The prominent ones amongst them are listed below:

The complete list of datasets resulting from the CompMusic project can be accessed here.

Here we list publications related to the work presented in the thesis.

Peer-reviewed journals

Full articles in peer-reviewed conferences

Other contributions to conferences

This thesis has resulted in several reusable modules of code. These include the core contributions of the thesis, which are methods to extract a musically meaningful svara description. Most of the code is written in Python and is versioned on GitHub. Some of it has been released as Python packages, which can be installed using Python packaging tools like pip or easy_install. Most of the rest either has Jupyter notebooks (earlier, IPython notebooks) that show the usage, or a Readme that explains how to reuse the code.

  • Python module for intonation description in audio music files [github, pypi]
  • Intonation description using score-aligned approach [github]
  • Python module for extracting peaks from generic data distributions using different criteria like steepness or data intervals [github, pypi]
  • Vichakshana, a system to quantify the salience of musical characteristics from unstructured text [github]
  • Code for OpenIE system evaluation framework [github]
  • Scripts used for Wikipedia graph analysis (built on this fork) [github]
  • String matching algorithms that work best for matching variations of roman transliteration of words in Indian languages [github]
  • All the experiments conducted during the course of the thesis using codes from several of the above repositories [github]
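
To give a rough idea of what a pitch-distribution based description involves, the following is a minimal sketch in plain Python. It is not the API of the packages listed above: the function names (`hz_to_cents`, `pitch_histogram`, `find_peaks`), the 10-cent bin width, and the simple local-maximum criterion are all assumptions made for illustration. It folds a pitch track into one octave relative to the tonic, builds a histogram, and picks its peaks, which roughly correspond to the pitches used most often in a recording.

```python
import math


def hz_to_cents(freq_hz, tonic_hz):
    """Convert a frequency in Hz to cents above the tonic."""
    return 1200.0 * math.log2(freq_hz / tonic_hz)


def pitch_histogram(pitches_hz, tonic_hz, bin_width=10):
    """Octave-folded histogram of a pitch track.

    Returns a list of counts, one per bin of `bin_width` cents
    (hypothetical default: 10 cents, i.e. 120 bins per octave).
    """
    n_bins = 1200 // bin_width
    counts = [0] * n_bins
    for f in pitches_hz:
        if f <= 0:  # treat non-positive values as unvoiced frames
            continue
        cents = hz_to_cents(f, tonic_hz) % 1200.0
        counts[int(cents // bin_width) % n_bins] += 1
    return counts


def find_peaks(counts, min_count=1):
    """Indices of strict local maxima, treating the histogram as circular.

    This is a deliberately naive criterion; flat-topped peaks are ignored.
    """
    n = len(counts)
    peaks = []
    for i, c in enumerate(counts):
        if c >= min_count and c > counts[i - 1] and c > counts[(i + 1) % n]:
            peaks.append(i)
    return peaks


# Toy pitch track: tonic (220 Hz), a fifth above (330 Hz),
# the upper octave (440 Hz), and one unvoiced frame (0 Hz).
toy_track = [220.0] * 5 + [330.0] * 3 + [440.0] * 2 + [0.0]
hist = pitch_histogram(toy_track, tonic_hz=220.0)
peak_bins = find_peaks(hist)
print(peak_bins)  # [0, 70]: the tonic bin and the bin near 702 cents
```

In this toy example the tonic and its octave fall into the same bin because of octave folding. Real pitch tracks are continuous and noisy, so a practical implementation would typically smooth the histogram and use more robust peak criteria, such as the steepness or interval based ones mentioned in the list above.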

Repositories extensively used during our work:

  • Essentia audio analysis library [UPF]
  • Dunya API [github]

Parts of the code, results, and outcomes of the thesis work are already incorporated in mobile applications that are designed to provide an enhanced listening experience of Indian art music and to serve as tools that aid in learning and teaching this music tradition. The particular applications are given below.

  • Sarāga: A mobile application that provides an enriched listening atmosphere over a collection of Carnatic and Hindustani music.
  • Riyāz: A mobile application that aims to facilitate music learning for students of Indian art music by making their practice sessions more efficient.

Besides these results, certain outcomes of my work are also integrated into Dunya. These features are available both from the Dunya webpage and through the Dunya API.

Ontologies for Indian art music traditions constitute one of the important contributions from the thesis. They include the following:

  • Raaga ontology
  • Carnatic music ontology, which subsumes raaga ontology and the following ontologies:
    • Taala ontology
    • Carnatic Forms ontology
    • Performer ontology

All of these are publicly accessible in the following repositories.

  • Ontologies for Indian art music traditions [github]
  • Multimodal knowledge-base for Indian art music traditions [github]

Apart from the code for the core experiments, there are some other useful resources and tools, which are summarized below.