Final Report

CompMusic has finished, and our funding agency, the ERC, asked us to write a brief report. Here it is.

Achievements along the main objectives/activities

CompMusic has been a large, long-running project with many achievements, too many to summarize here. For a presentation of all the results, please check the project website. Here we highlight only the main achievements.

The main objectives of the project, as stated in the initial proposal, were:

Promote a multicultural perspective in Music Computing research.
Advance in description and formalization of music, making it more accessible to computational approaches.
Reduce the gap between audio signal descriptions and semantically meaningful music concepts.
Develop information modelling techniques for some non-Western music repertories.
Develop computational models to represent culture specific music contexts.

We made relevant contributions to all of these objectives. The project contributed extensively to the field of Music Information Retrieval and to the musical cultures that it studied. It had a major impact in promoting the topic of cultural/domain specificity, influencing many researchers and institutional initiatives. We compiled and made openly available corpora of five music traditions: Hindustani (North India), Carnatic (South India), Beijing Opera (China), Turkish makam (Turkey), and Arab Andalusian (Maghreb). We also created 24 datasets for specific experiments around these traditions. We produced 150 publications (plus some more still being prepared) with a wide variety of contributions, especially on the extraction of melody and rhythm features from audio recordings and on the semantic analysis of the contextual information of those recordings. We developed new software tools and improved existing ones, and these are now becoming a reference in the field. We edited a special issue of the Journal of New Music Research in 2014, presenting a number of position papers, and a recent journal article (Serra 2017) summarized some of the CompMusic results in the context of Computational Musicology.

The main research efforts in CompMusic were centered on developing computational methodologies to process the audio recordings that are part of the corpora. In this way, we obtained features and models that can be used to study melodic and rhythmic characteristics of the different music repertoires.

All the music we studied was heterophonic, with a prominent melody, and we chose to focus on performances with a lead singing voice, the most common performance practice in the traditions we examined. The necessary first step in studying the melody of a song is measuring the pitch contour (fundamental frequency) of the prominent voice. Given that significant prior work was available for this task, we decided to implement several state-of-the-art algorithms, such as the ones presented by Salamon et al. 2014, within the ESSENTIA software library (Bogdanov et al. 2013). By implementing these algorithms and adapting them slightly to our music (e.g. Atlı et al. 2015), we were able to obtain sufficiently accurate pitch contours for most of the recordings in our corpora. Our main melodic analysis work started from these pitch contours.

In traditions like the Hindustani, Carnatic or Turkish modal systems, no standard reference pitch is used. Because of this, to compare pitch contours of different songs we had to develop methods that could automatically identify the tonic of each recording and normalize the measured pitch values with it (Gulati et al. 2014; Şentürk et al. 2013; Atlı et al. 2015). The resulting pitch contours were still too complex to be used directly in any musicological analysis, so we developed methods to represent the complex pitch behavior and the intonation of the performances, which can be very culture specific (Bozkurt 2011; Koduri et al. 2014; Şentürk et al. 2016).
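As an illustration of the normalization step described above, the following minimal sketch converts a pitch contour in Hz to cents relative to a tonic. It assumes the tonic has already been identified by some other method; the function name `hz_to_cents`, the convention of marking unvoiced frames with non-positive values, and the example values are assumptions made for this sketch, not taken from the cited papers.

```python
import math


def hz_to_cents(pitch_hz, tonic_hz):
    """Express a pitch contour in cents relative to the tonic.

    Frames with a non-positive value are treated as unvoiced (an assumed
    convention for this sketch) and are mapped to None.
    """
    if tonic_hz <= 0:
        raise ValueError("tonic_hz must be positive")
    return [
        1200.0 * math.log2(f / tonic_hz) if f > 0 else None
        for f in pitch_hz
    ]


# Hypothetical contour: tonic, an unvoiced frame, a fifth above, an octave above.
contour_hz = [220.0, 0.0, 330.0, 440.0]
contour_cents = hz_to_cents(contour_hz, tonic_hz=220.0)
```

Once two recordings are expressed in cents relative to their own tonics, their contours can be compared even when the performers chose different reference pitches.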

Every piece of music has a characteristic temporal structure and a series of sections, which need to be identified if we want to process and study each section of a piece separately. For this purpose, we developed audio-segmentation methods, most of them specific to particular styles and structural elements (Sarala and Murthy 2013; Şentürk et al. 2014; Sankaran et al. 2015; Verma et al. 2015; T.P. et al. 2016; Sekhar PV et al. 2016; Gong et al. 2016).

A core goal in the description of melodies and rhythms is the identification of repeating patterns. In the case of melody, we worked on developing similarity measures using pitch contours and, from these, on developing melodic pattern discovery methodologies (Gulati et al. 2015; Ganguli et al. 2016; Gulati 2016, chapter 5). For the analysis of rhythm, we developed methodologies for detecting onsets in audio recordings (Tian et al. 2014; Dzhambazov et al. 2017). From these onsets, we developed automatic analysis methods to derive rhythm similarity measures and rhythm patterns, while establishing a state of the art in automatic meter analysis of particular music traditions (Srinivasamurthy 2016, chapters 3 and 5; Srinivasamurthy et al. 2017).
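To give a concrete idea of what a similarity measure over pitch contours can look like, here is a minimal sketch based on dynamic time warping (DTW). DTW is used here only as a generic, well-known stand-in; the cited works define their own measures, which this sketch does not reproduce. The function name `dtw_distance` and the example contours (in cents) are assumptions.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two pitch contours in cents.

    The local cost is the absolute pitch difference. A smaller result
    means the two contours are more alike after allowing for differences
    in timing.
    """
    if not a or not b:
        raise ValueError("both contours must be non-empty")
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats a frame
                cost[i][j - 1],      # b advances, a repeats a frame
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical phrases: the second holds its middle note for longer.
same_shape = dtw_distance([0.0, 200.0, 400.0], [0.0, 200.0, 200.0, 400.0])
different = dtw_distance([0.0, 200.0, 400.0], [0.0, 300.0, 400.0])
```

In this example the first pair differs only in timing, so its distance is zero, while the second pair differs in pitch and gets a positive distance. Pattern discovery can then be thought of as searching a corpus for contour segments whose mutual distance is small.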

An important piece of complementary information for the study of melodies is the lyrics of the songs. In CompMusic we worked on the automatic alignment of lyrics and audio (Dzhambazov et al. 2014; Dzhambazov et al. 2016; Dzhambazov 2017) and, for the particular case of Beijing opera, on the relation between linguistic tones and the melodic pitch contours (Zhang et al. 2014; Zhang et al. 2015; Caro et al. 2017) and on the relation between the lyrics and the musical expression (Zhang et al. 2017). Related to lyrics are the mnemonic syllables that are used in some traditions to represent the timbres of the percussion strokes. We studied methods with which to identify them automatically from the audio recordings (Srinivasamurthy et al. 2014; Gupta et al. 2015).

Although all the music cultures we studied are based on oral tradition, scores still play an important role in some of them. As with the lyrics, it was an important task for us to develop methodologies for aligning the available scores with the corresponding audio recordings (Şentürk et al. 2014; Holzapfel et al. 2015; Şentürk 2016; Gong et al. 2017). We also did some research on analyzing the scores so that the sections of a song could be identified automatically (Şentürk and Serra 2016).

To complement the analysis of the corpus, we also analyzed contextual data from other sources, mainly text, to obtain complementary information about the musical cultures we studied. The methodologies we developed and used fall within what are called semantic web technologies. For example, in CompMusic we did some text-mining research using online forums for music lovers (Sordo et al. 2012) in order to characterize the behavior within a particular community. A major aim of this semantic analysis was to automatically create ontologies with which to formalize the various musical concepts used in a given musical culture (Koduri 2016, chapter 9).

Cross-disciplinary contributions

The project was clearly interdisciplinary at many levels; the musicological contributions that came out of our work are especially significant.

With the Turkish makam corpus, Holzapfel and Bozkurt 2012 studied the main rhythm structures of this tradition, usuls, and showed that metrical contradiction is systematically applied in some usuls. In a further study, Holzapfel 2013 studied improvisatory performances, taksims, examining how rhythmic idioms are formed and maintained throughout a performance. Bozkurt 2015 studied the concept of seyir (melodic progression), describing and comparing different makams by visualizing the long-term evolution of the melodies within pieces.

Using the jingju corpus, in Zhang et al. 2014, Zhang et al. 2015 and Caro et al. 2017, we studied the relations between linguistic tones and pitch contours, identifying the difficulty accompanying the use of two dialectal tone systems in this music tradition. In Caro et al. 2015 we also used various audio analysis methods to compare two of the best known performing schools of the dan role-type, supporting, in part, the descriptions given of these schools in various musicological studies.

Analyzing the Hindustani and Carnatic corpora, in Serrà et al. 2011 we compared the intonation profiles of the two music traditions. We showed that Carnatic music does not use equal temperament, while the intonation used in Hindustani music is closer to equal temperament. In the article by Ganguli et al. 2016 we studied how Hindustani musicians improvise in accordance with the raga grammar, discovering some patterns that corroborate existing musicological understanding of the “unfolding” of a raga during a performance. In the paper by Gulati et al. 2016 we studied the properties of ragas by identifying the characteristic melodic motives of a number of ragas, as a step towards their automatic identification.
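To make the notion of closeness to equal temperament concrete, the following sketch measures how far a pitch, given in cents above the tonic, lies from the nearest step of the 12-tone equal-tempered grid (multiples of 100 cents). This is an illustration only and is not the method of Serrà et al. 2011; the function name `deviation_from_equal_temperament` and the example interval are assumptions.

```python
import math


def deviation_from_equal_temperament(cents):
    """Signed distance in cents to the nearest equal-tempered step.

    The result lies in [-50, 50): negative values are flat and positive
    values are sharp relative to the closest multiple of 100 cents.
    """
    return (cents + 50.0) % 100.0 - 50.0


# A just-intonation major third (frequency ratio 5/4) is about 386.31 cents,
# so it sits below the equal-tempered major third at 400 cents.
just_third_cents = 1200.0 * math.log2(5.0 / 4.0)
third_deviation = deviation_from_equal_temperament(just_third_cents)
```

Averaging the absolute deviation over the stable notes of many recordings would give one rough indicator of how closely a repertoire follows the equal-tempered grid.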

Knowledge and technology transfer contributions

An important goal of CompMusic was to create corpora and to develop technologies that could be used both by researchers and the general public, while also promoting technology transfer. We succeeded in doing that.

Dunya comprises the music corpora and the related software tools that we developed for the research community. These corpora include audio recordings plus complementary information that describes the recordings. Each corpus has specific characteristics, and the software tools allow users to process the available information in order to study and explore the characteristics of each musical repertoire.

As extensions of CompMusic, we obtained support for two Proof of Concept (PoC) projects, CAMUT and TECSOME, to promote specific technology transfer initiatives.

The goal of CAMUT was to exploit some of the technologies developed within CompMusic by developing prototype products tailored to the cultural, social, and economic context of India and by formulating a business plan around them. The main outcomes of CAMUT were two software applications and a spin-off company. Saraga is a mobile app for Android that allows users to navigate and listen to a collection of Hindustani and Carnatic songs. Riyaz is also an Android app, designed specifically for learning music. The spin-off company, MusicMuni, was created to commercialize Riyaz.

TECSOME is a project, starting in October 2017, that will take advantage of some technologies centered on measuring music similarity. We will build an automatic assessment system, named Music Critic, to support music performance courses and help them scale up to MOOC level.

Dissemination

The dissemination strategy of CompMusic has been based on a clear open science model: sharing our ideas, goals, and results as openly and widely as possible. All our publications have been made available as soon as they were written, all our code is open source, and all the data generated is available under open licenses. We have also organized seminars, workshops, and concerts, which have been recorded and made available from the project website, and in general we have been very active in disseminating our work.