CompMusic Seminar

On Friday, November 18th 2016, from 10:00 to 18:30, in room 55.410 of the Communication Campus of the Universitat Pompeu Fabra in Barcelona, we will hold a CompMusic seminar. This seminar accompanies the PhD thesis defenses of Ajay Srinivasamurthy and Sankalp Gulati.

10:00 Simon Dixon (QMUL, London) [video]

"Music Similarity and Cover Song Identification: The Case of Jazz"

Similarity in music is an elusive and subjective concept, yet computational models of similarity are cited as important for addressing tasks such as music recommendation and the management of music collections. Cover song (or version) identification deals with a specific case of music similarity, where the underlying musical work is the same, but its realisation is different in each version, usually involving different performers and differing arrangements of the music, which may vary in instrumentation, form, tempo, key, lyrics or in other aspects of rhythm, melody, harmony and timbre. The new version retains some features of the original recording, and it is usually assumed that the sequential pitch content (corresponding to melody and harmony) is preserved with limited alterations from the original version.

In music information retrieval, a standard approach to version identification uses predominant melody extraction to represent melodic content and chroma features to represent harmonic content. These features are adapted to allow for variation in key or tempo between versions, and a pairwise sequence matching algorithm computes the pairwise similarity between tracks, which can be used to estimate groups of cover songs. Different versions of a jazz standard can be regarded as a set of cover songs, but the identification of such covers is more complicated than for many other styles of music, due to the improvisatory nature of jazz, which allows ornamentation and transformation of the melody as well as substitution of chords in the harmony. We report on experiments on a set of 300 jazz standards using discrete-valued and continuous-valued measures of pairwise predictability between sequences, based on work with a former PhD student, Peter Foster.
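The generic pipeline described above (chroma features, adaptation to key and tempo, pairwise sequence matching) can be illustrated with a minimal sketch. This is not the method evaluated in the talk: the function names `dtw_cost` and `version_distance` are hypothetical, dynamic time warping stands in for the sequence matching step, and the chroma input is assumed to be a precomputed array of shape (frames, 12).

```python
import numpy as np

def dtw_cost(a, b):
    """Length-normalised DTW cost between two chroma sequences.

    a, b: arrays of shape (frames, 12). Frames are compared with cosine
    distance; the warping path absorbs tempo differences.
    """
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-9)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-9)
    dist = 1.0 - a @ b.T  # pairwise cosine distance between frames
    n, m = dist.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m] / (n + m)

def version_distance(a, b):
    """Smallest DTW cost over all 12 circular shifts of b (key invariance).

    Returns (cost, shift), where shift is the number of semitones by which
    b was rolled to best match a.
    """
    costs = [dtw_cost(a, np.roll(b, k, axis=1)) for k in range(12)]
    best = int(np.argmin(costs))
    return costs[best], best

# Toy example (assumed data): a short pitch-class sequence as one-hot chroma,
# and a "version" transposed up 3 semitones and played at half tempo.
pitch_classes = [0, 4, 7, 5, 9, 0, 2, 7]
original = np.eye(12)[pitch_classes]
version = np.repeat(np.roll(original, 3, axis=1), 2, axis=0)

cost, shift = version_distance(original, version)
```

Computing `version_distance` for every pair of tracks gives a distance matrix that a clustering step could turn into groups of candidate versions. For jazz standards, where melody and harmony are freely transformed, such a simple alignment is likely to be less reliable, which is the difficulty the abstract points to.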

11:00 Geoffroy Peeters (IRCAM, Paris) [video]

"Recent researches at IRCAM related to the recognition of rhythm, vocal imitations and music structure"

In this talk, I will present some recent research at IRCAM related to the description of rhythm (especially the use of the Fourier-Mellin transform or of the Modulation Scale Transform with auditory statistics), the recognition of vocal imitations (using HMM decoding of SI-PLCA kernels over time), and the estimation of musical structure (using Convolutional Neural Networks).

12:00 Coffee break

12:30 Andre Holzapfel (KTH, Stockholm) [video]

"Tracking time: State-of-the-art and open problems in meter inference"

In recent years, significant progress has been made in algorithmic approaches that aim at the recognition of metrical cycles and the tracking of their structure in music audio signals. Automatic adaptation to rhythmic patterns has made it possible to go beyond manually tailored tracking approaches, and deep-learning-based features increase the accuracy of the inference given an unknown audio signal. In principle, arbitrary time signatures can be recognized and tracked from a music recording, assuming the existence of a large enough representative dataset to learn from. In this talk, a short summary of the state of the art will be provided, and open problems will be presented that represent potential subjects of future studies. These open problems comprise the tracking of metrical cycles of very long duration, the inclusion of modes beyond the acoustic signal, and a variety of subjects that arise within areas like performance studies, music theory, and ethnomusicology.

13:30 Lunch break

15:00 Barış Bozkurt (Koç University, Istanbul) [video]

"Melodic analysis for Turkish makam music"

A makam generally implies a miscellany of rules for melodic composition, a design for melodic contour as a sequence of melodies (from specific categories) emphasising specific tones. This talk will start by presenting melody concepts in Turkish makam music and then continue discussing the methods, uses and automatisation of melodic analysis for that music tradition. A study on culture-specific automatic melodic segmentation (of scores) will be presented. Finally we will discuss future perspectives for melodic analysis within the context of corpus-based study of makams.

16:00 Juan Pablo Bello (NYU, New York) [video]

"Some Thoughts on the How, What and Why of Music Informatics Research"

The framework of music informatics research (MIR) can be thought of as a closed loop of data collection, algorithmic development and benchmarking. Much of what we do is heavily focused on the algorithmic aspects, or how to optimally combine various techniques from, e.g., signal processing, data mining, and machine learning, to solve a variety of problems, from auto-tagging to automatic transcription, that captivate the interest of our community. We are very good at this, and in this talk I will describe some of the know-how that we have collectively accumulated over the years. On the other hand, I would argue that we are less proficient at clearly defining the “what” and “why” behind our work, that data collection and benchmarking have received far less attention and are often treated as afterthoughts, and that we sometimes tend to rely on widespread and limiting assumptions about music that affect the validity and usability of our research. On this, we can learn from other fields, such as music cognition, particularly with regard to the adoption of methods and practices that fully embrace the complexity and variability of human responses to music, while still clearly delineating the scope of the solutions or analyses being proposed.

17:00 Coffee break

17:30 Joan Serrà (Telefónica R+D, Barcelona) [video]

"Facts and myths about deep learning"

Deep learning has revolutionized the traditional machine learning pipeline, with impressive results in domains such as computer vision, speech analysis, or natural language processing. The concept has gone beyond research/application environments, and permeated the mass media, news blogs, job offers, startup investors, and big company executives' meetings. But what is behind deep learning? Why has it become so mainstream? What can we expect from it? In this talk, I will highlight a number of facts and myths that will provide a shallow answer to the previous questions. While doing that, I will also highlight a number of applications we have worked on at our lab. Overall, the talk aims to lay out a series of basic concepts while providing grounds for reflection or discussion on the topic.