Annotated jingju arias dataset

The Annotated Jingju Arias Dataset is a collection of 34 jingju arias manually segmented at various levels using the software Praat v5.3.53. The selected arias contain samples of the two main shengqiang in jingju, namely xipi and erhuang, and of the five main role types in terms of singing, namely dan, jing, laodan, laosheng and xiaosheng.

Download

This dataset can be downloaded here.

Dataset

The dataset includes a Praat TextGrid file for each aria with the following tiers (all the annotations are in Chinese):

aria: name of the work (one segment for the whole aria)
MBID: MusicBrainz ID of the audio recording (one segment for the whole aria)
artist: name of the singing performer (one segment for the whole aria)
school: related performing school (one segment for the whole aria)
role-type: role type of the singing character (one segment for the whole aria)
shengqiang: boundaries and label of the shengqiang performed in the aria (including accompaniment)
banshi: boundaries and label of the banshi performed in the aria (including accompaniment)
lyrics-lines: boundaries and annotation of each line of lyrics
lyrics-syllables: boundaries and annotation of each syllable
luogu: boundaries and label of each of the performed percussion patterns in the aria
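
As an illustration of how the tiers listed above might be read programmatically, here is a minimal Python sketch. It is not part of the dataset or of Praat. It assumes that the files are saved in Praat's long ("full text") TextGrid format, that all tiers are interval tiers, and that each label fits on a single line. The function names parse_textgrid and load_textgrid are hypothetical, and the fallback from UTF-8 to UTF-16 is a guess about how the files may be encoded.

```python
import re

# A tier starts with a line such as "item [1]:" (the bare "item []:" header is skipped).
ITEM_RE = re.compile(r'^\s*item \[\d+\]:\s*$', re.MULTILINE)
# Tier name, e.g. name = "shengqiang"
NAME_RE = re.compile(r'^\s*name = "(.*)"\s*$', re.MULTILINE)
# One interval: start time, end time and label (assumed to sit on one line).
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([-\d.eE+]+)\s*'
    r'xmax = ([-\d.eE+]+)\s*'
    r'text = "(.*)"\s*$',
    re.MULTILINE,
)


def parse_textgrid(text):
    """Return {tier name: [(start, end, label), ...]} for a long-format TextGrid."""
    tiers = {}
    # Everything before the first "item [n]:" line is the file header.
    for chunk in ITEM_RE.split(text)[1:]:
        name_match = NAME_RE.search(chunk)
        if name_match is None:
            continue
        tiers[name_match.group(1)] = [
            # Praat writes a literal double quote inside a label as "".
            (float(start), float(end), label.replace('""', '"'))
            for start, end, label in INTERVAL_RE.findall(chunk)
        ]
    return tiers


def load_textgrid(path):
    """Read a TextGrid file, trying UTF-8 first and then UTF-16 (an assumption)."""
    try:
        with open(path, encoding="utf-8") as handle:
            return parse_textgrid(handle.read())
    except UnicodeDecodeError:
        with open(path, encoding="utf-16") as handle:
            return parse_textgrid(handle.read())
```

With such a parser, the metadata tiers (aria, MBID, artist, school, role-type) would each yield a single interval spanning the whole aria, while tiers such as lyrics-lines or banshi would yield one tuple per annotated segment.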

The ariasInfo.txt file contains a per-aria summary of the contents of the whole dataset.

A subset of this dataset comprising 20 arias has been used for the study of the relationship between linguistic tones and melody in the following papers:

Shuo Zhang, Rafael Caro Repetto, and Xavier Serra (2014) “Study of the Similarity between Linguistic Tones and Melodic Pitch Contours in Beijing Opera Singing.” In Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, Taiwan, October 27–31, pp. 343–348.
______ (2015) “Predicting Pairwise Pitch Contour Relations Based on Linguistic Tone Information in Beijing Opera Singing.” In Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR 2015), Málaga, Spain, October 26–30, pp. 107–113.

Here is the list of the arias from the dataset used in these papers.

The whole dataset has been used for the automatic analysis of the structure of jingju arias and their automatic segmentation in the following master's thesis: