Beijing Opera Percussion Pattern Dataset

The Beijing Opera Percussion Pattern (BOPP) dataset is a collection of audio examples of percussion patterns played by the percussion ensemble in Beijing Opera (Jingju, 京剧). The percussion ensemble in Jingju plays a set of pre-defined and labeled percussion patterns, which serve many functions. The percussion patterns can be defined as sequences of strokes played by different combinations of the percussion instruments, and the resulting variety of timbres is transmitted orally, using syllables as mnemonics. More information on the percussion instruments used in Beijing Opera can be found at

The dataset presented here was used as the training dataset in the following paper [1], which also gives a detailed description of percussion patterns in Jingju. If you use the dataset in your work, please cite this publication.

[1] Ajay Srinivasamurthy, Rafael Caro Repetto, Harshavardhan Sundar, Xavier Serra, "Transcription and Recognition of Syllable based Percussion Patterns: The Case of Beijing Opera," in Proceedings of the 15th International Society for Music Information Retrieval (ISMIR) Conference, Taipei, Taiwan, Oct 2014.


This dataset can be downloaded here.


The dataset is a collection of 133 audio percussion patterns spanning five different pattern classes as described below. The scores for the patterns and additional details about the patterns are at:

Table 1: Beijing Opera Percussion Pattern Dataset

Pattern Class                 PatternID (pID)  Instances (N)
daobantou 【导板头】          10               66
man changchui 【慢长锤】      11               33
duotou 【夺头】               12               19
xiaoluo duotou 【小锣夺头】   13               11
shanchui 【闪锤】             14               8
Total                                          133

Audio Content

The audio files are short segments, each containing one of the patterns listed above. The audio is stereo, sampled at 44.1 kHz, and stored as wav files. The segments were taken from the introductory parts of arias, and the aria recordings come from commercially available releases spanning various artists. A musicologist chose the audio and the segments carefully to be representative of the percussion patterns that occur in Jingju. The segments are diverse in instrument timbre (although the same set of instruments is played, the individual instruments vary slightly across ensembles), in recording quality, and in recording period. Although the segments come from aria introductions, where only the percussion ensemble plays, in some examples the melodic accompaniment starts before the percussion pattern ends.
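The stated audio format (stereo, 44.1 kHz, wav) can be checked with a short script. The following is a minimal sketch using Python's standard wave module, which assumes the files are uncompressed PCM wav; the function names describe_wav and matches_dataset_format are hypothetical helpers introduced here for illustration and are not part of the dataset.

```python
import wave


def describe_wav(path):
    """Return (channels, sample_rate_hz, duration_seconds) for a PCM wav file."""
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        rate = wf.getframerate()
        frames = wf.getnframes()
    # Duration is the number of frames divided by the frame rate.
    return channels, rate, frames / rate


def matches_dataset_format(path):
    """Check the format described above: stereo audio sampled at 44.1 kHz."""
    channels, rate, _ = describe_wav(path)
    return channels == 2 and rate == 44100
```

A call such as matches_dataset_format("12005.wav") would then report whether a given segment has the expected channel count and sampling rate.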


Each audio pattern has an associated syllable-level transcription. The transcription is obtained from the score for the pattern and is not time-aligned to the audio. It uses the reduced set of five syllables described in Table 1 of [1], which is sufficient to computationally model the timbres of all the syllables. The annotations are stored as Hidden Markov Model Toolkit (HTK) label files. A single master label file is also provided for batch processing using HTK.
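For readers who want to load the transcriptions without HTK, the sketch below parses label text in plain Python. It relies on the general HTK conventions: each label line has the form [start [end]] name [score], and a master label file starts with the header #!MLF!#, followed by entries that each consist of a quoted file pattern, the label lines, and a terminating line containing a single period. It is an assumption, not something stated above, that the dataset's files use one syllable per line with no time stamps; the parser skips leading integer time stamps so that either layout works. The function names are hypothetical.

```python
def parse_htk_label(text):
    """Return the label names from HTK label text, ignoring any time stamps."""
    labels = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Skip up to two leading integer time stamps (start and end),
        # but always keep at least one field as the label name.
        i = 0
        while i < 2 and i < len(fields) - 1 and fields[i].lstrip("-").isdigit():
            i += 1
        labels.append(fields[i])
    return labels


def parse_mlf(text):
    """Parse HTK master label file text into {file_pattern: [label names]}."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "#!MLF!#":
        raise ValueError("missing #!MLF!# header")
    entries = {}
    name = None  # file pattern of the entry currently being read
    body = []    # label lines collected for that entry
    for line in lines[1:]:
        stripped = line.strip()
        if not stripped:
            continue
        if name is None:
            # First non-empty line of an entry is the quoted file pattern.
            name = stripped.strip('"')
            body = []
        elif stripped == ".":
            # A single period closes the entry.
            entries[name] = parse_htk_label("\n".join(body))
            name = None
        else:
            body.append(stripped)
    return entries
```

Reading the text of masterLabels.lab and passing it to parse_mlf would then give one syllable sequence per pattern instance, keyed by the file pattern stored in the master label file.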

Dataset organization

The dataset has wav files and label files. Each file name is formed by concatenating the pID, the instID, and the extension.

The pID is as in Table 1, instID is a three-digit, zero-padded identifier for the specific instance of the pattern, and the extension is .wav for the audio file or .lab for the label file: pID ∈ {10, 11, 12, 13, 14} and instID ∈ {1, 2, ..., N_pID}, where N_pID is the number of instances of pattern class pID. For example, the audio file and the label file for the fifth instance of the pattern duotou are named 12005.wav and 12005.lab, respectively. The master label file is called masterLabels.lab.
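The naming scheme can be expressed as a pair of small helpers, which may be convenient when pairing audio files with their label files. This is a minimal sketch: the names PATTERN_CLASSES, make_filename, and parse_filename are hypothetical, and the code checks only that the pID is one of the five classes in Table 1 and that the instID fits in three digits, not the per-class instance count.

```python
import re

# pID to pattern class, as listed in Table 1.
PATTERN_CLASSES = {
    10: "daobantou",
    11: "man changchui",
    12: "duotou",
    13: "xiaoluo duotou",
    14: "shanchui",
}

# Two-digit pID, three-digit instID, then .wav or .lab.
_NAME_RE = re.compile(r"^(\d{2})(\d{3})\.(wav|lab)$")


def make_filename(pid, inst_id, extension):
    """Build a file name such as 12005.wav from its parts."""
    if pid not in PATTERN_CLASSES:
        raise ValueError(f"unknown pID: {pid}")
    if not 1 <= inst_id <= 999:
        raise ValueError(f"instID out of range: {inst_id}")
    if extension not in (".wav", ".lab"):
        raise ValueError(f"unexpected extension: {extension}")
    return f"{pid}{inst_id:03d}{extension}"


def parse_filename(name):
    """Split a file name into (pID, instID, extension)."""
    match = _NAME_RE.match(name)
    if match is None or int(match.group(1)) not in PATTERN_CLASSES:
        raise ValueError(f"not a dataset file name: {name}")
    return int(match.group(1)), int(match.group(2)), "." + match.group(3)
```

For the example in the text, make_filename(12, 5, ".wav") gives 12005.wav, and parse_filename("12005.lab") gives (12, 5, ".lab"), whose pID maps to duotou.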

Availability of the Dataset

The annotations are publicly shared and available to all. The audio is from commercially available releases. It cannot be publicly shared but can be made available on request for non-commercial research purposes.


If you have any questions or comments about the dataset, please feel free to write to us.

Ajay Srinivasamurthy

Rafael Caro Repetto