Music information research

Music Information Research (MIR) is the research field that aims to design and develop methods for the retrieval and mining of musical content. Because both goals are broad, MIR is a highly interdisciplinary field that draws on musicology, psychology, signal processing, information retrieval, machine learning, and human-computer interaction, to name a few.

Greek Audio Dataset

The Greek Audio Dataset (GAD) is a freely available collection of audio features and metadata for a thousand popular Greek tracks. In this work, the creation process of the dataset is described together with its contents. Following the methodology of existing datasets, the GAD does not include the audio itself, owing to intellectual property rights. Instead, it includes features important to MIR that were extracted directly from the audio, together with lyrics and manually annotated genre and mood labels for each track. In addition, each track comes with a link to available audio content on YouTube, to support researchers who need to extract new feature sets not included in the GAD. The selection of extracted features follows the Million Song Dataset, so that researchers do not need new programming interfaces to take advantage of the GAD.

The dataset is available at https://hilab.di.ionio.gr/wp-content/uploads/2018/10/GAD_dataset.zip

When using the dataset, please cite our work: Makris D., Kermanidis K. L., Karydis I. The Greek Audio Dataset. Conference on Artificial Intelligence Applications and Innovations (AIAI 2014): International Workshop on Mining Humanistic Data, MHDW 2014. Island of Rhodes, Greece, September 19-21, 2014. (presentation – bib)
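The internal layout of the archive is not described here, so a reasonable first step after downloading it is simply to see what it contains. The following is a minimal sketch that counts archive members by file extension using only the Python standard library. The function name `summarize_archive` and the file names in the demo are placeholders chosen for illustration, not part of the dataset.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_archive(zip_source):
    """Count the files in a zip archive by (lower-cased) file extension.

    zip_source may be a path to a downloaded archive or a file-like object.
    """
    counts = Counter()
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, count files only
            suffix = PurePosixPath(info.filename).suffix.lower() or "(none)"
            counts[suffix] += 1
    return counts


# Demo on a tiny in-memory archive; the member names are placeholders
# and say nothing about the real contents of the dataset.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("features/track_0001.dat", b"")
    demo.writestr("features/track_0002.dat", b"")
    demo.writestr("README.txt", b"")
buffer.seek(0)
demo_counts = summarize_archive(buffer)
```

For the real dataset, the same function can be pointed at the path of the downloaded zip file.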

Greek Music Dataset

The Greek Music Dataset (GMD) is a significant extension of the “Greek Audio Dataset” that now includes 1400 popular Greek tracks. For each track it features:

  • pre-computed audio, lyrics & symbolic features for immediate use in MIR tasks,
  • manually annotated labels pertaining to mood & genre styles of music,
  • metadata,
  • a manually selected MIDI file for the track (currently available for 500 of the tracks),
  • a manually selected link to a performance / audio content on YouTube for further research; following the methodology of existing datasets, the audio itself is not included due to intellectual property rights.
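
The per-track contents listed above could be held in memory as a simple record. The sketch below is only an illustration: the class name `TrackRecord`, every field name, and the helper `tracks_with_midi` are hypothetical and do not reflect the actual file format or field names of the dataset.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TrackRecord:
    """Hypothetical in-memory view of one track and its annotations."""

    title: str        # metadata (assumed field)
    artist: str       # metadata (assumed field)
    genre: str        # manually annotated genre label
    mood: str         # manually annotated mood label
    youtube_url: str  # manually selected link to a performance
    audio_features: Dict[str, List[float]] = field(default_factory=dict)
    lyrics_features: Dict[str, List[float]] = field(default_factory=dict)
    symbolic_features: Dict[str, List[float]] = field(default_factory=dict)
    midi_path: Optional[str] = None  # MIDI exists for only some tracks


def tracks_with_midi(tracks):
    """Keep only the tracks that come with a MIDI file."""
    return [t for t in tracks if t.midi_path is not None]


# Placeholder records; the values are invented for the example.
example_tracks = [
    TrackRecord("Track A", "Artist A", "genre-1", "mood-1",
                "https://example.org/a", midi_path="midi/track_a.mid"),
    TrackRecord("Track B", "Artist B", "genre-2", "mood-2",
                "https://example.org/b"),
]
midi_subset = tracks_with_midi(example_tracks)
```

Making the MIDI path optional mirrors the fact that MIDI files are available for only part of the collection, so tasks that need symbolic data can filter for it explicitly.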

The dataset is available at https://hilab.di.ionio.gr/wp-content/uploads/2018/10/GMD_dataset.zip

When using the dataset, please cite our work: Makris, D., Karydis, I., Sioutas, S.: “The Greek Music Dataset”, Proceedings Mining Humanistic Data Workshop, 2015. (presentation – bib)

Conditional Rhythm Composition using Deep Learning Architectures

Considering music as a sequence of events with multiple complex dependencies on various levels of a composition, Long Short-Term Memory (LSTM) based architectures have proven very effective at learning and reproducing musical styles. The “rampant force” of these architectures, however, makes them hard to use for tasks that incorporate human input or, more generally, constraints. One such example is the generation of drum rhythms under a given metric structure (potentially combining different time signatures) and with a given instrumentation (e.g. bass and guitar notes).

We present a solution that harnesses the LSTM sequence learner with a Feed-Forward (FF) part, called the “Conditional Layer”. The LSTM and FF layers are merged into a single layer that makes the final decision about the next drum event, given previous events (LSTM layer) and current constraints (FF layer). The resulting architecture is called the Conditional Neural Sequence Learner (CNSL).
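
To make the data flow concrete, here is a minimal, untrained sketch of a forward pass through such a two-branch network. It is not the published implementation. Several choices are assumptions made for illustration: the two branches are merged by concatenation, the merged layer is a softmax over a small vocabulary of drum events, biases are omitted, and all names and sizes (`init_params`, `next_event_probs`, `n_events`, and so on) are hypothetical. It uses plain Python lists so that it runs without dependencies; a real model would be built and trained in a deep-learning framework.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would use learned values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def init_params(n_events, n_cond, n_hidden, n_ff, seed=0):
    rng = random.Random(seed)
    z = n_events + n_hidden  # LSTM gates see [input, previous hidden state]
    return {
        "Wi": rand_matrix(n_hidden, z, rng),   # input gate
        "Wf": rand_matrix(n_hidden, z, rng),   # forget gate
        "Wo": rand_matrix(n_hidden, z, rng),   # output gate
        "Wg": rand_matrix(n_hidden, z, rng),   # candidate cell state
        "Wc": rand_matrix(n_ff, n_cond, rng),  # conditional (FF) layer
        "Wy": rand_matrix(n_events, n_hidden + n_ff, rng),  # merge layer
    }


def lstm_step(p, x, h, c):
    """One LSTM time step (biases omitted for brevity)."""
    z = x + h  # list concatenation of input and previous hidden state
    i = [sigmoid(v) for v in matvec(p["Wi"], z)]
    f = [sigmoid(v) for v in matvec(p["Wf"], z)]
    o = [sigmoid(v) for v in matvec(p["Wo"], z)]
    g = [math.tanh(v) for v in matvec(p["Wg"], z)]
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


def next_event_probs(p, prev_events, condition, n_hidden):
    """Return a probability distribution over the next drum event.

    prev_events: one-hot vectors of past drum events (LSTM branch).
    condition:   constraint vector for the current step (FF branch).
    """
    h = [0.0] * n_hidden
    c = [0.0] * n_hidden
    for x in prev_events:
        h, c = lstm_step(p, x, h, c)
    ff = [math.tanh(v) for v in matvec(p["Wc"], condition)]
    # Merge both branches (here: concatenation) and decide the next event.
    return softmax(matvec(p["Wy"], h + ff))


params = init_params(n_events=4, n_cond=3, n_hidden=5, n_ff=2)
history = [[1, 0, 0, 0], [0, 0, 1, 0]]  # two past events, one-hot encoded
probs = next_event_probs(params, history, [1.0, 0.0, 0.5], n_hidden=5)
```

The point of the two-branch design is visible in `next_event_probs`: with the same event history, changing only the `condition` vector changes the output distribution, which is how current constraints can steer the sequence learner.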

More information can be found in “Conditional Neural Sequence Learners for Generating Drums’ Rhythms” and “DeepDrum: An Adaptive Conditional Neural Network for generating drum rhythms”.

Makris, Dimos, et al. “Combining LSTM and feed forward neural networks for conditional rhythm composition.” International Conference on Engineering Applications of Neural Networks. Springer, Cham, 2017.

Makris, Dimos, et al. “Conditional neural sequence learners for generating drums’ rhythms.” Neural Computing and Applications (2018): 1-12.

Makris, Dimos, Maximos Kaliakatsos-Papakostas, and Katia Lida Kermanidis. “DeepDrum: An Adaptive Conditional Neural Network.” A Joint Workshop program of ICML, IJCAI/ECAI, and AAMAS on Machine Learning for Music, 2018.