15 Search Results for "Müller, Meinard"


Volume

Dagstuhl Follow-Ups, Volume 3

Multimodal Music Processing

Editors: Meinard Müller, Masataka Goto, and Markus Schedl

Document
Deep Learning and Knowledge Integration for Music Audio Analysis (Dagstuhl Seminar 22082)

Authors: Meinard Müller, Rachel Bittner, Juhan Nam, Michael Krause, and Yigitcan Özer

Published in: Dagstuhl Reports, Volume 12, Issue 2 (2022)


Abstract
Given the increasing amount of digital music, the development of computational tools that allow users to find, organize, analyze, and interact with music has become central to the research field known as Music Information Retrieval (MIR). As in general multimedia processing, many of the recent advances in MIR have been driven by techniques based on deep learning (DL). There is a growing trend to relax problem-specific modeling constraints in MIR systems and instead apply relatively generic DL-based approaches that rely on large quantities of data. In the Dagstuhl Seminar 22082, we critically examined this trend, discussing the strengths and weaknesses of these approaches using music as a challenging application domain. To give the seminar cohesion, we mainly focused on music analysis tasks applied to audio representations (rather than symbolic music representations). In this context, we systematically explored how musical knowledge can be integrated into or relaxed from computational pipelines. We then discussed how this choice could affect the explainability of models or their vulnerability to data biases and confounding factors. Furthermore, besides explainability and generalization, we also addressed efficiency as well as ethical and educational aspects of both traditional model-based and recent data-driven methods. In this report, we give an overview of the various contributions and results of the seminar. We start with an executive summary describing the main topics, goals, and group activities. Then, we give an overview of the participants' stimulus talks and subsequent discussions (listed alphabetically by the main contributor's last name) and summarize further activities, including group discussions and music sessions.

Cite as

Meinard Müller, Rachel Bittner, Juhan Nam, Michael Krause, and Yigitcan Özer. Deep Learning and Knowledge Integration for Music Audio Analysis (Dagstuhl Seminar 22082). In Dagstuhl Reports, Volume 12, Issue 2, pp. 103-133, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


BibTeX

@Article{muller_et_al:DagRep.12.2.103,
  author =	{M\"{u}ller, Meinard and Bittner, Rachel and Nam, Juhan and Krause, Michael and \"{O}zer, Yigitcan},
  title =	{{Deep Learning and Knowledge Integration for Music Audio Analysis (Dagstuhl Seminar 22082)}},
  pages =	{103--133},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2022},
  volume =	{12},
  number =	{2},
  editor =	{M\"{u}ller, Meinard and Bittner, Rachel and Nam, Juhan and Krause, Michael and \"{O}zer, Yigitcan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagRep.12.2.103},
  URN =		{urn:nbn:de:0030-drops-169333},
  doi =		{10.4230/DagRep.12.2.103},
  annote =	{Keywords: Audio signal processing, deep learning, knowledge representation, music information retrieval, user interaction and interfaces}
}
Document
Computational Methods for Melody and Voice Processing in Music Recordings (Dagstuhl Seminar 19052)

Authors: Meinard Müller, Emilia Gómez, and Yi-Hsuan Yang

Published in: Dagstuhl Reports, Volume 9, Issue 1 (2019)


Abstract
In our daily lives, we are constantly surrounded by music and deeply influenced by it. Making music together can create strong ties between people while fostering communication and creativity. This is demonstrated, for example, by the large community of singers active in choirs or by the fact that music constitutes an important part of our cultural heritage. The availability of music in digital formats and its distribution over the World Wide Web have changed the way we consume, create, enjoy, explore, and interact with music. To cope with the increasing amount of digital music, one requires computational methods and tools that allow users to find, organize, analyze, and interact with music, topics that are central to the research field known as Music Information Retrieval (MIR). The Dagstuhl Seminar 19052 was devoted to a branch of MIR that is of particular importance: processing melodic voices (with a focus on singing voices) using computational methods. It is often the melody, a specific succession of musical tones, that constitutes the leading element in a piece of music. In the seminar, we discussed how to detect, extract, and analyze melodic voices as they occur in recorded performances of a piece of music. Gathering researchers from different fields, we critically reviewed the state of the art of computational approaches to various MIR tasks related to melody processing, including pitch estimation, source separation, singing voice analysis and synthesis, and performance analysis (timbre, intonation, expression). This triggered interdisciplinary discussions that leveraged insights from fields as disparate as audio processing, machine learning, music perception, music theory, and information retrieval. In particular, we discussed current challenges in academic and industrial research in view of the recent advances in deep learning and data-driven models. Furthermore, we explored novel applications of these technologies in music and multimedia retrieval, content creation, musicology, education, and human-computer interaction. In this report, we give an overview of the various contributions and results of the seminar. We start with an executive summary, which describes the main topics, goals, and group activities. Then, we present a more detailed overview of the participants' contributions (listed alphabetically by their last names) as well as of the ideas, results, and activities of the group meetings, the demo, and the music sessions.
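
To make one of the tasks mentioned above concrete, the following minimal Python sketch (not part of the seminar report) estimates a frame-wise fundamental frequency (F0) trajectory for a monophonic singing-voice recording with the pYIN method as implemented in the open-source librosa library; the file name and vocal range are placeholder assumptions.

import librosa

# "singer.wav" is a placeholder for a monophonic singing-voice recording.
y, sr = librosa.load("singer.wav", sr=22050, mono=True)

# pYIN yields a frame-wise F0 estimate plus a per-frame voicing decision.
f0, voiced_flag, voiced_prob = librosa.pyin(
    y,
    fmin=librosa.note_to_hz("C2"),  # assumed lower bound of the vocal range
    fmax=librosa.note_to_hz("C6"),  # assumed upper bound
    sr=sr,
)

# Print the melody trajectory, skipping unvoiced frames (F0 is NaN there).
times = librosa.times_like(f0, sr=sr)
for t, f, voiced in zip(times, f0, voiced_flag):
    if voiced:
        print(f"{t:6.2f} s  {f:7.1f} Hz")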

Cite as

Meinard Müller, Emilia Gómez, and Yi-Hsuan Yang. Computational Methods for Melody and Voice Processing in Music Recordings (Dagstuhl Seminar 19052). In Dagstuhl Reports, Volume 9, Issue 1, pp. 125-177, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


BibTeX

@Article{muller_et_al:DagRep.9.1.125,
  author =	{M\"{u}ller, Meinard and G\'{o}mez, Emilia and Yang, Yi-Hsuan},
  title =	{{Computational Methods for Melody and Voice Processing in Music Recordings (Dagstuhl Seminar 19052)}},
  pages =	{125--177},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2019},
  volume =	{9},
  number =	{1},
  editor =	{M\"{u}ller, Meinard and G\'{o}mez, Emilia and Yang, Yi-Hsuan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagRep.9.1.125},
  URN =		{urn:nbn:de:0030-drops-105732},
  doi =		{10.4230/DagRep.9.1.125},
  annote =	{Keywords: Acoustics of singing, audio signal processing, machine learning, music composition and performance, music information retrieval, music perception and cognition, music processing, singing voice processing, sound source separation, user interaction and interfaces}
}
Document
Computational Music Structure Analysis (Dagstuhl Seminar 16092)

Authors: Meinard Müller, Elaine Chew, and Juan Pablo Bello

Published in: Dagstuhl Reports, Volume 6, Issue 2 (2016)


Abstract
Music is a ubiquitous and vital part of the lives of billions of people worldwide. Musical creations and performances are among the most complex and intricate of our cultural artifacts, and the emotional power of music can touch us in surprising and profound ways. In view of the rapid and sustained growth of digital music sharing and distribution, the development of computational methods to help users find and organize music information has become an important field of research in both industry and academia. The Dagstuhl Seminar 16092 was devoted to a research area known as music structure analysis, where the general objective is to uncover patterns and relationships that govern the organization of notes, events, and sounds in music. Gathering researchers from different fields, we critically reviewed the state of the art for computational approaches to music structure analysis in order to identify the main limitations of existing methodologies. This triggered interdisciplinary discussions that leveraged insights from fields as disparate as psychology, music theory, composition, signal processing, machine learning, and information sciences to address the specific challenges of understanding structural information in music. Finally, we explored novel applications of these technologies in music and multimedia retrieval, content creation, musicology, education, and human-computer interaction. In this report, we give an overview of the various contributions and results of the seminar. We start with an executive summary, which describes the main topics, goals, and group activities. Then, we present a list of abstracts giving a more detailed overview of the participants' contributions as well as of the ideas and results discussed in the group meetings of our seminar.
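
As an illustration of the kind of analysis discussed in the seminar (not taken from the report itself), the following Python sketch computes a chroma-based self-similarity matrix, a standard starting point for computational music structure analysis, using the open-source librosa library; the file name is a placeholder.

import librosa

# "piece.wav" is a placeholder for a music recording to be analyzed.
y, sr = librosa.load("piece.wav")
chroma = librosa.feature.chroma_cqt(y=y, sr=sr, hop_length=2048)

# Normalize columns so that the inner product of two frames equals their
# cosine similarity.
chroma = librosa.util.normalize(chroma, norm=2, axis=0)

# ssm[i, j] is high where frames i and j are harmonically similar; repeated
# sections (e.g., a recurring chorus) appear as stripes parallel to the
# main diagonal.
ssm = chroma.T @ chroma
print(ssm.shape)  # (num_frames, num_frames)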

Cite as

Meinard Müller, Elaine Chew, and Juan Pablo Bello. Computational Music Structure Analysis (Dagstuhl Seminar 16092). In Dagstuhl Reports, Volume 6, Issue 2, pp. 147-190, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2016)


BibTeX

@Article{muller_et_al:DagRep.6.2.147,
  author =	{M\"{u}ller, Meinard and Chew, Elaine and Bello, Juan Pablo},
  title =	{{Computational Music Structure Analysis (Dagstuhl Seminar 16092)}},
  pages =	{147--190},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2016},
  volume =	{6},
  number =	{2},
  editor =	{M\"{u}ller, Meinard and Chew, Elaine and Bello, Juan Pablo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagRep.6.2.147},
  URN =		{urn:nbn:de:0030-drops-61415},
  doi =		{10.4230/DagRep.6.2.147},
  annote =	{Keywords: Music Information Retrieval, Music Processing, Music Perception and Cognition, Music Composition and Performance, Knowledge Representation, User Interaction and Interfaces, Audio Signal Processing, Machine Learning}
}
Document
Computational Audio Analysis (Dagstuhl Seminar 13451)

Authors: Meinard Müller, Shrikanth S. Narayanan, and Björn Schuller

Published in: Dagstuhl Reports, Volume 3, Issue 11 (2014)


Abstract
Compared to traditional speech, music, or sound processing, the computational analysis of general audio data has a relatively young research history. In particular, the extraction of affective information (i.e., information that does not deal with the 'immediate' nature of the content, such as the spoken words or note events) from audio signals has become an important research strand with a huge increase in interest in academia and industry. At an early stage of this novel research direction, many analysis techniques and representations were simply transferred from the speech domain to other audio domains. However, general audio signals (including their affective aspects) typically possess acoustic and structural characteristics that distinguish them from spoken language or isolated 'controlled' music or sound events. In the Dagstuhl Seminar 13451, titled "Computational Audio Analysis", we discussed the development of novel machine learning and signal processing techniques that are applicable to a wide range of audio signals and analysis tasks. In particular, we looked at a variety of sounds besides speech, such as music recordings, animal sounds, environmental sounds, and mixtures thereof. In this report, we give an overview of the various contributions and results of the seminar. We start with an executive summary, which describes the main topics, goals, and group activities. Then, one finds a list of abstracts giving a more detailed overview of the participants' contributions as well as of the ideas and results discussed in the group meetings of our seminar. To conclude, an attempt is made to define the field as reflected by the views of the participants.

Cite as

Meinard Müller, Shrikanth S. Narayanan, and Björn Schuller. Computational Audio Analysis (Dagstuhl Seminar 13451). In Dagstuhl Reports, Volume 3, Issue 11, pp. 1-28, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)


BibTeX

@Article{muller_et_al:DagRep.3.11.1,
  author =	{M\"{u}ller, Meinard and Narayanan, Shrikanth S. and Schuller, Bj\"{o}rn},
  title =	{{Computational Audio Analysis (Dagstuhl Seminar 13451)}},
  pages =	{1--28},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2014},
  volume =	{3},
  number =	{11},
  editor =	{M\"{u}ller, Meinard and Narayanan, Shrikanth S. and Schuller, Bj\"{o}rn},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagRep.3.11.1},
  URN =		{urn:nbn:de:0030-drops-44346},
  doi =		{10.4230/DagRep.3.11.1},
  annote =	{Keywords: Audio Analysis, Signal Processing, Machine Learning, Sound, Speech, Music, Affective Computing}
}
Document
Complete Volume
DFU, Volume 3, Multimodal Music Processing, Complete Volume

Authors: Meinard Müller, Masataka Goto, and Markus Schedl

Published in: Dagstuhl Follow-Ups, Volume 3, Multimodal Music Processing (2012)


Abstract
DFU, Volume 3, Multimodal Music Processing, Complete Volume

Cite as

Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@Collection{DFU.Vol3.11041,
  title =	{{DFU, Volume 3, Multimodal Music Processing, Complete Volume}},
  booktitle =	{Multimodal Music Processing},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041},
  URN =		{urn:nbn:de:0030-drops-36023},
  doi =		{10.4230/DFU.Vol3.11041},
  annote =	{Keywords: Sound and Music Computing, Arts and Humanities–Music, Multimedia Information Systems}
}
Document
Frontmatter, Table of Contents, Preface, List of Authors

Authors: Meinard Müller, Masataka Goto, and Markus Schedl

Published in: Dagstuhl Follow-Ups, Volume 3, Multimodal Music Processing (2012)


Abstract
Frontmatter, Table of Contents, Preface, List of Authors

Cite as

Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 0:i-0:xii, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{muller_et_al:DFU.Vol3.11041.i,
  author =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  title =	{{Frontmatter, Table of Contents, Preface, List of Authors}},
  booktitle =	{Multimodal Music Processing},
  pages =	{0:i--0:xii},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.i},
  URN =		{urn:nbn:de:0030-drops-34621},
  doi =		{10.4230/DFU.Vol3.11041.i},
  annote =	{Keywords: Frontmatter, Table of Contents, Preface, List of Authors}
}
Document
Linking Sheet Music and Audio - Challenges and New Approaches

Authors: Verena Thomas, Christian Fremerey, Meinard Müller, and Michael Clausen

Published in: Dagstuhl Follow-Ups, Volume 3, Multimodal Music Processing (2012)


Abstract
Scores and audio files are the two most important ways to represent, convey, record, store, and experience music. While a score describes a piece of music on an abstract level using symbols such as notes, keys, and measures, audio files allow for reproducing a specific acoustic realization of the piece. Each of these representations reflects different facets of music, yielding insights into aspects ranging from structural elements (e.g., motives, themes, musical form) to specific performance aspects (e.g., artistic shaping, sound). Therefore, simultaneous access to score and audio representations is of great importance. In this paper, we address the problem of automatically generating musically relevant linking structures between the various data sources that are available for a given piece of music. In particular, we discuss the task of sheet music-audio synchronization with the aim of linking regions in images of scanned scores to musically corresponding sections in an audio recording of the same piece. Such linking structures form the basis for novel interfaces that allow users to access and explore multimodal sources of music within a single framework. As our main contributions, we give an overview of the state of the art for this kind of synchronization task, present some novel approaches, and indicate future research directions. In particular, we address problems that arise in the presence of structural differences and discuss challenges in applying optical music recognition to complex orchestral scores. Finally, potential applications of the synchronization results are presented.
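
To illustrate the core alignment step behind such synchronization (a simplified sketch, not the chapter's full pipeline), the following Python code aligns a score-derived chroma sequence to an audio recording via dynamic time warping (DTW) with librosa. The file names are placeholders, and we assume the scanned score has already been converted (e.g., by optical music recognition) to a chroma matrix.

import numpy as np
import librosa

# "recording.wav" is a placeholder for an audio recording of the piece.
y, sr = librosa.load("recording.wav")
hop = 2048
chroma_audio = librosa.feature.chroma_cqt(y=y, sr=sr, hop_length=hop)

# Hypothetical input: a 12 x M chroma matrix rendered from the OMR output.
chroma_score = np.load("score_chroma.npy")

# D is the accumulated cost matrix; wp is the optimal warping path, a list
# of index pairs (score frame, audio frame) in reverse order.
D, wp = librosa.sequence.dtw(X=chroma_score, Y=chroma_audio, metric="cosine")

# Each pair links a score frame to an audio time; mapping score frames back
# to measure positions then yields the desired region-level links.
for i, j in wp[::-1][:5]:
    print(f"score frame {i:4d} <-> audio time {j * hop / sr:7.2f} s")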

Cite as

Verena Thomas, Christian Fremerey, Meinard Müller, and Michael Clausen. Linking Sheet Music and Audio - Challenges and New Approaches. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 1-22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{thomas_et_al:DFU.Vol3.11041.1,
  author =	{Thomas, Verena and Fremerey, Christian and M\"{u}ller, Meinard and Clausen, Michael},
  title =	{{Linking Sheet Music and Audio - Challenges and New Approaches}},
  booktitle =	{Multimodal Music Processing},
  pages =	{1--22},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.1},
  URN =		{urn:nbn:de:0030-drops-34637},
  doi =		{10.4230/DFU.Vol3.11041.1},
  annote =	{Keywords: Music signals, audio, sheet music, music synchronization, alignment, optical music recognition, user interfaces, multimodality}
}
Document
A Cross-Version Approach for Harmonic Analysis of Music Recordings

Authors: Verena Konz and Meinard Müller

Published in: Dagstuhl Follow-Ups, Volume 3, Multimodal Music Processing (2012)


Abstract
The automated extraction of chord labels from audio recordings is a central task in music information retrieval. Here, the chord labeling is typically performed on a specific audio version of a piece of music, produced under certain recording conditions, played on specific instruments and characterized by individual styles of the musicians. As a consequence, the obtained chord labeling results are strongly influenced by version-dependent characteristics. In this chapter, we show that analyzing the harmonic properties of several audio versions synchronously stabilizes the chord labeling result in the sense that inconsistencies indicate version-dependent characteristics, whereas consistencies across several versions indicate harmonically stable passages in the piece of music. In particular, we show that consistently labeled passages often correspond to correctly labeled passages. Our experiments show that the cross-version labeling procedure significantly increases the precision of the result while keeping the recall at a relatively high level. Furthermore, we introduce a powerful visualization which reveals the harmonically stable passages on a musical time axis specified in bars. Finally, we demonstrate how this visualization facilitates a better understanding of classification errors and may be used by music experts as a helpful tool for exploring harmonic structures.
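
A minimal sketch of the consistency idea (illustrative only, not the chapter's exact procedure): once several versions are synchronized to a common musical time axis, a chord label is kept for a bar only if enough versions agree on it.

from collections import Counter

def consistent_labels(labelings, min_agree):
    """labelings: one chord-label list per version, all aligned to the same
    bar grid and of equal length. Returns one label per bar, or None where
    the versions are inconsistent."""
    result = []
    for bar_labels in zip(*labelings):
        label, count = Counter(bar_labels).most_common(1)[0]
        result.append(label if count >= min_agree else None)
    return result

# Three hypothetical synchronized versions of a four-bar passage:
versions = [
    ["C", "G", "Am", "F"],
    ["C", "G", "Am", "Dm"],
    ["C", "Em", "Am", "F"],
]
print(consistent_labels(versions, min_agree=3))
# ['C', None, 'Am', None]: only fully consistent bars survive, which trades
# recall for the precision gain described in the abstract.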

Cite as

Verena Konz and Meinard Müller. A Cross-Version Approach for Harmonic Analysis of Music Recordings. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 53-72, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{konz_et_al:DFU.Vol3.11041.53,
  author =	{Konz, Verena and M\"{u}ller, Meinard},
  title =	{{A Cross-Version Approach for Harmonic Analysis of Music Recordings}},
  booktitle =	{Multimodal Music Processing},
  pages =	{53--72},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.53},
  URN =		{urn:nbn:de:0030-drops-34665},
  doi =		{10.4230/DFU.Vol3.11041.53},
  annote =	{Keywords: Harmonic analysis, chord labeling, audio, music, music synchronization, audio alignment}
}
Document
Score-Informed Source Separation for Music Signals

Authors: Sebastian Ewert and Meinard Müller

Published in: Dagstuhl Follow-Ups, Volume 3, Multimodal Music Processing (2012)


Abstract
In recent years, the processing of audio recordings by exploiting additional musical knowledge has turned out to be a promising research direction. In particular, additional note information as specified by a musical score or a MIDI file has been employed to support various audio processing tasks such as source separation, audio parameterization, performance analysis, or instrument equalization. In this contribution, we provide an overview of approaches for score-informed source separation and illustrate their potential by discussing innovative applications and interfaces. Additionally, to illustrate some basic principles behind these approaches, we demonstrate how score information can be integrated into the well-known non-negative matrix factorization (NMF) framework. Finally, we compare this approach to advanced methods based on parametric models.
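
The following Python sketch illustrates, under simplifying assumptions, how score information can be injected into the NMF framework: activations are zero-initialized outside the time-pitch regions permitted by the score, and the multiplicative updates preserve these zeros. All data in the example are synthetic.

import numpy as np

def score_informed_nmf(V, W_init, H_mask, n_iter=200, eps=1e-10):
    """V: magnitude spectrogram (freq x time); W_init: one spectral template
    per pitch (freq x pitches); H_mask: 1 where the score allows a pitch to
    be active (pitches x time), 0 elsewhere."""
    W = W_init.copy()
    H = np.random.rand(*H_mask.shape) * H_mask  # zeros where score forbids
    for _ in range(n_iter):
        # Standard multiplicative updates for the Euclidean NMF objective;
        # entries of H that start at zero remain zero throughout.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic example: two pitches; the "score" says pitch 0 may sound only
# in the first half of the recording, pitch 1 only in the second half.
rng = np.random.default_rng(0)
V = rng.random((64, 40))
W0 = rng.random((64, 2))
mask = np.zeros((2, 40))
mask[0, :20] = 1
mask[1, 20:] = 1
W, H = score_informed_nmf(V, W0, mask)
print(H[0, 20:].max())  # 0.0: the score constraint is preserved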

Cite as

Sebastian Ewert and Meinard Müller. Score-Informed Source Separation for Music Signals. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 73-94, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{ewert_et_al:DFU.Vol3.11041.73,
  author =	{Ewert, Sebastian and M\"{u}ller, Meinard},
  title =	{{Score-Informed Source Separation for Music Signals}},
  booktitle =	{Multimodal Music Processing},
  pages =	{73--94},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.73},
  URN =		{urn:nbn:de:0030-drops-34670},
  doi =		{10.4230/DFU.Vol3.11041.73},
  annote =	{Keywords: Audio processing, music signals, source separation, musical score, alignment, music synchronization, non-negative matrix factorization, parametric models}
}
Document
Audio Content-Based Music Retrieval

Authors: Peter Grosche, Meinard Müller, and Joan Serrà

Published in: Dagstuhl Follow-Ups, Volume 3, Multimodal Music Processing (2012)


Abstract
The rapidly growing corpus of digital audio material requires novel retrieval strategies for exploring large music collections. Traditional retrieval strategies rely on metadata that describe the actual audio content in words. When such textual descriptions are not available, one requires content-based retrieval strategies that utilize only the raw audio material. In this contribution, we discuss content-based retrieval strategies that follow the query-by-example paradigm: given an audio query, the task is to retrieve all documents from a music collection that are somehow similar or related to the query. Such strategies can be loosely classified according to their "specificity", which refers to the degree of similarity between the query and the database documents. Here, high specificity refers to a strict notion of similarity, whereas low specificity refers to a rather vague one. Furthermore, we introduce a second classification principle based on "granularity", where one distinguishes between fragment-level and document-level retrieval. Using a classification scheme based on specificity and granularity, we identify various classes of retrieval scenarios, which comprise "audio identification", "audio matching", and "version identification". For these three important classes, we give an overview of representative state-of-the-art approaches, which also illustrate the sometimes subtle but crucial differences between the retrieval scenarios. Finally, we give an outlook on a user-oriented retrieval system, which combines the various retrieval strategies in a unified framework.
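
As a hedged illustration of fragment-level audio matching (one of the retrieval classes above, not the chapter's specific algorithm), the following Python sketch locates a short query clip inside a longer database recording using chroma features and subsequence DTW from librosa; the file names are placeholders.

import librosa

# Placeholder files: a short query clip and one database recording.
y_query, sr = librosa.load("query_clip.wav")
y_db, _ = librosa.load("database_recording.wav", sr=sr)

hop = 2048
C_query = librosa.feature.chroma_cqt(y=y_query, sr=sr, hop_length=hop)
C_db = librosa.feature.chroma_cqt(y=y_db, sr=sr, hop_length=hop)

# subseq=True lets the query align to any contiguous stretch of the
# database sequence (fragment-level rather than document-level retrieval).
D, wp = librosa.sequence.dtw(X=C_query, Y=C_db, metric="cosine", subseq=True)

# wp is in reverse order: its last row marks the start of the best match,
# its first row the end. D[-1].min() is the matching cost.
start = wp[-1, 1] * hop / sr
end = wp[0, 1] * hop / sr
print(f"best match: {start:.1f} s - {end:.1f} s (cost {D[-1].min():.2f})")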

Cite as

Peter Grosche, Meinard Müller, and Joan Serrà. Audio Content-Based Music Retrieval. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 157-174, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{grosche_et_al:DFU.Vol3.11041.157,
  author =	{Grosche, Peter and M\"{u}ller, Meinard and Serr\`{a}, Joan},
  title =	{{Audio Content-Based Music Retrieval}},
  booktitle =	{Multimodal Music Processing},
  pages =	{157--174},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.157},
  URN =		{urn:nbn:de:0030-drops-34711},
  doi =		{10.4230/DFU.Vol3.11041.157},
  annote =	{Keywords: music retrieval, content-based, query-by-example, audio identification, audio matching, cover song identification}
}
Document
Data-Driven Sound Track Generation

Authors: Meinard Müller and Jonathan Driedger

Published in: Dagstuhl Follow-Ups, Volume 3, Multimodal Music Processing (2012)


Abstract
Background music is often used to generate a specific atmosphere or to draw our attention to specific events. For example, in movies or computer games, it is often the accompanying music that conveys the emotional state of a scene and plays an important role in immersing the viewer or player in the virtual environment. For home-made videos, slide shows, and other consumer-generated visual media streams, there is a need for computer-assisted tools that allow users to generate aesthetically appealing music tracks in an easy and intuitive way. In this contribution, we consider a data-driven scenario where the musical raw material is given in the form of a database containing a variety of audio recordings. Then, for a given visual media stream, the task consists in identifying, manipulating, overlaying, concatenating, and blending suitable music clips to generate a music stream that satisfies certain constraints imposed by the visual data stream and by user specifications. Our main goal is to give an overview of various content-based music processing and retrieval techniques that become important in data-driven sound track generation. In particular, we sketch a general pipeline that highlights how the various techniques act together and come into play when generating musically plausible transitions between subsequent music clips.
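
The sketch below illustrates one small building block of such a pipeline (an assumption-laden simplification, not the authors' system): concatenating two music clips with a crossfade placed at detected beat positions, so that the transition lands on the beat. File names are placeholders.

import numpy as np
import librosa

# Placeholder files: the two clips to be joined.
y1, sr = librosa.load("clip_a.wav")
y2, _ = librosa.load("clip_b.wav", sr=sr)

# Use the last detected beat of clip A and the first beat of clip B as
# splice points, so the blended region spans a musically meaningful spot.
_, beats_a = librosa.beat.beat_track(y=y1, sr=sr, units="samples")
_, beats_b = librosa.beat.beat_track(y=y2, sr=sr, units="samples")

head = y1[: beats_a[-1]]
tail = y2[beats_b[0] :]

# Linear crossfade over (at most) half a second.
n = min(sr // 2, len(head), len(tail))
ramp = np.linspace(0.0, 1.0, n)
out = np.concatenate([
    head[:-n],
    head[-n:] * (1.0 - ramp) + tail[:n] * ramp,  # blended transition
    tail[n:],
])
print(f"combined track: {len(out) / sr:.1f} s")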

Cite as

Meinard Müller and Jonathan Driedger. Data-Driven Sound Track Generation. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 175-194, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{muller_et_al:DFU.Vol3.11041.175,
  author =	{M\"{u}ller, Meinard and Driedger, Jonathan},
  title =	{{Data-Driven Sound Track Generation}},
  booktitle =	{Multimodal Music Processing},
  pages =	{175--194},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.175},
  URN =		{urn:nbn:de:0030-drops-34725},
  doi =		{10.4230/DFU.Vol3.11041.175},
  annote =	{Keywords: Sound track, content-based retrieval, audio matching, time-scale modification, warping, tempo, beat tracking, harmony}
}
Document
Multimodal Music Processing (Dagstuhl Seminar 11041)

Authors: Meinard Müller, Masataka Goto, and Simon Dixon

Published in: Dagstuhl Reports, Volume 1, Issue 1 (2011)


Abstract
From January 23 to January 28, 2011, the Dagstuhl Seminar 11041 "Multimodal Music Processing" was held at Schloss Dagstuhl – Leibniz Center for Informatics. During the seminar, we discussed various aspects of the automated processing of music-related documents. These documents may describe a musical work in different ways, comprising visual representations (e.g., sheet music), symbolic representations (e.g., MIDI, tablatures, chords), acoustic representations (CD recordings), audio-visual representations (videos), or text-based metadata. In this report, we give an overview of the main contributions and results of the seminar. We start with an executive summary, which describes the main topics, goals, and group activities. Then one finds a list of abstracts giving a more detailed overview of the participants' contributions as well as of the ideas and results discussed in the group meetings and panels of our seminar.

Cite as

Meinard Müller, Masataka Goto, and Simon Dixon. Multimodal Music Processing (Dagstuhl Seminar 11041). In Dagstuhl Reports, Volume 1, Issue 1, pp. 68-101, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2011)


BibTeX

@Article{muller_et_al:DagRep.1.1.68,
  author =	{M\"{u}ller, Meinard and Goto, Masataka and Dixon, Simon},
  title =	{{Multimodal Music Processing (Dagstuhl Seminar 11041)}},
  pages =	{68--101},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2011},
  volume =	{1},
  number =	{1},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Dixon, Simon},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagRep.1.1.68},
  URN =		{urn:nbn:de:0030-drops-31457},
  doi =		{10.4230/DagRep.1.1.68},
  annote =	{Keywords: Music information retrieval, music processing, multimodality, audio, sheet music, content-based analysis, signal processing, user interaction}
}
Document
Case Study "Beatles Songs" – What can be Learned from Unreliable Music Alignments?

Authors: Sebastian Ewert, Meinard Müller, Daniel Müllensiefen, Michael Clausen, and Geraint A. Wiggins

Published in: Dagstuhl Seminar Proceedings, Volume 9051, Knowledge representation for intelligent music processing (2009)


Abstract
As a result of massive digitization efforts and the World Wide Web, there is an exploding amount of available digital data describing and representing music at various semantic levels and in diverse formats. For example, in the case of the Beatles songs, there are numerous recordings, including an increasing number of cover songs and arrangements, as well as MIDI data and other symbolic music representations. The general goal of music synchronization is to align the multiple information sources related to a given piece of music. This becomes a difficult problem when the various representations reveal significant differences in structure and polyphony, while exhibiting various types of artifacts. In this paper, we address the issue of how music synchronization techniques can be used to automatically reveal critical passages with significant differences between the two versions to be aligned. Using the corpus of the Beatles songs as a test bed, we analyze the kinds of differences occurring in the audio and MIDI versions available for the songs.
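
A minimal sketch of the underlying idea (illustrative, not the paper's exact method): after aligning two versions, passages where the local cost along the warping path stays high for a sustained stretch are flagged as critical.

def critical_passages(path_costs, times, threshold, min_dur=2.0):
    """path_costs: local alignment cost per warping-path step; times: audio
    time (seconds) per step. Returns (start, end) intervals during which
    the cost stays above the threshold for at least min_dur seconds."""
    flagged, start = [], None
    for cost, t in zip(path_costs, times):
        if cost > threshold and start is None:
            start = t
        elif cost <= threshold and start is not None:
            if t - start >= min_dur:
                flagged.append((start, t))
            start = None
    if start is not None and times[-1] - start >= min_dur:
        flagged.append((start, times[-1]))
    return flagged

# Hypothetical costs for a ten-step path sampled once per second:
costs = [0.1, 0.1, 0.9, 0.8, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1]
print(critical_passages(costs, list(range(10)), threshold=0.5))
# [(2, 6)]: a sustained mismatch, e.g., a section missing in the MIDI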

Cite as

Sebastian Ewert, Meinard Müller, Daniel Müllensiefen, Michael Clausen, and Geraint A. Wiggins. Case Study "Beatles Songs" – What can be Learned from Unreliable Music Alignments? In Knowledge representation for intelligent music processing. Dagstuhl Seminar Proceedings, Volume 9051, pp. 1-16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2009)


BibTeX

@InProceedings{ewert_et_al:DagSemProc.09051.3,
  author =	{Ewert, Sebastian and M\"{u}ller, Meinard and M\"{u}llensiefen, Daniel and Clausen, Michael and Wiggins, Geraint A.},
  title =	{{Case Study ``Beatles Songs'' – What can be Learned from Unreliable Music Alignments?}},
  booktitle =	{Knowledge representation for intelligent music processing},
  pages =	{1--16},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2009},
  volume =	{9051},
  editor =	{Selfridge-Field, Eleanor and Wiering, Frans and Wiggins, Geraint A.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagSemProc.09051.3},
  URN =		{urn:nbn:de:0030-drops-19640},
  doi =		{10.4230/DagSemProc.09051.3},
  annote =	{Keywords: MIDI, audio, music synchronization, multimodal, music collections, Beatles songs}
}
Document
Towards Automated Processing of Folk Song Recordings

Authors: Meinard Müller, Peter Grosche, and Frans Wiering

Published in: Dagstuhl Seminar Proceedings, Volume 9051, Knowledge representation for intelligent music processing (2009)


Abstract
Folk music is closely related to the musical culture of a specific nation or region. Even though folk songs have been passed down mainly by oral tradition, most musicologists study the relations between folk songs on the basis of symbolic music descriptions, which are obtained by transcribing recorded tunes into a score-like representation. Due to the complexity of audio recordings, once the transcriptions are available, the original recorded tunes are often no longer used in actual folk song research, even though they may still contain valuable information. In this paper, we present various techniques for making audio recordings more easily accessible to music researchers. In particular, we show how one can use synchronization techniques to automatically segment and annotate the recorded songs. The processed audio recordings can then be made accessible, along with a symbolic transcript, by means of suitable visualization, searching, and navigation interfaces to assist folk song researchers in conducting large-scale investigations comprising the audio material.
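
As a rough illustration of segmentation by synchronization (a sketch under strong simplifying assumptions, not the paper's method), the following Python code repeatedly matches a reference-tune chroma template against a folk song recording with subsequence DTW, masking each matched region so that successive stanzas are found. File names and the stanza count are placeholders.

import numpy as np
import librosa

# Placeholder file: a field recording containing several stanzas.
y, sr = librosa.load("folk_recording.wav")
hop = 2048
C = librosa.feature.chroma_cqt(y=y, sr=sr, hop_length=hop)

# Hypothetical input: a 12 x M chroma template of the reference tune.
ref = np.load("reference_tune_chroma.npy")

segments = []
C_work = C.copy()
for _ in range(4):  # assume four stanzas, for illustration
    D, wp = librosa.sequence.dtw(X=ref, Y=C_work, metric="euclidean", subseq=True)
    s, e = wp[-1, 1], wp[0, 1]
    segments.append((s * hop / sr, e * hop / sr))
    # Crude masking: overwrite the matched columns with large values so the
    # next iteration finds a different stanza.
    C_work[:, s : e + 1] = 1e3

for k, (s, e) in enumerate(sorted(segments), start=1):
    print(f"stanza {k}: {s:6.1f} s - {e:6.1f} s")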

Cite as

Meinard Müller, Peter Grosche, and Frans Wiering. Towards Automated Processing of Folk Song Recordings. In Knowledge representation for intelligent music processing. Dagstuhl Seminar Proceedings, Volume 9051, pp. 1-15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2009)


BibTeX

@InProceedings{muller_et_al:DagSemProc.09051.7,
  author =	{M\"{u}ller, Meinard and Grosche, Peter and Wiering, Frans},
  title =	{{Towards Automated Processing of Folk Song Recordings}},
  booktitle =	{Knowledge representation for intelligent music processing},
  pages =	{1--15},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2009},
  volume =	{9051},
  editor =	{Selfridge-Field, Eleanor and Wiering, Frans and Wiggins, Geraint A.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DagSemProc.09051.7},
  URN =		{urn:nbn:de:0030-drops-19666},
  doi =		{10.4230/DagSemProc.09051.7},
  annote =	{Keywords: Folk songs, audio, segmentation, music synchronization, annotation, performance analysis}
}
