Dagstuhl Follow-Ups, Volume 3

Multimodal Music Processing




Publication Details

  • Published: 2012-04-01
  • Publisher: Schloss Dagstuhl – Leibniz-Zentrum für Informatik
  • ISBN: 978-3-939897-37-8
  • DBLP: db/conf/dagstuhl/dfu3

Documents
Document
Complete Volume
DFU, Volume 3, Multimodal Music Processing, Complete Volume

Authors: Meinard Müller, Masataka Goto, and Markus Schedl


Abstract
DFU, Volume 3, Multimodal Music Processing, Complete Volume

Cite as

Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@Collection{DFU.Vol3.11041,
  title =	{{DFU, Volume 3, Multimodal Music Processing, Complete Volume}},
  booktitle =	{Multimodal Music Processing},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041},
  URN =		{urn:nbn:de:0030-drops-36023},
  doi =		{10.4230/DFU.Vol3.11041},
  annote =	{Keywords: Sound and Music Computing, Arts and Humanities–Music, Multimedia Information Systems}
}
Document
Frontmatter, Table of Contents, Preface, List of Authors

Authors: Meinard Müller, Masataka Goto, and Markus Schedl


Abstract
Frontmatter, Table of Contents, Preface, List of Authors

Cite as

Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 0:i-0:xii, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{muller_et_al:DFU.Vol3.11041.i,
  author =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  title =	{{Frontmatter, Table of Contents, Preface, List of Authors}},
  booktitle =	{Multimodal Music Processing},
  pages =	{0:i--0:xii},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.i},
  URN =		{urn:nbn:de:0030-drops-34621},
  doi =		{10.4230/DFU.Vol3.11041.i},
  annote =	{Keywords: Frontmatter, Table of Contents, Preface, List of Authors}
}
Document
Linking Sheet Music and Audio - Challenges and New Approaches

Authors: Verena Thomas, Christian Fremerey, Meinard Müller, and Michael Clausen


Abstract
Score and audio files are the two most important ways to represent, convey, record, store, and experience music. While a score describes a piece of music on an abstract level using symbols such as notes, keys, and measures, audio files allow for reproducing a specific acoustic realization of the piece. Each of these representations reflects different facets of music, yielding insights into aspects ranging from structural elements (e.g., motives, themes, musical form) to specific performance aspects (e.g., artistic shaping, sound). Therefore, simultaneous access to score and audio representations is of great importance. In this paper, we address the problem of automatically generating musically relevant linking structures between the various data sources that are available for a given piece of music. In particular, we discuss the task of sheet music-audio synchronization, whose aim is to link regions in images of scanned scores to musically corresponding sections in an audio recording of the same piece. Such linking structures form the basis for novel interfaces that allow users to access and explore multimodal sources of music within a single framework. As our main contributions, we give an overview of the state of the art for this kind of synchronization task, present some novel approaches, and indicate future research directions. In particular, we address problems that arise in the presence of structural differences and discuss challenges in applying optical music recognition to complex orchestral scores. Finally, potential applications of the synchronization results are presented.
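As an illustrative aside, not taken from the chapter itself: the core alignment step behind such synchronization tasks can be sketched with classical dynamic time warping (DTW) over chroma-like features. In the minimal Python sketch below, the feature matrices are random stand-ins; a real system would derive one sequence from the OMR-processed score and the other from the audio recording.

import numpy as np

def dtw_align(X, Y):
    """Align two feature sequences (features x frames) with basic DTW.
    Returns the accumulated-cost matrix and the optimal warping path."""
    # pairwise cosine distances between all frame pairs
    Xn = X / (np.linalg.norm(X, axis=0, keepdims=True) + 1e-9)
    Yn = Y / (np.linalg.norm(Y, axis=0, keepdims=True) + 1e-9)
    C = 1.0 - Xn.T @ Yn                      # (N x M) local cost matrix
    N, M = C.shape
    D = np.full((N + 1, M + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, N + 1):                # accumulate costs
        for j in range(1, M + 1):
            D[i, j] = C[i-1, j-1] + min(D[i-1, j], D[i, j-1], D[i-1, j-1])
    # backtrack the cheapest path from (N, M) to (1, 1)
    path, i, j = [], N, M
    while i > 1 or j > 1:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i-1, j-1], D[i-1, j], D[i, j-1]])
        if step == 0: i, j = i - 1, j - 1
        elif step == 1: i -= 1
        else: j -= 1
    path.append((0, 0))
    return D[1:, 1:], path[::-1]

# hypothetical stand-ins for score-derived and audio-derived chroma features
score_chroma = np.abs(np.random.randn(12, 80))
audio_chroma = np.abs(np.random.randn(12, 100))
D, path = dtw_align(score_chroma, audio_chroma)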

Cite as

Verena Thomas, Christian Fremerey, Meinard Müller, and Michael Clausen. Linking Sheet Music and Audio - Challenges and New Approaches. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 1-22, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{thomas_et_al:DFU.Vol3.11041.1,
  author =	{Thomas, Verena and Fremerey, Christian and M\"{u}ller, Meinard and Clausen, Michael},
  title =	{{Linking Sheet Music and Audio - Challenges and New Approaches}},
  booktitle =	{Multimodal Music Processing},
  pages =	{1--22},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.1},
  URN =		{urn:nbn:de:0030-drops-34637},
  doi =		{10.4230/DFU.Vol3.11041.1},
  annote =	{Keywords: Music signals, audio, sheet music, music synchronization, alignment, optical music recognition, user interfaces, multimodality}
}
Document
Lyrics-to-Audio Alignment and its Application

Authors: Hiromasa Fujihara and Masataka Goto


Abstract
Automatic lyrics-to-audio alignment techniques have been drawing attention in recent years, and various studies have been conducted in this field. The objective of lyrics-to-audio alignment is to estimate the temporal relationship between lyrics and a musical audio signal; such techniques can be applied to various applications such as karaoke-style lyrics display. In this contribution, we provide an overview of recent developments in this research topic, with a particular focus on the categorization of the various methods and on their applications.
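As a toy illustration of the karaoke-style application mentioned above (not from the chapter itself): given word-level timestamps, the kind of output a lyrics-to-audio alignment system produces, a player only needs a lookup to decide which word to highlight at the current playback position. All timing data below is invented for the example.

from bisect import bisect_right

# hypothetical alignment output: (start_time_in_seconds, word)
aligned = [(0.0, "Twinkle"), (0.6, "twinkle"), (1.3, "little"), (2.0, "star")]
starts = [t for t, _ in aligned]

def current_word(playback_time):
    """Return the word to highlight at the given playback position."""
    idx = bisect_right(starts, playback_time) - 1
    return aligned[idx][1] if idx >= 0 else None

assert current_word(1.5) == "little"   # 1.3 <= 1.5 < 2.0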

Cite as

Hiromasa Fujihara and Masataka Goto. Lyrics-to-Audio Alignment and its Application. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 23-36, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{fujihara_et_al:DFU.Vol3.11041.23,
  author =	{Fujihara, Hiromasa and Goto, Masataka},
  title =	{{Lyrics-to-Audio Alignment and its Application}},
  booktitle =	{Multimodal Music Processing},
  pages =	{23--36},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.23},
  URN =		{urn:nbn:de:0030-drops-34644},
  doi =		{10.4230/DFU.Vol3.11041.23},
  annote =	{Keywords: Lyrics, Alignment, Karaoke, Multifunctional music player, Lyrics-based music retrieval}
}
Document
Fusion of Multimodal Information in Music Content Analysis

Authors: Slim Essid and Gaël Richard


Abstract
Music is often processed through its acoustic realization. This is restrictive in the sense that music is clearly a highly multimodal concept, where various types of heterogeneous information can be associated with a given piece of music (a musical score, musicians' gestures, lyrics, user-generated metadata, etc.). This has recently led researchers to apprehend music through its various facets, giving rise to "multimodal music analysis" studies. This article gives a synthetic overview of methods that have been successfully employed in multimodal signal analysis. In particular, their use in music content processing is discussed in more detail through five case studies that highlight different multimodal integration techniques. The case studies include an example of cross-modal correlation for music video analysis, an audiovisual drum transcription system, a description of the concept of informed source separation, a discussion of multimodal dance-scene analysis, and an example of user-interactive music analysis. In the light of these case studies, some perspectives on multimodality in music processing are finally suggested.
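Not from the article itself, but as one elementary instance of multimodal integration: "late fusion" combines per-modality classifier outputs into a joint decision. The Python sketch below performs a generic weighted-sum fusion of class posteriors; the modalities, scores, and weights are invented for illustration.

import numpy as np

def late_fusion(modality_scores, weights):
    """Weighted-sum (late) fusion of per-modality class posteriors."""
    fused = sum(w * s for w, s in zip(weights, modality_scores))
    return fused / sum(weights)

# hypothetical posteriors over 3 classes from an audio and a video classifier
audio_posteriors = np.array([0.7, 0.2, 0.1])
video_posteriors = np.array([0.4, 0.5, 0.1])
fused = late_fusion([audio_posteriors, video_posteriors], weights=[0.6, 0.4])
decision = int(np.argmax(fused))    # class index of the joint decision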

Cite as

Slim Essid and Gaël Richard. Fusion of Multimodal Information in Music Content Analysis. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 37-52, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{essid_et_al:DFU.Vol3.11041.37,
  author =	{Essid, Slim and Richard, Ga\"{e}l},
  title =	{{Fusion of Multimodal Information in Music Content Analysis}},
  booktitle =	{Multimodal Music Processing},
  pages =	{37--52},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.37},
  URN =		{urn:nbn:de:0030-drops-34652},
  doi =		{10.4230/DFU.Vol3.11041.37},
  annote =	{Keywords: Multimodal music processing, music signals indexing and transcription, information fusion, audio, video}
}
Document
A Cross-Version Approach for Harmonic Analysis of Music Recordings

Authors: Verena Konz and Meinard Müller


Abstract
The automated extraction of chord labels from audio recordings is a central task in music information retrieval. Here, the chord labeling is typically performed on a specific audio version of a piece of music, produced under certain recording conditions, played on specific instruments, and characterized by the individual styles of the musicians. As a consequence, the obtained chord labeling results are strongly influenced by version-dependent characteristics. In this chapter, we show that analyzing the harmonic properties of several audio versions synchronously stabilizes the chord labeling result in the sense that inconsistencies indicate version-dependent characteristics, whereas consistencies across several versions indicate harmonically stable passages in the piece of music. In particular, we show that consistently labeled passages often correspond to correctly labeled passages. Our experiments show that the cross-version labeling procedure significantly increases the precision of the result while keeping the recall at a relatively high level. Furthermore, we introduce a powerful visualization which reveals the harmonically stable passages on a musical time axis specified in bars. Finally, we demonstrate how this visualization facilitates a better understanding of classification errors and may be used by music experts as a helpful tool for exploring harmonic structures.
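To make the cross-version idea concrete, here is a minimal sketch (not the authors' exact procedure): it assumes chord labels from several synchronized versions, mapped onto a common bar grid, and keeps only those bars on which a qualified majority of versions agrees. The labels and the agreement threshold are invented for illustration.

from collections import Counter

def consistent_labels(version_labels, min_agreement=0.75):
    """Per bar, keep a chord label only if enough versions agree on it.
    version_labels: one label sequence per version, on a shared bar grid."""
    result = []
    for bar_labels in zip(*version_labels):
        label, count = Counter(bar_labels).most_common(1)[0]
        result.append(label if count / len(bar_labels) >= min_agreement else None)
    return result

# hypothetical bar-wise labels from four synchronized recordings
versions = [["C", "G", "Am", "F"],
            ["C", "G", "Am", "Dm"],
            ["C", "G", "Am", "F"],
            ["C", "Em", "Am", "F"]]
print(consistent_labels(versions))   # ['C', 'G', 'Am', 'F'] at threshold 0.75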

Cite as

Verena Konz and Meinard Müller. A Cross-Version Approach for Harmonic Analysis of Music Recordings. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 53-72, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{konz_et_al:DFU.Vol3.11041.53,
  author =	{Konz, Verena and M\"{u}ller, Meinard},
  title =	{{A Cross-Version Approach for Harmonic Analysis of Music Recordings}},
  booktitle =	{Multimodal Music Processing},
  pages =	{53--72},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.53},
  URN =		{urn:nbn:de:0030-drops-34665},
  doi =		{10.4230/DFU.Vol3.11041.53},
  annote =	{Keywords: Harmonic analysis, chord labeling, audio, music, music synchronization, audio alignment}
}
Document
Score-Informed Source Separation for Music Signals

Authors: Sebastian Ewert and Meinard Müller


Abstract
In recent years, the processing of audio recordings by exploiting additional musical knowledge has turned out to be a promising research direction. In particular, additional note information as specified by a musical score or a MIDI file has been employed to support various audio processing tasks such as source separation, audio parameterization, performance analysis, or instrument equalization. In this contribution, we provide an overview of approaches for score-informed source separation and illustrate their potential by discussing innovative applications and interfaces. Additionally, to illustrate some basic principles behind these approaches, we demonstrate how score information can be integrated into the well-known non-negative matrix factorization (NMF) framework. Finally, we compare this approach to advanced methods based on parametric models.
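To hint at the NMF integration the chapter describes, the following minimal sketch constrains standard multiplicative-update NMF with score information: the dictionary is initialized with (here, random stand-ins for) harmonic templates, and activations are zeroed outside score-given note regions; multiplicative updates preserve those zeros, so the score constraint holds throughout. This is a bare-bones illustration on invented data, not the authors' full system.

import numpy as np

def score_informed_nmf(V, W_init, H_mask, n_iter=100, eps=1e-9):
    """Euclidean NMF V ~ W @ H with score-derived initialization/constraints.
    H_mask is 1 where the score allows a note to be active, 0 elsewhere;
    zeros in H survive the multiplicative updates, enforcing the score."""
    W, H = W_init.copy(), np.random.rand(*H_mask.shape) * H_mask
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# toy setting: 2 pitches, 64 frequency bins, 50 frames
bins, pitches, frames = 64, 2, 50
W0 = np.random.rand(bins, pitches) + 0.1      # would be harmonic templates per pitch
mask = np.zeros((pitches, frames))
mask[0, :25] = 1.0                            # score says: pitch 0 in first half
mask[1, 25:] = 1.0                            # pitch 1 in second half
V = np.random.rand(bins, frames) + 0.1        # magnitude spectrogram stand-in
W, H = score_informed_nmf(V, W0, mask)
assert np.allclose(H[0, 25:], 0.0)            # forbidden regions stay silent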

Cite as

Sebastian Ewert and Meinard Müller. Score-Informed Source Separation for Music Signals. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 73-94, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{ewert_et_al:DFU.Vol3.11041.73,
  author =	{Ewert, Sebastian and M\"{u}ller, Meinard},
  title =	{{Score-Informed Source Separation for Music Signals}},
  booktitle =	{Multimodal Music Processing},
  pages =	{73--94},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.73},
  URN =		{urn:nbn:de:0030-drops-34670},
  doi =		{10.4230/DFU.Vol3.11041.73},
  annote =	{Keywords: Audio processing, music signals, source separation, musical score, alignment, music synchronization, non-negative matrix factorization, parametric models}
}
Document
Music Information Retrieval Meets Music Education

Authors: Christian Dittmar, Estefanía Cano, Jakob Abeßer, and Sascha Grollmisch


Abstract
This paper addresses the use of Music Information Retrieval (MIR) techniques in music education and their integration into learning software. A general overview of systems that are either commercially available or at the research stage is presented. Furthermore, three well-known MIR methods used in music learning systems are described along with their state of the art: music transcription, solo and accompaniment track creation, and generation of performance instructions. As a representative example of a music learning system developed within the MIR community, the Songs2See software is outlined. Finally, challenges and directions for future research are described.
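As a toy example of the performance-feedback component such learning systems include (not taken from the paper): detected note onsets from the student's playing can be matched against reference onsets from the score within a tolerance window to report a hit rate. All onset values and the tolerance below are invented.

def onset_feedback(reference, played, tolerance=0.08):
    """Fraction of reference onsets (seconds) matched by a played onset
    within the given tolerance; each played onset is used at most once."""
    unmatched = sorted(played)
    hits = 0
    for r in sorted(reference):
        match = next((p for p in unmatched if abs(p - r) <= tolerance), None)
        if match is not None:
            unmatched.remove(match)
            hits += 1
    return hits / len(reference)

reference_onsets = [0.00, 0.50, 1.00, 1.50]      # from the score
played_onsets = [0.02, 0.55, 1.21, 1.49]         # from transcribing the student
print(f"{onset_feedback(reference_onsets, played_onsets):.0%} of notes on time")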

Cite as

Christian Dittmar, Estefanía Cano, Jakob Abeßer, and Sascha Grollmisch. Music Information Retrieval Meets Music Education. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 95-120, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{dittmar_et_al:DFU.Vol3.11041.95,
  author =	{Dittmar, Christian and Cano, Estefan{\'\i}a and Abe{\ss}er, Jakob and Grollmisch, Sascha},
  title =	{{Music Information Retrieval Meets Music Education}},
  booktitle =	{Multimodal Music Processing},
  pages =	{95--120},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.95},
  URN =		{urn:nbn:de:0030-drops-34689},
  doi =		{10.4230/DFU.Vol3.11041.95},
  annote =	{Keywords: Music learning, music transcription, source separation, performance feedback}
}
Document
Human Computer Music Performance

Authors: Roger B. Dannenberg


Abstract
Human Computer Music Performance (HCMP) is the study of music performance by live human performers and real-time computer-based performers. One goal of HCMP is to create a highly autonomous artificial performer that can fill the role of a human, especially in a popular music setting. This will require advances in automated music listening and understanding, new representations for music, techniques for music synchronization, real-time human-computer communication, music generation, sound synthesis, and sound diffusion. Thus, HCMP is an ideal framework to motivate and integrate advanced music research. In addition, HCMP has the potential to benefit millions of practicing musicians, amateurs and professionals alike. The vision of HCMP, the problems that must be solved, and some recent progress are presented.

Cite as

Roger B. Dannenberg. Human Computer Music Performance. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 121-134, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{dannenberg:DFU.Vol3.11041.121,
  author =	{Dannenberg, Roger B.},
  title =	{{Human Computer Music Performance}},
  booktitle =	{Multimodal Music Processing},
  pages =	{121--134},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.121},
  URN =		{urn:nbn:de:0030-drops-34693},
  doi =		{10.4230/DFU.Vol3.11041.121},
  annote =	{Keywords: Interactive performance, music processing, music signals, music analysis, music synthesis, audio, score}
}
Document
User-Aware Music Retrieval

Authors: Markus Schedl, Sebastian Stober, Emilia Gómez, Nicola Orio, and Cynthia C.S. Liem


Abstract
Personalized and user-aware systems for retrieving multimedia items are becoming increasingly important as the amount of available multimedia data continues to grow rapidly. A personalized system is one that incorporates information about the user into its data processing (e.g., a particular user's taste for a movie genre). A context-aware system, in contrast, takes into account dynamic aspects of the user context when processing the data (e.g., the location and time where/when a user issues a query). Today's user-adaptive systems often incorporate both aspects. Focusing particularly on the music domain, this article gives an overview of different aspects we deem important for building personalized music retrieval systems. In this vein, we first give an overview of factors that influence the human perception of music. We then propose and discuss various requirements for a personalized, user-aware music retrieval system. Finally, the state of the art in building such systems is reviewed, taking into account, in particular, aspects of "similarity" and "serendipity".
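To make the personalization/context distinction tangible (an illustrative sketch, not from the article): candidate tracks can be re-ranked by combining a content-similarity score with a static user-profile affinity (personalization) and a dynamic context match (context-awareness). All track names, scores, and weights below are invented.

def user_aware_rank(candidates, profile_affinity, context_match, w=(0.5, 0.3, 0.2)):
    """Rank tracks by weighted content similarity, user profile, and context fit."""
    def score(track):
        sim, name = track
        return w[0] * sim + w[1] * profile_affinity.get(name, 0.0) \
                          + w[2] * context_match.get(name, 0.0)
    return sorted(candidates, key=score, reverse=True)

# hypothetical retrieval results: (content similarity to query, track name)
candidates = [(0.9, "track_a"), (0.8, "track_b"), (0.7, "track_c")]
profile = {"track_b": 0.9, "track_c": 0.8}      # long-term user taste
context = {"track_c": 1.0}                      # e.g., fits "jogging, evening"
print(user_aware_rank(candidates, profile, context))  # context lifts track_c to the top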

Cite as

Markus Schedl, Sebastian Stober, Emilia Gómez, Nicola Orio, and Cynthia C.S. Liem. User-Aware Music Retrieval. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 135-156, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{schedl_et_al:DFU.Vol3.11041.135,
  author =	{Schedl, Markus and Stober, Sebastian and G\'{o}mez, Emilia and Orio, Nicola and Liem, Cynthia C.S.},
  title =	{{User-Aware Music Retrieval}},
  booktitle =	{Multimodal Music Processing},
  pages =	{135--156},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.135},
  URN =		{urn:nbn:de:0030-drops-34709},
  doi =		{10.4230/DFU.Vol3.11041.135},
  annote =	{Keywords: user-aware music retrieval, personalization, recommendation, user context, adaptive systems, similarity measurement, serendipity}
}
Document
Audio Content-Based Music Retrieval

Authors: Peter Grosche, Meinard Müller, and Joan Serrà


Abstract
The rapidly growing corpus of digital audio material requires novel retrieval strategies for exploring large music collections. Traditional retrieval strategies rely on metadata that describe the actual audio content in words. When such textual descriptions are not available, one requires content-based retrieval strategies that utilize only the raw audio material. In this contribution, we discuss content-based retrieval strategies that follow the query-by-example paradigm: given an audio query, the task is to retrieve from a music collection all documents that are somehow similar or related to the query. Such strategies can be loosely classified according to their "specificity", which refers to the degree of similarity between the query and the database documents. Here, high specificity refers to a strict notion of similarity, whereas low specificity refers to a rather vague one. Furthermore, we introduce a second classification principle based on "granularity", where one distinguishes between fragment-level and document-level retrieval. Using a classification scheme based on specificity and granularity, we identify various classes of retrieval scenarios, which comprise "audio identification", "audio matching", and "version identification". For these three important classes, we give an overview of representative state-of-the-art approaches, which also illustrates the sometimes subtle but crucial differences between the retrieval scenarios. Finally, we give an outlook on a user-oriented retrieval system which combines the various retrieval strategies in a unified framework.
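As a stripped-down illustration of the fragment-level "audio matching" scenario (a sketch under invented data, not the chapter's method): slide a query feature sequence over a database sequence and compute a matching curve from average cosine distances; local minima indicate candidate matches. The features are random stand-ins for chroma-like descriptors.

import numpy as np

def matching_curve(query, database):
    """Distance of the query to every same-length database subsequence.
    Both inputs are (features x frames); lower values mean better matches."""
    Q = query / (np.linalg.norm(query, axis=0, keepdims=True) + 1e-9)
    D = database / (np.linalg.norm(database, axis=0, keepdims=True) + 1e-9)
    m = Q.shape[1]
    return np.array([1.0 - np.mean(np.sum(Q * D[:, s:s + m], axis=0))
                     for s in range(D.shape[1] - m + 1)])

db = np.abs(np.random.randn(12, 500))           # database feature sequence
query = db[:, 200:230].copy()                   # plant the query inside it
curve = matching_curve(query, db)
print(int(np.argmin(curve)))                    # 200: position of the best match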

Cite as

Peter Grosche, Meinard Müller, and Joan Serrà. Audio Content-Based Music Retrieval. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 157-174, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{grosche_et_al:DFU.Vol3.11041.157,
  author =	{Grosche, Peter and M\"{u}ller, Meinard and Serr\`{a}, Joan},
  title =	{{Audio Content-Based Music Retrieval}},
  booktitle =	{Multimodal Music Processing},
  pages =	{157--174},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.157},
  URN =		{urn:nbn:de:0030-drops-34711},
  doi =		{10.4230/DFU.Vol3.11041.157},
  annote =	{Keywords: music retrieval, content-based, query-by-example, audio identification, audio matching, cover song identification}
}
Document
Data-Driven Sound Track Generation

Authors: Meinard Müller and Jonathan Driedger


Abstract
Background music is often used to create a specific atmosphere or to draw our attention to specific events. In movies or computer games, for example, it is often the accompanying music that conveys the emotional state of a scene and plays an important role in immersing the viewer or player in the virtual environment. For home-made videos, slide shows, and other consumer-generated visual media streams, there is a need for computer-assisted tools that allow users to generate aesthetically appealing music tracks in an easy and intuitive way. In this contribution, we consider a data-driven scenario where the musical raw material is given in the form of a database containing a variety of audio recordings. Then, for a given visual media stream, the task consists of identifying, manipulating, overlaying, concatenating, and blending suitable music clips to generate a music stream that satisfies certain constraints imposed by the visual data stream and by user specifications. Our main goal is to give an overview of various content-based music processing and retrieval techniques that become important in data-driven sound track generation. In particular, we sketch a general pipeline that highlights how the various techniques act together and come into play when generating musically plausible transitions between subsequent music clips.
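One small building block of such a transition pipeline is blending two clips. The snippet below (an illustrative sketch, not taken from the chapter) performs an equal-power crossfade between the tail of one clip and the head of the next; the sample rate and durations are arbitrary example values.

import numpy as np

def crossfade(clip_a, clip_b, fade_samples):
    """Equal-power crossfade: fade clip_a out while fading clip_b in."""
    t = np.linspace(0.0, np.pi / 2, fade_samples)
    fade_out, fade_in = np.cos(t), np.sin(t)     # cos^2 + sin^2 = 1 keeps power flat
    overlap = clip_a[-fade_samples:] * fade_out + clip_b[:fade_samples] * fade_in
    return np.concatenate([clip_a[:-fade_samples], overlap, clip_b[fade_samples:]])

sr = 22050                                       # example sample rate
clip_a = np.sin(2 * np.pi * 440 * np.arange(2 * sr) / sr)    # 2 s of A4
clip_b = np.sin(2 * np.pi * 330 * np.arange(2 * sr) / sr)    # 2 s of E4
mix = crossfade(clip_a, clip_b, fade_samples=sr // 2)        # 0.5 s transition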

Cite as

Meinard Müller and Jonathan Driedger. Data-Driven Sound Track Generation. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 175-194, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{muller_et_al:DFU.Vol3.11041.175,
  author =	{M\"{u}ller, Meinard and Driedger, Jonathan},
  title =	{{Data-Driven Sound Track Generation}},
  booktitle =	{Multimodal Music Processing},
  pages =	{175--194},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.175},
  URN =		{urn:nbn:de:0030-drops-34725},
  doi =		{10.4230/DFU.Vol3.11041.175},
  annote =	{Keywords: Sound track, content-based retrieval, audio matching, time-scale modification, warping, tempo, beat tracking, harmony}
}
Document
Music Information Retrieval: An Inspirational Guide to Transfer from Related Disciplines

Authors: Felix Weninger, Björn Schuller, Cynthia C.S. Liem, Frank Kurth, and Alan Hanjalic


Abstract
The emerging field of Music Information Retrieval (MIR) has been influenced by neighboring domains in signal processing and machine learning, including automatic speech recognition, image processing, and text information retrieval. In this contribution, we start with concrete examples of methodology transfer between speech and music processing, oriented around the building blocks of pattern recognition: preprocessing, feature extraction, and classification/decoding. We then assume a higher-level viewpoint when describing sources of mutual inspiration derived from text and image information retrieval. We conclude that dealing with the peculiarities of music in MIR research has contributed to advancing the state of the art in other fields, and that many future challenges in MIR are strikingly similar to those that other research areas have been facing.
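The building blocks named above (preprocessing, feature extraction, classification) can be caricatured in a few lines; the sketch below, purely illustrative and not from the chapter, frames a signal, extracts log-spectral features, and classifies frames with a nearest-centroid rule, a pipeline shared by speech and music processing. All data is invented.

import numpy as np

def features(signal, frame_len=512, hop=256):
    """Preprocessing + feature extraction: frame, window, take log spectra."""
    n = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] for i in range(n)])
    frames *= np.hanning(frame_len)                       # window each frame
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))  # log magnitude spectra

def nearest_centroid(train_feats, train_labels, test_feats):
    """Classification: assign each frame to the closest class mean."""
    classes = sorted(set(train_labels))
    centroids = np.stack([train_feats[np.array(train_labels) == c].mean(axis=0)
                          for c in classes])
    d = np.linalg.norm(test_feats[:, None, :] - centroids[None], axis=2)
    return [classes[i] for i in np.argmin(d, axis=1)]

# invented "speech-like" (noise) vs "music-like" (tone) training material
noise, tone = np.random.randn(8192), np.sin(np.arange(8192) * 0.3)
X = np.vstack([features(noise), features(tone)])
y = ["speech"] * features(noise).shape[0] + ["music"] * features(tone).shape[0]
print(nearest_centroid(X, y, features(np.sin(np.arange(4096) * 0.3)))[:3])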

Cite as

Felix Weninger, Björn Schuller, Cynthia C.S. Liem, Frank Kurth, and Alan Hanjalic. Music Information Retrieval: An Inspirational Guide to Transfer from Related Disciplines. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 195-216, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{weninger_et_al:DFU.Vol3.11041.195,
  author =	{Weninger, Felix and Schuller, Bj\"{o}rn and Liem, Cynthia C.S. and Kurth, Frank and Hanjalic, Alan},
  title =	{{Music Information Retrieval: An Inspirational Guide to Transfer from Related Disciplines}},
  booktitle =	{Multimodal Music Processing},
  pages =	{195--216},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.195},
  URN =		{urn:nbn:de:0030-drops-34737},
  doi =		{10.4230/DFU.Vol3.11041.195},
  annote =	{Keywords: Feature extraction, machine learning, multimodal fusion, evaluation, human factors, cross-domain methodology transfer}
}
Document
Grand Challenges in Music Information Research

Authors: Masataka Goto


Abstract
This paper discusses some grand challenges through which music information research will impact our daily lives and our society in the future. Some fundamental questions are how to provide the best music for each person, how to predict music trends, how to enrich human-music relationships, how to evolve new music, and how to address environmental and energy issues by using music technologies. Our goal is to increase both the attractiveness and the social impact of music information research in the future through such discussions and developments.

Cite as

Masataka Goto. Grand Challenges in Music Information Research. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 217-226, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{goto:DFU.Vol3.11041.217,
  author =	{Goto, Masataka},
  title =	{{Grand Challenges in Music Information Research}},
  booktitle =	{Multimodal Music Processing},
  pages =	{217--226},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.217},
  URN =		{urn:nbn:de:0030-drops-34743},
  doi =		{10.4230/DFU.Vol3.11041.217},
  annote =	{Keywords: Music information research, grand challenges, music processing, music signals}
}
Document
Music Information Technology and Professional Stakeholder Audiences: Mind the Adoption Gap

Authors: Cynthia C.S. Liem, Andreas Rauber, Thomas Lidy, Richard Lewis, Christopher Raphael, Joshua D. Reiss, Tim Crawford, and Alan Hanjalic


Abstract
The academic discipline focusing on the processing and organization of digital music information, commonly known as Music Information Retrieval (MIR), has multidisciplinary roots and interests. Thus, MIR technologies have the potential to have impact across disciplinary boundaries and to enhance the handling of music information in many different user communities. In practice, however, many MIR research agenda items appear to have a hard time leaving the lab and being widely adopted by their intended audiences. On the one hand, this is because the MIR field is still relatively young, and its technologies therefore need to mature. On the other hand, there may be deeper, more fundamental challenges with regard to the user audience. In this contribution, we discuss MIR technology adoption issues experienced with professional music stakeholders in audio mixing, performance, musicology, and the sales industry. Many of these stakeholders have mindsets and priorities that differ considerably from those of most MIR academics, which influences their reception of new MIR technology. We describe the major observed differences and their backgrounds, and argue that taking these into account is essential for truly successful cross-disciplinary collaboration and technology adoption in MIR.

Cite as

Cynthia C.S. Liem, Andreas Rauber, Thomas Lidy, Richard Lewis, Christopher Raphael, Joshua D. Reiss, Tim Crawford, and Alan Hanjalic. Music Information Technology and Professional Stakeholder Audiences: Mind the Adoption Gap. In Multimodal Music Processing. Dagstuhl Follow-Ups, Volume 3, pp. 227-246, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)


BibTeX

@InCollection{liem_et_al:DFU.Vol3.11041.227,
  author =	{Liem, Cynthia C.S. and Rauber, Andreas and Lidy, Thomas and Lewis, Richard and Raphael, Christopher and Reiss, Joshua D. and Crawford, Tim and Hanjalic, Alan},
  title =	{{Music Information Technology and Professional Stakeholder Audiences: Mind the Adoption Gap}},
  booktitle =	{Multimodal Music Processing},
  pages =	{227--246},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{M\"{u}ller, Meinard and Goto, Masataka and Schedl, Markus},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/DFU.Vol3.11041.227},
  URN =		{urn:nbn:de:0030-drops-34759},
  doi =		{10.4230/DFU.Vol3.11041.227},
  annote =	{Keywords: music information retrieval, music computing, domain expertise, technology adoption, user needs, cross-disciplinary collaboration}
}
