diGLo

DIGLO (Information Retrieval of Musical Gesture), developed between 2008 and 2009 at LEEM (UNLP), was an early experimental system for Music Information Retrieval focused not on symbolic music data (scores, MIDI, video annotated with audio descriptors), but on embodied musical performance understood as a multimodal, temporal, and gestural phenomenon. At a time when MIR research was largely dominated by audio feature extraction, score alignment, and corpus-based similarity metrics, DIGLO proposed a framework in which musical meaning emerges from the correlation between movement annotated in video and sound captured by audio descriptors, presented on a 3D timeline with cross-parametric visualization.
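
The cross-parametric 3D timeline is easiest to picture in code. The sketch below is purely illustrative, not DIGLO's implementation: two synthetic parameter streams (hypothetical stand-ins for a pitch contour and a motion-energy curve) are plotted against a shared time axis using matplotlib, so their co-variation reads as a single trajectory in parameter space.

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-ins for two aligned parameter streams.
t = np.linspace(0, 10, 500)                          # shared time axis (s)
pitch = 220 + 40 * np.sin(2 * np.pi * 0.3 * t)       # pitch contour (Hz)
motion = np.abs(np.sin(2 * np.pi * 0.3 * t + 0.5))   # motion energy (a.u.)

# One 3D trajectory: time on one axis, a parameter on each of the
# other two, so gesture/sound co-variation is visible as shape.
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot(t, pitch, motion)
ax.set_xlabel("time (s)")
ax.set_ylabel("pitch (Hz)")
ax.set_zlabel("motion energy")
plt.show()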

Conceptually, DIGLO framed MIR as a four-stage process: observation, organization, extraction, and generation. Rather than treating performance data as static inputs to be classified, the system treated them as evolving traces that could be annotated, clustered, reinterpreted, and recursively fed back into both analysis and creation. Video-based motion tracking and vocal pitch extraction were aligned on a shared temporal axis, allowing gesture and sound to be analyzed as co-dependent processes rather than independent streams.
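
That shared-axis alignment can be sketched concretely. The following is a minimal reconstruction under stated assumptions, not DIGLO's actual code: OpenCV frame differencing stands in for video-based motion tracking, librosa's YIN tracker stands in for vocal pitch extraction, and the file names are placeholders. Motion energy is resampled onto the audio analysis grid so both streams share one time base.

import cv2
import librosa
import numpy as np

# Gesture stream: per-frame motion energy via frame differencing.
cap = cv2.VideoCapture("performance.mp4")   # placeholder file name
fps = cap.get(cv2.CAP_PROP_FPS)
prev, motion = None, []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if prev is not None:
        motion.append(float(np.mean(cv2.absdiff(gray, prev))))
    prev = gray
cap.release()
motion = np.asarray(motion)
t_video = np.arange(len(motion)) / fps

# Sound stream: vocal pitch contour on its own analysis grid.
y, sr = librosa.load("performance.wav")     # placeholder file name
f0 = librosa.yin(y, fmin=80, fmax=800, sr=sr)
t_audio = librosa.times_like(f0, sr=sr)

# Shared temporal axis: resample motion onto the audio grid, so that
# f0 and motion_aligned index the same instants and can be analyzed
# as co-dependent processes rather than independent streams.
motion_aligned = np.interp(t_audio, t_video, motion)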

Technically and epistemologically, DIGLO anticipated several later developments:

  • The integration of qualitative interpretation with quantitative data mining.
  • The use of temporal windows and multivariate correlation as primary analytical tools (sketched in code after this list).
  • The idea of MIR systems as instruments for thought and creation, not merely analytical utilities.
  • A recursive, “genetic” conception of tools that evolve through their own analytical outputs.
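
The windowed-correlation idea in the second bullet can be read concretely as a sliding Pearson correlation over two aligned streams. This is a generic sketch, not DIGLO's own algorithm; windowed_correlation is a hypothetical helper, and its inputs are assumed to be aligned as in the earlier sketch.

import numpy as np

def windowed_correlation(a, b, win):
    """Pearson correlation of two aligned series over a sliding window."""
    out = np.full(len(a), np.nan)
    for i in range(len(a) - win + 1):
        x, y = a[i:i + win], b[i:i + win]
        if np.std(x) > 0 and np.std(y) > 0:   # skip constant windows
            out[i + win // 2] = np.corrcoef(x, y)[0, 1]
    return out

# e.g. r = windowed_correlation(f0, motion_aligned, win=50)
# Values near +/-1 flag spans where gesture and sound strongly co-vary.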

In this sense, DIGLO was less a finished software product than a research prototype articulating a methodological position: that musical gesture cannot be reduced to audio descriptors alone, and that performance analysis requires tools capable of handling heterogeneity, ambiguity, and emergence.

Comparison with contemporaneous software (circa 2008–2009)

High-end motion capture existed but was expensive, laboratory-bound, and typically disconnected from musical analysis frameworks. DIGLO instead explored low-level video analysis as a pragmatic, research-driven approach, embedding gesture analysis directly into musical inquiry.

Early MIR research platforms and video annotators (ANVIL, MARSYAS, MIRtoolbox) focused on batch processing of audio corpora, feature extraction, and classification. DIGLO contrasted sharply by privileging real-time observation, interpretive flexibility, and performative context over large-scale statistical evaluation.

From today’s perspective, DIGLO can be seen as anticipating later trends such as embodied music cognition, multimodal MIR, interactive machine learning, and practice-based research tools. Its emphasis on visualization, temporal alignment, and recursive tool evolution aligns more closely with post-2015 research trajectories than with its immediate contemporaries.

Positioning

Historically, DIGLO occupies an intermediate zone between:

  • MIR as engineering discipline,
  • performance studies,
  • artistic research,
  • and speculative tool-making.

Its conceptual grounding in diglossia and heteroglossia frames musical gesture as a site of tension between multiple languages: body and voice, intuition and formalization, observation and theory. In this sense, DIGLO does not seek a metalanguage of music, but a system capable of sustaining multiplicity while still allowing measurement, comparison, and discourse.

References

@inproceedings{Martinez2009,
  author    = {Martínez, Isabel and Español, Silvia},
  title     = {Image-schemas in parental performance},
  booktitle = {Proceedings of the 7th Triennial Conference of the European Society for the Cognitive Sciences of Music},
  year      = {2009},
  pages     = {297--305},
  note      = {URN:NBN:fi:jyu-2009411279}
}

@inproceedings{Wanderley2001,
  author    = {Marcelo M. Wanderley},
  title     = {Gestural Control of Music},
  booktitle = {International Workshop on Human Supervision and Control in Engineering and Music},
  year      = {2001}
}

@incollection{Camurri2004,
  author    = {Antonio Camurri and Gualtiero Volpe},
  title     = {Gesture-Based Interaction Techniques for Music},
  booktitle = {Trends in Gestural Control of Music},
  year      = {2004},
  publisher = {IRCAM}
}

@book{Leman2008,
  author    = {Marc Leman},
  title     = {Embodied Music Cognition and Mediation Technology},
  year      = {2008},
  publisher = {MIT Press},
  address   = {Cambridge, MA}
}

@article{Downie2003,
  author  = {J. Stephen Downie},
  title   = {Music Information Retrieval},
  journal = {Annual Review of Information Science and Technology},
  year    = {2003},
  volume  = {37},
  pages   = {295--340}
}

@book{Artaud1974,
  author    = {Antonin Artaud},
  title     = {Héliogabale ou l’anarchiste couronné},
  year      = {1974},
  publisher = {Gallimard},
  address   = {Paris}
}