Cross-modal mapping allows information from one sensory domain—like sound—to control or influence another, such as visuals. It’s the foundation of many computational art systems that blur the line between hearing and seeing.
Translating Between Senses
Cross-modal mapping rests on a simple idea: the same data can represent both sound and image, depending on how it is read. A waveform is just a sequence of numbers, and those numbers can equally define color, movement, or the intensity of light.
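To make this concrete, here is a minimal sketch of "reading a waveform differently": the same list of samples is reinterpreted as brightness values. The sample count and the 0–255 brightness range are illustrative assumptions, not part of any particular system.

```python
import math

# A toy "waveform": one cycle of a sine wave, sampled 8 times, values in [-1, 1].
samples = [math.sin(2 * math.pi * i / 8) for i in range(8)]

# Read the same numbers as visuals: rescale each sample from [-1, 1]
# to a brightness value in [0, 255].
brightness = [round((s + 1) / 2 * 255) for s in samples]

print(brightness)
```

The numbers never change meaning on their own; only the interpretation (audio pressure vs. light intensity) changes.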
Visual artists and developers use this method to synchronize images with audio or to generate sound from visual patterns. For example, a low bass frequency might control how bright a light is, while high (treble) frequencies drive the movement of particles. By routing different kinds of sensory input through the same numeric format, cross-modal mapping converts one type of energy into another type of expression.
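A sketch of the bass-to-brightness and treble-to-movement idea, assuming a naive DFT over a tiny sample buffer and made-up scaling constants; a real system would use an FFT library and tuned band boundaries.

```python
import math

def dft_magnitudes(samples):
    """Naive discrete Fourier transform magnitudes (fine for tiny buffers)."""
    n = len(samples)
    mags = []
    for k in range(n // 2):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = -sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        mags.append(math.hypot(re, im))
    return mags

# Toy buffer: a strong low-frequency tone plus a weaker high-frequency one.
n = 64
samples = [math.sin(2 * math.pi * 2 * i / n) + 0.3 * math.sin(2 * math.pi * 20 * i / n)
           for i in range(n)]

mags = dft_magnitudes(samples)
bass = sum(mags[:8])     # energy in the low bins
treble = sum(mags[8:])   # energy in the high bins

# Illustrative mapping: bass drives brightness, treble drives particle speed.
brightness = min(1.0, bass / 40.0)
particle_speed = min(1.0, treble / 40.0)
print(brightness, particle_speed)
```

Because the test signal is bass-heavy, the brightness parameter ends up larger than the particle speed, which is exactly the kind of correlated behavior a viewer would perceive.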
Cross-modal mapping is an essential element of live performance, generative design and experimental media. In these applications, perception is both interactive and multi-dimensional.
Common Grounds: Data
Cross-modal mapping typically means taking data from one medium (like audio) and using it in another (like video). A simple example: take an audio amplitude, scale it to the range 0 to 1, and use the result as the X and Y position of a graphic.
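That amplitude-to-position step can be sketched in a few lines. The window size and the 16-bit maximum amplitude are assumed values for illustration.

```python
# Map an audio amplitude onto a graphic's position.
WIDTH, HEIGHT = 800, 600  # assumed window size

def amplitude_to_position(amplitude, max_amplitude=32768):
    """Scale a raw (e.g. 16-bit) amplitude into [0, 1], then into screen coordinates."""
    t = max(0.0, min(1.0, abs(amplitude) / max_amplitude))
    return t * WIDTH, t * HEIGHT

x, y = amplitude_to_position(16384)
print(x, y)  # half amplitude lands halfway across and down the window
```

Clamping into [0, 1] before scaling keeps the graphic on screen even when the input overshoots its nominal range.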
Another example is the frequency of the audio driving the color hue you see, or the density of a texture you feel. What matters is that the brain perceives a relationship, and therefore coherence, between the two mediums based on their correlated movements. The mapping does not have to be direct; it can be metaphorical. A slow song might produce warmer colors, while an erratic drum beat produces chaotic motion.
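The frequency-to-hue idea can be sketched with the standard library's colorsys module. The 20–2000 Hz range and the decision to stop the hue short of wrapping back to red are illustrative choices, not fixed conventions.

```python
import colorsys

def frequency_to_hue(freq_hz, low=20.0, high=2000.0):
    """Map a frequency onto the hue wheel: low frequencies red, high ones violet."""
    t = max(0.0, min(1.0, (freq_hz - low) / (high - low)))
    return t * 0.8  # stop short of 1.0 so the scale doesn't wrap back to red

hue = frequency_to_hue(440.0)  # concert A
r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)  # full saturation and value
print(hue, (r, g, b))
```

The same clamp-then-scale pattern appears in nearly every mapping: normalize the input, then stretch it over the output parameter's useful range.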
Machine learning takes this concept further by letting systems learn the relationships between sensory patterns, so they no longer have to rely on predetermined formulas.
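In the simplest possible terms, "learning" a mapping means fitting it to observed pairs instead of writing the formula by hand. Here is a deliberately tiny sketch using ordinary least squares on made-up (loudness, brightness) data; real systems would learn far richer, nonlinear relationships.

```python
# Toy observations: (loudness, brightness) pairs, both normalized to [0, 1].
pairs = [(0.1, 0.15), (0.4, 0.45), (0.6, 0.62), (0.9, 0.88)]

# Ordinary least squares fit of brightness = slope * loudness + intercept.
n = len(pairs)
mean_x = sum(x for x, _ in pairs) / n
mean_y = sum(y for _, y in pairs) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in pairs)
         / sum((x - mean_x) ** 2 for x, _ in pairs))
intercept = mean_y - slope * mean_x

def learned_brightness(loudness):
    """The learned mapping, in place of a hand-written formula."""
    return slope * loudness + intercept

print(learned_brightness(0.5))
```

The point is not the statistics but the shift in authorship: the relationship between sound and image comes from the data rather than from the artist's predetermined rule.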
Reflection
Cross-modal mapping turns data into a bridge between senses. It invites viewers to hear movement and see rhythm, merging logic and intuition. By connecting what we see with what we hear, computational systems remind us that creativity often begins not in one medium, but in the dialogue between them.
