
Lumo Spiral Visualization Of Audio For The Deaf

...and for all other true lovers of music and sounds...

In a previous text, I proposed a visualization of sounds that would allow deaf people to listen, including distinguishing vowels, accents, different people's voices, consonants, other noises, music, frequencies, octaves, melodies, different musical instruments, chords and several people talking or singing simultaneously, and everything else.



The idea is that the brain connected to the eye just gets trained to evaluate data very similar to the data that is coming from the ear. Ideally, the eye (plus the part of the brain connected to it) should work "almost the same" as the ear.

Someone has mentioned the WinAmp visualizations (or MilkDrop) as an example. Yes, that's roughly what I mean, except that this kind of video and most others don't allow me to "hear" anything. They look like pretty pictures that are just slightly affected by the sound that is being played, but there's no straightforward way to extract the precise sound from the picture.




I have some trouble getting the spectrum from a trimmed sound file in Mathematica: some sound commands don't seem to be in my version of Mathematica at all. So let me sketch what I have in mind almost exactly so that a better programmer may write the full code.




You take the latest 0.1 seconds of your music and do the Fourier decomposition (the image should be updated 60 times a second, for every refreshed screen). All the Fourier components have frequencies that are integer multiples of 10 Hz. You may want to make the spectrum continuous by Fourier transforming the whole sound for \(t\lt 0\), but with the power multiplied by \(\exp(t/t_0)\) for \(t\lt 0\) where \(t_0\) is 1/10 of a second, OK?
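The exponentially weighted variant can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the function name `weighted_power` and its parameters are hypothetical, the sum is a direct (slow) evaluation at one frequency rather than an FFT, and the weight \(\exp(t/t_0)\) on the power is interpreted as a weight \(\exp(t/2t_0)\) on the amplitude.

```python
import cmath
import math

def weighted_power(samples, sample_rate, freq, t0=0.1):
    """Power at `freq` (Hz) with older samples damped exponentially.

    samples[-1] is the newest sample (t = 0); earlier samples have t < 0.
    Assumption: damping the power by exp(t/t0) is implemented as damping
    the amplitude by exp(t/(2*t0)).
    """
    if not samples:
        return 0.0
    n = len(samples)
    total = 0j
    norm = 0.0
    for k, x in enumerate(samples):
        t = (k - (n - 1)) / sample_rate          # time in seconds, t <= 0
        w = math.exp(t / (2.0 * t0))             # amplitude weight
        total += w * x * cmath.exp(-2j * math.pi * freq * t)
        norm += w
    # Normalizing by the summed weights keeps the result independent
    # of the sample rate and of the window length.
    return abs(total / norm) ** 2

# One second of a pure 440 Hz sine wave, sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * k / sample_rate) for k in range(sample_rate)]
p_on = weighted_power(tone, sample_rate, 440.0)    # at the tone's frequency
p_off = weighted_power(tone, sample_rate, 660.0)   # far away from it
```

Because of the smooth weighting, `freq` no longer has to be a multiple of 10 Hz, and the power at the tone's own frequency comes out far larger than at an unrelated frequency. A real implementation would probably compute many frequencies at once instead of looping like this.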

Now, you need to visualize the powers \(P(f)\) as a color in the polar coordinates \((r,\phi)\) in the plane. For a given frequency \(f\), the coordinates are given by\[

\eq{
\phi &= 2\pi \cdot \frac{\log (f/10\,{\rm Hz})}{\log 2},\\
r &= 10-\frac{\phi}{2\pi} \pm 0.5
}

\] The term \(\pm 0.5\) means that the power of a given frequency is visualized along a whole line interval between two points of a spiral. The spiral has \(r\) linear in \(\phi\).
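The mapping from frequency to the spiral, and back, can be written out as a small sketch. The function names and the constants `F0` and `R_OUTER` are hypothetical labels for the values 10 Hz and 10 used in the formulas; nothing here fixes the orientation of the picture on the screen.

```python
import math

F0 = 10.0        # Hz, the frequency that sits at phi = 0
R_OUTER = 10.0   # radius assigned to F0

def spiral_coords(freq):
    """Return (r, phi) of the center of the band for a frequency in Hz."""
    phi = 2.0 * math.pi * math.log2(freq / F0)
    r = R_OUTER - phi / (2.0 * math.pi)
    return r, phi

def radial_band(freq):
    """The radial interval r - 0.5 .. r + 0.5 that the frequency fills."""
    r, _ = spiral_coords(freq)
    return r - 0.5, r + 0.5

def freq_from_phi(phi):
    """Inverse map: the frequency drawn at the unwrapped angle phi."""
    return F0 * 2.0 ** (phi / (2.0 * math.pi))

r_low, phi_low = spiral_coords(10.0)      # outermost point of the spiral
r_oct, phi_oct = spiral_coords(20.0)      # one octave higher
r_top, phi_top = spiral_coords(10240.0)   # ten octaves higher
```

With these definitions, 10 Hz lands at \(r=10,\ \phi=0\), each doubling of the frequency adds \(2\pi\) to \(\phi\) and subtracts 1 from \(r\), and \(10\times 2^{10}\,{\rm Hz}\) lands at the origin.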

If you add some component of the sound with a doubled frequency \(2f\), it will be visualized in the adjacent arc of the spiral, i.e. for \(\phi\to \phi+2\pi\) and \(r\to r-1\).

The term \(10\) in the radial coordinate says that the very low frequencies around \(f\approx 10\,{\rm Hz}\) that correspond to \(\phi=0\) will be seen as points with \(r\approx 10\), very far from the origin of the coordinates. On the contrary, the highest audible frequencies will be about \(10\times 2^{10}\,{\rm Hz}\) or 10 kilohertz, and they will be represented by the features of the image near the origin.

The spiral will be wound around about 10 times, which corresponds to the 10 octaves (a factor of 1024 in frequencies) that this "ear" will be able to "hear". I find it natural that the high-pitch sounds are going to be drawn near the point at the origin.

The visualization along the spirals where \(\phi\) is linear in the logarithm of the frequency guarantees that if you take any sound and just uniformly increase the frequency by a factor, i.e. you raise the tone or the pitch, the corresponding picture will be just rotated relative to the original one. Helpfully enough, one octave is composed of 12 half-tones, so one half-tone will correspond to the same angle as one hour on the clocks!

The coefficient relating \(\phi\) and \(\log f\) is chosen in such a way that one revolution corresponds to one octave. There are 10 octaves visible in the spiral, so about 10 layers of the spiral will make it into the picture.

To make the sound even more readable, you should visualize the power \(P(f)\) as the intensity of the red color at the point \((r,\phi)\), and the powers \(P(3f/4)\) and \(P(5f/4)\) should be drawn as the intensity of green and blue at the same point, respectively. Maybe the red and green should be reserved for the ratios \(P(3f/4)/P(f)\) and \(P(5f/4)/P(f)\), I am not sure. An addition like that could make "nice chords" (like C+E, C+G) immediately distinguishable from the "ugly chords" (C+H).
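The first coloring rule, with the three powers used directly as red, green and blue, can be illustrated with a toy spectrum. Both `toy_power` and `rgb_at` are hypothetical names, and the toy spectrum (power 1 near a listed pure tone, 0 elsewhere) is only a stand-in for a measured \(P(f)\).

```python
def toy_power(freq, tones, width=0.01):
    """Hypothetical stand-in for P(f): 1.0 close to a listed tone, else 0.0."""
    return 1.0 if any(abs(freq / t - 1.0) < width for t in tones) else 0.0

def rgb_at(freq, power):
    """Color at the spiral point of `freq`: (P(f), P(3f/4), P(5f/4))."""
    return (power(freq), power(0.75 * freq), power(1.25 * freq))

# A single 400 Hz tone: only the red channel lights up at its own point.
single = rgb_at(400.0, lambda f: toy_power(f, [400.0]))

# Tones at 300, 400 and 500 Hz (frequency ratios 3:4:5): the point of the
# 400 Hz tone receives red, green and blue together.
triad = rgb_at(400.0, lambda f: toy_power(f, [300.0, 400.0, 500.0]))
```

So a tone accompanied by components at 3/4 and 5/4 of its frequency shows up as a white point, while a lone tone is pure red. The ratio-based variant mentioned above would divide the second and third entries by `power(freq)`, which needs a guard against division by zero.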

But the main point is that if you have a uniform sound that lasts, like some tone or a vowel, there should be a corresponding image that looks like a particular, not rotationally invariant object, with a special fingerprint of color, that sits on a spiral clock. When you just increase the frequency of the sound, the object should get rotated and nothing else.

If the details are adjusted correctly, the vowels U,O,A,E,I (I mean OO, AW, AH, EH, EE) should be visualized as unforgettable objects that you may learn to recognize just like you recognize the vowels. If the sound just changes the accent, these objects slightly change, too. You may train yourself to recognize these small deformations, too. Similarly, the consonants would correspond to some noisy pictures. The consonants probably contain mostly higher frequencies, so they would be objects sitting close to the origin. And they don't have too well-defined "main" frequencies, so their character would be more rotationally symmetric and boring.

You know, the goal should be that from the characteristic pictures that you generate for given sounds, the deaf person will be able to hear the melody, the chords, which instruments play them, and whether the singer has good pitch (whether the clock hands are sitting exactly at the whole hours). And when two people sing together, the visualization above should still respect some superposition principle, so the deaf person should see-hear overlapping pictures on top of each other, which might still be reconstructed. A deaf person who is not tone-deaf will learn to appreciate complicated chords from the pictures, too.

Maybe you want to draw more data extracted from the power spectrum, in between the arcs of the spiral and elsewhere. But I think that you should respect 1) the superposition principle (sounds that combine to be heard together should be translated to overlapping images), 2) the fact that a uniform increase of the frequency corresponds to a simple rotation/scaling (well, deformed scaling because I decided to make the spiral linear, not exponential, after all; you may want to reparameterize the \(r\) coordinate by some nonlinear transformation), 3) the requirement that the full power spectrum \(P(f)\) should be possible to decode from the picture at each moment.

If someone has understood what I am saying and can write the program, it would be nice if he could create a video file with the visualization of some speech, song, or a symphony. Needless to say, all the powers and colors should be adjusted so that the picture is neither too dim nor surpasses the maximum possible intensity of the color etc. It must be chromatically balanced to transmit maximum information.
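One possible way to keep the picture neither too dim nor saturated is a logarithmic (decibel) scaling with clipping. This is an assumption, not something the description above prescribes: the function name `to_intensity`, the reference power `p_ref` and the 60 dB range are all hypothetical choices.

```python
import math

def to_intensity(power, p_ref, floor_db=-60.0):
    """Map a power to a color intensity in [0, 1].

    Assumption: p_ref (the loudest power expected) maps to 1.0, anything
    floor_db decibels below it or weaker maps to 0.0, and the scale is
    linear in decibels in between.
    """
    if power <= 0.0:
        return 0.0
    db = 10.0 * math.log10(power / p_ref)
    # Clip so that no channel leaves the displayable range.
    return min(1.0, max(0.0, 1.0 - db / floor_db))

full = to_intensity(1.0, 1.0)      # reference power
mid = to_intensity(1e-3, 1.0)      # 30 dB below the reference
silent = to_intensity(0.0, 1.0)    # no power at all
clipped = to_intensity(10.0, 1.0)  # louder than the reference
```

The same function would be applied to each of the three color channels. Choosing `p_ref` from a running maximum of the spectrum is one way to adapt to quiet and loud passages, at the cost of no longer showing absolute loudness.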

I believe that the ear effectively works as some touch-sensitive organ that perceives shapes pretty much equivalent to those you draw with the computer program sketched above, but perceives them by "touch", not by "vision". So the ear basically works like the skin of the fingers that reads the visualization you are going to code as if it were Braille (the writing system for blind people, to make it more confusing LOL). But the relative representation of different frequencies must be perceived by the ear as some "characteristic fingerprint" that touches different places of the touch sensors in the ear, which analyze the frequencies.

You can make the whole system stereo: form images for the left eye and the right eye that are calculated from the left speaker and the right speaker, respectively. At least, you may draw both of them next to each other for starters.

Note that deaf people with their head and display kept straight would have perfect pitch. To make it really nice, concert pitch (the frequency 440 Hz, the A above middle C) should be rotated so that it points towards "12" on the clocks. ;-)

Just a consistency check that you understood the basic idea: if you play C, C#, D, D#, E, F, F#, G, G#, A, A#, H, C on the piano (in equal temperament where the frequency ratios are powers of the twelfth root of two), your visualization should just show the short hour hand jumping from 12:00 to 1:00, 2:00, ... and back to 12:00. Music should be pretty and logical! Chords would be like several short hour hands added on a clock: the chord C+E+G should look like three hands on a clock pointing to 12, 4, 7. The shape of these short hour hands would stand for the musical instrument (the relative representation of higher harmonics), and so on.
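This consistency check is easy to run numerically. The sketch below assumes that the picture has been rotated so that the reference note C points to 12, and it derives that C from the 440 Hz concert pitch (nine half-tones below the A). The helper `clock_hour` is a hypothetical name, and it rounds to the nearest half-tone, so it hides small errors in pitch.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "H"]

def clock_hour(freq, ref_freq):
    """Hour (1..12) the hand points to, with ref_freq placed at 12."""
    semitones = 12.0 * math.log2(freq / ref_freq)
    hour = round(semitones) % 12
    return 12 if hour == 0 else hour

# Equal temperament: the C nine half-tones below the 440 Hz concert pitch.
c_ref = 440.0 * 2.0 ** (-9.0 / 12.0)

# C, C#, D, ..., H and the C one octave higher.
scale = [c_ref * 2.0 ** (k / 12.0) for k in range(13)]
hours = [clock_hour(f, c_ref) for f in scale]

# The chord C+E+G as three hands on the clock.
chord = [clock_hour(scale[NOTE_NAMES.index(name)], c_ref) for name in ("C", "E", "G")]
```

Here `hours` runs through 12, 1, 2, ..., 11 and returns to 12 for the upper C, and `chord` contains 12, 4 and 7.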

I hope to make more progress with the Mathematica deconstruction of the sound files in the coming days or weeks if no one creates it before me.
