Sound and music are quite different from pictures and movies. Whenever we listen to a song, we cannot take it in as a whole. A picture, on the other hand, can be looked at in its entirety in an instant, and even a movie lets us view the images of all its frames to get this whole view. We can keep looking at a picture, or at the frames of a movie, for as long as we like. This ability lets painters, cinematographers and photographers analyse their theme without haste.
With sound this is utterly impossible, since uncertainty spoils the soup: one can listen to only one instant at a time, and while listening to that instant the previous and the next cannot be heard. So one must listen through the entire length of the piece to get the whole picture.
Is there any means to capture the entire length of a song and display it instantaneously?
Well, I have an idea. Let us start the analysis with Carnatic music. It has seven stable pitches (frequencies, called swaras), and all possible tunes (ragas) are composed from at most these seven. Similarly, there are seven colours in the spectrum (the rainbow, VIBGYOR). What if we represent each of the frequencies with one of the colours of the spectrum?
I'll elaborate. The seven frequencies (pitches, or swaras) are Sa, Re, Ga, Ma, Pa, Tha and Ne, with SA being Sa repeated an octave higher. Hence we can easily map each of the seven colours Red, Orange, Yellow, Green, Blue, Indigo and Violet to one of the seven swaras. Look closely and you can see a similarity: as the pitches ascend from Sa to SA, the frequency increases, and likewise from Red to Violet the frequency of light increases, which makes the computation straightforward.
Now for the amplitude, or loudness, of the sound we can use the intensity value: the greater the amplitude, the greater the intensity, that is, the brightness of the colour.
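A minimal sketch of this mapping in Python; the particular RGB values, the swara spellings and the 0-to-1 amplitude scale are illustrative assumptions rather than any fixed standard:

```python
# Illustrative mapping from (swara, amplitude) to an RGB pixel.
# The colour values and the 0..1 amplitude range are assumptions.
SWARA_COLOURS = {
    "Sa": (255, 0, 0),      # red
    "Re": (255, 165, 0),    # orange
    "Ga": (255, 255, 0),    # yellow
    "Ma": (0, 200, 0),      # green
    "Pa": (0, 0, 255),      # blue
    "Tha": (75, 0, 130),    # indigo
    "Ne": (143, 0, 255),    # violet
}

def pixel_for(swara, amplitude):
    """Hue comes from the swara, brightness from the amplitude."""
    r, g, b = SWARA_COLOURS[swara]
    a = max(0.0, min(1.0, amplitude))   # clamp loudness to 0..1
    return (round(r * a), round(g * a), round(b * a))
```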
That is it; the prototype is complete. Let us apply this logic. The vocalist sings, varying frequency and amplitude as the song proceeds, and we encode the song as pixels of colour. The sampling rate can be adjusted as needed: one pixel per millisecond, or whatever the user wishes.
Once the song is complete, the picture would truly be a piece of modern art (sarcasm intended). As the vocalist stretches a word, the colour in the picture stretches too. As the artist breaks a line with a pause, the current row of the picture ends and a new row begins. This continues for the entire song, creating a wonderful picture called the "colour of sound".
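A rough sketch of the encoder loop that builds this picture, reusing the pixel_for helper above; the (swara, amplitude) sample format and the use of None as a pause marker are assumptions made for illustration:

```python
def encode_song(samples):
    """samples: a list of (swara, amplitude) tuples, or None for a pause.

    Returns the picture as a list of rows of RGB pixels.
    A pause ends the current row and starts a new one.
    """
    picture, row = [], []
    for sample in samples:
        if sample is None:               # pause: close the current row
            if row:
                picture.append(row)
                row = []
            continue
        swara, amplitude = sample
        row.append(pixel_for(swara, amplitude))
    if row:
        picture.append(row)
    return picture

# A tiny phrase with one pause in the middle.
print(encode_song([("Sa", 0.8), ("Re", 0.9), None, ("Ga", 0.5)]))
```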
But there is a shortcoming: this logic applies only to single-track music, that is, it can encode only the sound coming from the vocalist. For pop music, which contains numerous instruments (both real and synthetic), the logic fails badly. Wait, there is a solution: we can encode each track (instrument) separately and later superpose all the images into one without losing the layers, creating the familiar 'image with layers', or layered imagery.
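A sketch of that layered variant, with the same assumed input format; each track is encoded on its own and kept as a named layer rather than being flattened into one image:

```python
def encode_layers(tracks):
    """tracks: a dict mapping a track name (e.g. "vocals", "veena")
    to its list of (swara, amplitude) samples.
    Each track becomes its own layer; nothing is flattened or lost.
    """
    return {name: encode_song(samples) for name, samples in tracks.items()}
```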
Then, for the lyrics, we can fix a letter to each pixel. Quite obviously, when we sample the sound at one pixel per millisecond, the words come out letter by letter. That's it: your favourite music track now looks like a picture.
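To carry the lyrics, each sample could simply include the letter being sung at that instant; the three-field sample format below is my own assumption, a small variation on the earlier sketch:

```python
def encode_with_lyrics(samples):
    """samples: (swara, amplitude, letter) tuples, or None for a pause.
    Each pixel is stored together with the letter being sung at that
    instant, so the words come out letter by letter.
    """
    picture, row = [], []
    for sample in samples:
        if sample is None:
            if row:
                picture.append(row)
                row = []
            continue
        swara, amplitude, letter = sample
        row.append((pixel_for(swara, amplitude), letter))
    if row:
        picture.append(row)
    return picture
```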
Applications
Any invention is useless unless it has substantial applications. Let me unroll my list; I am sure you will be left awestruck.
1. A radically new compression logic. Consider a music track with five instruments besides the vocalist that runs for 5 minutes. Let us sample the music at 10 pixels per second. That gives 3,000 pixels per track, and six times that across the layers, i.e. 18,000 pixels. At 8 bits per pixel the picture requires only about 17.6 kilobytes; far, far better than MP3, right? (A quick back-of-the-envelope check is sketched just after this list.)
2. Secrecy. With encryption techniques having grown sky-high, imagine how this piece of picture could be played with, creating utterly esoteric information that smoothly unfolds into a music track (or a clandestine dialogue) for the informed, yet remains just some sort of chaotic picture for the uninformed.
3. Finally, the best application of all: we could let the deaf merrily step into the beautiful world of sounds and music just by looking at the 'colour of sound' picture. Yes, if suitably trained to interpret the picture as sound, a deaf person could really LISTEN to music while gazing into the picture, engrossed.
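Here is the back-of-the-envelope check of the size estimate in application 1, using exactly the numbers quoted there:

```python
# 5 instruments + 1 vocalist, 5 minutes, 10 pixels/second, 8 bits/pixel.
tracks = 6
duration_s = 5 * 60                       # 300 seconds
pixels_per_track = duration_s * 10        # 3,000 pixels
total_pixels = tracks * pixels_per_track  # 18,000 pixels
size_bytes = total_pixels * 8 // 8        # 8 bits = 1 byte per pixel
print(round(size_bytes / 1024, 2), "KB")  # -> 17.58 KB
```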
Sound is indeed colourful!
8 comments:
Hi Suraj,
Till yesterday I had never heard about it, but today I am amazed. The wiki link is great, and reading it was even better. I couldn't believe that such a mental anomaly could exist and that so much research is going on; but sincerely, I am not suffering from it.
The article is genuinely my own imagination.
Bye
And you imagined all this? Cool, Atma. Had you not flagged it as sarcastic (a completed song as modern art), I wouldn't have understood. Keep going.
Machi, write up something, da.
Have you moved ahead beyond this?
Would the compression logic work? I don't think the picture could be decoded to recreate the sound from which it was originally made.
There are many anomalies in it.
Hi, I think you were quite surprised to see two posts in my otherwise dormant blog, hehe. I didn't get that low-pass filter thing; do you mean I omitted some stuff?
Yes, I believe you had suppressed a few very interesting events.
Quite a delayed comment, yet...
The theory and logic are fine. Supposing there are six different layers (5 + 1) as you say, where does the difference lie? The basic sound of each instrument is different, so at the decoding side how are we going to reconstruct them? In short, at the decoding side, how can we make a guitar sound like a guitar?
A solution might be that we need software to do this: first, while encoding, we need an attribute that carries a code for each instrument; at the decoding side the software maps the code to its instrument, and the layer is decoded correspondingly, so that the music can be reproduced with (acceptable) accuracy.
Maybe you had this in your mind while writing?
It took a while for me to get the essence of this post, hence the delay. Looking forward to more discussions with you on this topic.
Hello Sir,
That was really a novelty. But I would like to know whether you have made it a reality in some way or another. If so, could you note a few points about this same idea in reality? How did it work out?