Blog Post

Sampling Sound From Pictures

A great video from the excellent Computerphile channel came up in my YouTube feed today and caught my eye. It concerned turning pictures of sound waves back into audio files, and was entitled How NOT to Sample Audio!

The basic method used was as follows:

  • Get a screen grab of a sound file waveform (in the time domain)
  • Loop through the columns of the BMP picture file to find and extract the approximation of the waveform
  • Brightness is used to detect the difference between the background and the sound waveform
  • A loop is used to pick out column max and min heights
  • Store these values as the sound (basically a series of values)
  • To compensate for the low resolution, a stretch is required, because the image has fewer columns than an audio file would have samples
  • Values are added between samples to enable the stretch
  • Add the WAV file header information to the series of numbers you have created
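To make the steps above more concrete, here is a rough Python sketch of how I imagine they could fit together. This is my own guess, not the code from the video. The picture is assumed to be already decoded into a list of rows of brightness values (0 to 255), and the function names, the threshold of 128, the choice of two samples per column, the integer linear interpolation used for the stretch and the 8000 Hz sample rate are all my assumptions. The standard library wave module writes the WAV header.

```python
import io
import wave


def column_extents(pixels, threshold=128):
    """For each column, return (top_row, bottom_row) of the pixels brighter
    than the threshold, or None if the column is all background."""
    height, width = len(pixels), len(pixels[0])
    extents = []
    for x in range(width):
        rows = [y for y in range(height) if pixels[y][x] > threshold]
        extents.append((min(rows), max(rows)) if rows else None)
    return extents


def picture_to_samples(pixels, threshold=128):
    """Turn each column into two 8-bit samples: its highest and lowest point.
    Row 0 is the top of the picture, so it maps to the largest value.
    Assumes the picture has at least two rows."""
    height = len(pixels)
    samples = []
    for extent in column_extents(pixels, threshold):
        if extent is None:
            samples += [128, 128]  # assumed: empty column means silence
        else:
            for y in extent:
                samples.append((height - 1 - y) * 255 // (height - 1))
    return samples


def stretch(samples, factor):
    """Add values between neighbouring samples (integer linear interpolation)."""
    out = []
    for a, b in zip(samples, samples[1:]):
        for i in range(factor):
            out.append(a + (b - a) * i // factor)
    out.append(samples[-1])
    return out


def to_wav_bytes(samples, sample_rate=8000):
    """Wrap the values in an 8-bit mono WAV container; wave adds the header."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(1)  # one byte per sample, i.e. 8-bit unsigned audio
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(samples))
    return buf.getvalue()


# A tiny made-up 5x3 "picture": bright pixels are the waveform.
pixels = [
    [0, 0, 0],
    [0, 255, 0],
    [255, 255, 0],
    [0, 255, 0],
    [0, 0, 0],
]
samples = picture_to_samples(pixels)
wav_data = to_wav_bytes(stretch(samples, 3))
```

With a real screen grab you would first need to read the BMP into that list of rows, which I have left out here.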

In the example in the film, an 8-bit sound was generated in a 35k file (ASCII). Clearly the accuracy of the picture-to-WAV conversion is dependent on the number of screen pixels used.

The result reminded me of the first voice synthesis I heard from the Commodore 64 game, Ghostbusters! The magic of hearing “you slimed me” is etched in my mind.

Reading the comments on the video, I also noticed someone had mentioned a fascinating project called the Visual Microphone. A quick search of the internet revealed the following paper and website: The Visual Microphone: Passive Recovery of Sound from Video

http://people.csail.mit.edu/mrub/VisualMic/

That looks like the next rabbit hole to dive down…