Why do some songs sound like garbage when you speed them up (and others don't)?
Ever listen to podcasts at 2x speed and everything sounds fine until a song comes on?
Like many people, I listen to podcasts sped up, anywhere from 1.5x to 2x speed, and it's great. I probably wouldn't listen to podcasts at all if I couldn't speed them up. These days, people's voices don't sound like chipmunks when you speed up digital audio, because instead of merely playing the audio faster, the sample rate is reduced and the remaining audio is stitched together. Think of it like pulling out tiny slices of audio and then pressing the remaining slices together. This is called the "sampling method," and it's used for all types of media content that have a duration (think podcasts and videos). Nice coverage of the topic from WaPo here.
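To make the "pull out slices and press the rest together" picture concrete, here is a toy sketch in Python. It is only an illustration of the idea as described above, not the code any real player uses; the function name `drop_slices` and its parameters are made up for this example, and the "audio" is just a list of numbers standing in for samples.

```python
def drop_slices(samples, slice_len, keep_every=2):
    """Keep one slice out of every `keep_every` slices and join what is left.

    Hypothetical helper for illustration only: `samples` is a plain list
    standing in for audio samples, `slice_len` is the slice size in samples.
    """
    kept = []
    for index, start in enumerate(range(0, len(samples), slice_len)):
        # Keep slices 0, keep_every, 2 * keep_every, ... and skip the others.
        if index % keep_every == 0:
            kept.extend(samples[start:start + slice_len])
    return kept


audio = list(range(16))                    # 16 fake samples: 0..15
faster = drop_slices(audio, slice_len=4)   # keep every other 4-sample slice

print(faster)                    # [0, 1, 2, 3, 8, 9, 10, 11]
print(len(faster) / len(audio))  # 0.5, so playback takes half as long
```

Each kept slice still plays at its original speed, which is why the pitch stays put; only the total duration shrinks.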
But sometimes, it doesn't work so well. If you consume a lot of sped-up media, you may notice that music often sounds terrible, while speech almost always sounds natural. Why?
To begin with, of course music is going to sound somewhat worse: lower sample rates entail lower fidelity. It's a general guideline to record audio of all types at 44,100 Hz, which means the audio is sampled 44,100 times every second, or about 22 times every 0.0005 seconds. Which seems really high. So even if you speed a 44,100 Hz video up 2x, you still get 22,050 samples per second. It's still going to be recognizable with that much data, but it's no longer going to be as high-def.
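If you want to check the arithmetic yourself, here is a quick sketch. It assumes the simple model used in this post, where 2x speed keeps half of the original samples; the names `SAMPLE_RATE_HZ` and `samples_in` are just for illustration.

```python
SAMPLE_RATE_HZ = 44_100  # the common recording rate mentioned above


def samples_in(duration_s, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of samples taken during `duration_s` seconds."""
    return sample_rate_hz * duration_s


per_second = samples_in(1)        # samples in one second
per_half_ms = samples_in(0.0005)  # samples in 0.0005 seconds
kept_at_2x = SAMPLE_RATE_HZ // 2  # assumption: 2x speed keeps half the samples

print(per_second)            # 44100
print(f"{per_half_ms:.2f}")  # 22.05
print(kept_at_2x)            # 22050
```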
Initial research led me to think that maybe songs with higher pitches get hit harder, because you actually lose the ability to accurately represent high notes with lower sample rates (the highest frequency a sample rate can capture is half the rate itself).
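As a rough sanity check on that theory, here is a small sketch of the "half the sample rate" ceiling (usually called the Nyquist limit). The function name `nyquist_limit_hz` is made up for this example, and the 4186.01 Hz figure is the approximate fundamental of the top key on a standard piano, used here only as a stand-in for "a very high note."

```python
def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


C8_HZ = 4186.01  # approximate fundamental of the highest piano key

for rate in (44_100, 22_050):
    limit = nyquist_limit_hz(rate)
    # True means the note's fundamental still fits under the ceiling.
    print(rate, limit, C8_HZ < limit)

# 44100 22050.0 True
# 22050 11025.0 True
```

So under these assumptions, even a halved sample rate leaves room for the fundamental of a very high note. What it would lose is the overtones above the new ceiling, which affects how bright the sound is more than which note you hear.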
But after digging in a bit, I realized that pitch isn't predictive of how badly a song gets screwed up by the sampling method. Some styles lose their sense of rhythm, and others get a "buzzing" effect, sort of like the music is imitating bees. So I decided to try listening to a bunch of styles of music, on YouTube, sped up, to figure out what characteristics cause a song to degrade (so you don't have to).
First, I wanted to make sure that YouTube's speed controller is in fact using the sampling method (described above). I checked out the code, and it uses Chromium's API to change the speed. And sure enough, Chromium changes the sample rate of the audio, so the sampling method is confirmed.
I found that these characteristics ruin a sped-up song: vibrato, and tempo defined by similar sounds...
If you have other theories or songs that don't fit these patterns and still sound like garbage when you speed them up, let me know. Also, I found that vibrato gets ruined when it's sped up with the sampling technique, but can anyone explain why? Since the distance between two notes used in vibrato is essentially infinite, maybe we lose too many samples of the note changing, causing the 'buzzing' effect... something along those lines? Tweet me @jonsjournals.