What sample rate and bitrate should I use for music distribution?
As a musician, you want your listeners to have the best possible experience with your audio. While many platforms make listening to your music simple and efficient, there's work you should be doing behind the scenes: understanding the various audio formats and when to use them. This post will help you understand a few of the most common audio formats: MP3, WAV, OGG, FLAC, AIFF, and more.
Before jumping head first into this topic, we should define what high-quality audio is, and to do that we need to discuss bit depth and sample rate. According to Elyse Betters on Pocket-lint, "Hi-res audio is: 'Lossless audio that is capable of reproducing the full range of sound from recordings that have been mastered from better than CD quality music sources.'"
Sound waves are like ocean waves: they have a crest (high point) and a trough (low point). Much like frames in film, digital recording captures sound as a rapid series of snapshots. Two important terms describe these digital snapshots: bit depth and sample rate. (Bit depth is often loosely called "bitrate," but strictly speaking, bitrate describes data per second, e.g. 320kbps, while bit depth describes the resolution of each individual sample.) The bit depth determines how many discrete levels each captured sample can be placed at. 8-bit, 16-bit, 24-bit, and 32-bit are the most common bit depths, with 16- and 24-bit being the industry standard in most cases. The sample rate is how many snapshots of the incoming audio are taken per second. Common rates are 44.1kHz, 48kHz, 96kHz, and 192kHz, with 96kHz and 192kHz being the highest quality and 44.1kHz and 48kHz being the most common. At 44.1kHz you capture 44,100 digital audio snapshots per second; at 48kHz you capture 48,000 per second, and so on. Note: Recording at higher sample rates takes more CPU and disk space, but you get higher quality (discussed later in this post).
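The arithmetic behind those snapshots is simple enough to sketch. Here is a minimal Python example (the function name is my own, not from any audio library) showing how sample rate and bit depth translate into raw, uncompressed data per second:

```python
# Sketch: how sample rate and bit depth determine the raw data rate of
# uncompressed (PCM) audio. Figures match the examples in the text; the
# function name is illustrative, not from any particular library.

def pcm_bytes_per_second(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Bytes of raw PCM audio produced per second of recording."""
    return sample_rate_hz * (bit_depth // 8) * channels

# CD quality: 44,100 samples/s, 16 bits (2 bytes) per sample, stereo.
cd = pcm_bytes_per_second(44_100, 16)      # 176,400 bytes/s, ~10.6 MB/min
hires = pcm_bytes_per_second(192_000, 24)  # 1,152,000 bytes/s, ~69 MB/min

print(cd, hires)
```

This is why high sample rates and bit depths eat hard drive space so quickly: 24-bit/192kHz stereo produces roughly six and a half times the data of CD-quality audio.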
How do bit depth and sample rate work together?
Think of bit depth as a vertical line and sample rate as a horizontal line. When you're recording 16-bit audio, there are 65,536 discrete levels your audio samples can be placed at. Let's assume you're recording at 16-bit, 44.1kHz (CD quality). Your audio interface (or computer) takes 44,100 snapshots per second, placing each one at the nearest of those 65,536 levels to represent the amplitude of the incoming audio. Bit depth acts as the Y-axis, while sample rate acts as the X-axis. The more snapshots your computer or audio interface captures, and the finer the levels available, the more accurately your audio is represented.
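That Y-axis placement can be sketched in a few lines of Python. Here `quantize()` is a hypothetical helper of my own that rounds an incoming amplitude (scaled to the range -1.0..1.0) to the nearest representable level:

```python
# Sketch: bit depth sets how many discrete amplitude levels each sample
# can take (2**bits). quantize() is an illustrative helper, not a real
# library function.

def quantize(sample: float, bits: int) -> int:
    levels = 2 ** bits                    # 16-bit -> 65,536 levels
    max_code = levels // 2 - 1            # e.g. 32,767 for 16-bit
    return max(-levels // 2, min(max_code, round(sample * max_code)))

print(2 ** 16)            # 65536 amplitude levels at 16-bit
print(quantize(0.5, 16))  # 16384: halfway up the positive range
```

At 24-bit the same scheme gives 16,777,216 levels, which is why higher bit depths capture quieter detail more faithfully.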
Human hearing range and how it relates to sample rate
The human range of hearing is roughly 20Hz to 20kHz. When you're recording at 44.1kHz, the highest frequency that can be captured is 22.05kHz. The highest capturable frequency is equal to half the recording sample rate; this is called the Nyquist frequency. So, if you're recording at 48kHz, the highest frequency that can be captured is 24kHz, which is 4kHz above the human range of hearing. At 96kHz, the highest frequency that can be captured is 48kHz, more than double the upper limit of human hearing.
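The halving relationship is trivial to express; a quick sketch:

```python
# Sketch of the Nyquist relationship: the highest frequency a given
# sample rate can capture is half that rate.

def nyquist_frequency(sample_rate_hz: float) -> float:
    return sample_rate_hz / 2

for rate in (44_100, 48_000, 96_000):
    print(rate, "->", nyquist_frequency(rate))
# 44100 -> 22050.0  (just above the ~20kHz limit of human hearing)
# 48000 -> 24000.0
# 96000 -> 48000.0
```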
Most commonly, audio is recorded at 44.1kHz or 48kHz, because working with large files can be a headache. Some studios with $50k digital consoles will record at 96kHz or 192kHz, but this can seriously tax a computer, and it's rarely necessary, since the audio will be converted down to 44.1kHz or lower when uploaded online.
Remember, recording at a higher sample rate allows you to take more digital snapshots per second, and that extra resolution is the main reason higher sample rates are considered beneficial.
Streaming platforms and their audio formats
Each streaming platform uses a different file type. That's because high-quality WAV files can take a while to load, take up a lot of space, and may not play properly on every listener's device. For example, SoundCloud converts your uploaded WAV file to a 128kbps MP3. This saves a lot of room on its servers and improves load times for the tracks its users are listening to.
Tidal, the streaming platform relaunched by Jay-Z, prides itself on streaming lossless, CD-quality audio at 1,411kbps. Spotify Premium streams 320kbps Ogg Vorbis files (not MP3); Spotify Free streams at 128kbps; and iTunes uses 256kbps AAC (Advanced Audio Coding), the format Apple adopted as its higher-quality successor to MP3. And, as mentioned above, SoundCloud streams at 128kbps, which can be a sore point: the quality degrades noticeably on the platform, and depending on the track it can be incredibly obvious.
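To see why platforms care, you can convert those quoted bitrates into storage per minute of audio. A small sketch (the helper name is my own):

```python
# Sketch: converting a streaming bitrate (kbps) into megabytes per
# minute of audio. This is why lossy formats save platforms so much
# server space. Bitrates are the figures quoted in the text.

def mb_per_minute(kbps: int) -> float:
    # kbps -> bits/s -> bytes/s -> bytes/min -> MB (1 MB = 1e6 bytes)
    return kbps * 1000 / 8 * 60 / 1_000_000

print(round(mb_per_minute(1411), 2))  # 10.58 MB/min (lossless CD quality)
print(round(mb_per_minute(320), 2))   # 2.4  MB/min (Spotify Premium)
print(round(mb_per_minute(128), 2))   # 0.96 MB/min (SoundCloud)
```

A 128kbps stream moves roughly one-eleventh the data of a lossless one, which adds up fast across millions of plays.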
Chances are, however, you're not going to be able to tell the difference between the streaming platforms' audio quality unless you have a professional studio, high-end headphones, or a well-trained ear.
Volume levels: LUFS (Loudness Units relative to Full Scale) and the loudness war
The Audio Engineering Society (AES) has published recommended loudness targets that engineers around the globe follow to ensure their tracks are "up to par." While many engineers continue to slam their tracks with compression and hard limiting to squeeze out every last ounce of headroom, there is a guideline to follow when preparing a track for distribution to the digital service providers.
We measure loudness in LUFS, and a common distribution target is -16 LUFS with a true peak (TP) no higher than -1dB. Mastering to a more conservative loudness is often the better choice, because many platforms normalize playback to a target level. For example: iTunes, -16 LUFS; YouTube, approximately -13 LUFS; and Spotify, approximately -12 LUFS (these targets change over time, so check each platform's current specs). Each platform varies, but the closer the number is to zero, the louder the track. To put this into perspective: if you master a track to -10 LUFS (extremely loud) and upload it to YouTube, YouTube will automatically turn it down to around -13 LUFS. Is there really a point to mastering loud when the platform you upload to is just going to turn it down? Not really.
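Platform normalization amounts to simple subtraction once loudness has been measured. A sketch using the targets quoted above (the helper name is mine; real platforms measure loudness per the ITU-R BS.1770 standard before applying the gain):

```python
# Sketch of platform loudness normalization: the gain applied at
# playback is the difference between the platform's target and the
# track's measured integrated loudness.

def normalization_gain_db(track_lufs: float, platform_target_lufs: float) -> float:
    """Positive = turned up, negative = turned down."""
    return platform_target_lufs - track_lufs

# A track mastered to -10 LUFS uploaded to YouTube (-13 LUFS target):
print(normalization_gain_db(-10.0, -13.0))  # -3.0, i.e. turned down 3 dB
```

In other words, the 3 dB of headroom you destroyed to hit -10 LUFS is simply thrown away at playback.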
The loudness war is the ongoing competition among engineers to make their tracks sound as loud as possible, stripping away nearly all headroom and dynamic range. The claim is that the louder the track, the better it sounds. As deadmau5 once said, "You've got a volume knob for that." This practice is especially common in electronic dance music. Take a listen to the examples below: set whatever device you're listening on to a comfortable level and play both songs back to back. Do you hear the difference?
Toto – “Africa” (1982)
Selena Gomez – “Come & Get It” (2013)
“Come & Get It” is much more compressed and in your face, while “Africa” is much more dynamic.
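One rough way to put that difference into numbers is the crest factor: the ratio of a signal's peak to its average (RMS) level. Heavy limiting pushes the RMS up toward the peak, shrinking the crest factor. This is a sketch with an illustrative helper of my own (real loudness meters use LUFS, not crest factor); a clean sine stands in for a dynamic mix and a hard-clipped sine for a "slammed" one:

```python
# Sketch: crest factor (peak-to-RMS ratio in dB) as a crude proxy for
# how dynamic a signal is. Hard limiting shrinks it.

import math

def crest_factor_db(samples):
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

sine = [math.sin(2 * math.pi * i / 100) for i in range(1000)]
clipped = [max(-0.25, min(0.25, s)) for s in sine]  # brutal hard limiting

print(round(crest_factor_db(sine), 1))     # ~3.0 dB (a sine's peak/RMS)
print(round(crest_factor_db(clipped), 1))  # much lower: dynamics are gone
```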
Which audio format should I use?
You should use the format that's most convenient and widely supported among consumers and the products they use; it also depends on the situation. If you're rendering a preview of your track to send to a friend, send an MP3. If you're sending your final product off to mastering, send a 24-bit, 48kHz WAV file, an AIFF, or whatever the engineer requests. MP3 is one of the most common and best-supported file types because most phones, tablets, and cars play MP3 files natively. The MP3 format was developed by the Moving Picture Experts Group (MPEG) and achieves a much smaller file size without sacrificing a vast amount of perceived quality, which is why it's so popular among professionals and consumers alike.
WAV and AIFF files are extremely popular among professionals. They're somewhat "behind the scenes" file types, as most consumers don't know what a WAV or AIFF file is. WAV stands for Waveform Audio File Format and was designed by Microsoft and IBM to store audio bitstreams on PCs. It's a lossless audio format, as opposed to MP3, which allows for higher quality. AIFF stands for Audio Interchange File Format and was designed by Apple back in 1988. There are a few variants of AIFF, including a compressed version called AIFF-C (or AIFC) that supports a multitude of compression codecs.
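A WAV file's header carries its sample rate, bit depth, and channel count, which you can inspect with Python's standard-library `wave` module. A small sketch that writes one second of CD-quality silence and reads the header back (the filename is arbitrary):

```python
# Sketch: writing and inspecting a WAV file with Python's built-in
# wave module. The header stores the format parameters discussed above.

import wave

with wave.open("cd_quality.wav", "wb") as w:
    w.setnchannels(2)        # stereo
    w.setsampwidth(2)        # 2 bytes per sample = 16-bit
    w.setframerate(44_100)   # CD sample rate
    w.writeframes(b"\x00" * (44_100 * 2 * 2))  # one second of silence

with wave.open("cd_quality.wav", "rb") as w:
    print(w.getframerate(), w.getsampwidth() * 8, w.getnchannels())
    # 44100 16 2
```

Note that one second of 16-bit/44.1kHz stereo silence already weighs in at about 176 KB, which is exactly why streaming platforms transcode to lossy formats.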
Ultimately, it depends on where your rendered audio is going. Is it going on SoundCloud? Upload a 128kbps MP3 file, as that’s what SoundCloud converts its files to. If you upload a 128kbps MP3 file, then a conversion won’t take place, which will save you some audio quality.
What is the noise floor?
The noise floor is that subtle hiss in an audio file. According to James Wren at the Prosig Noise & Vibration Blog, "the noise floor is the level of background noise in a signal, or the level of noise introduced by the system." By recording at a higher bit depth, say 24-bit, you lower the noise floor relative to your signal and gain more dynamic range for your recorded audio; the trade-off is a larger file size. All audio recordings have a noise floor, and there are ways to reduce it, but normalizing your track will make it worse: if you render out your audio and then normalize it up by, say, 6dB, you raise the noise floor by 6dB as well. This is always unwanted, and it will make your mixing and mastering engineer's head spin.
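The relationship between bit depth and available dynamic range follows a well-known rule of thumb, roughly 6 dB per bit (more precisely, 20·log10(2) ≈ 6.02 dB per bit). A quick sketch:

```python
# Sketch: theoretical dynamic range of PCM audio grows by about 6 dB
# per bit, which is why 24-bit recordings leave the noise floor so
# much lower than 16-bit ones.

import math

def dynamic_range_db(bits: int) -> float:
    return 20 * math.log10(2 ** bits)

print(round(dynamic_range_db(16), 1))  # 96.3 dB
print(round(dynamic_range_db(24), 1))  # 144.5 dB
```

Those extra ~48 dB at 24-bit are headroom you can spend on conservative recording levels without the noise floor creeping into earshot.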
USB microphones vs. analog microphones
High-quality microphones exist in both the USB and analog worlds, and each has a few quirks, both technical and non-technical. USB microphones are generally cheaper and easier to get going; they're often plug and play, which is perfect for those who are just getting started. Plug in the microphone, install the driver(s) if needed, open the audio program you'll record with, set the input, and that's it. Some USB microphones come with software that lets you adjust settings and extend their capabilities. The trade-off is that a USB microphone's analog-to-digital converter and preamp are built in, fixed, and usually built to a budget, so it's typically less accurate at placing those "digital points" on the Y-axis than a dedicated audio interface. An analog microphone leverages an external audio interface, which handles the analog-to-digital conversion with generally higher-quality components.
Unlike USB microphones, analog microphones (those that require an external audio interface) can be more expensive. However, you'll generally get a cleaner recording thanks to the interface's higher-quality preamps and analog-to-digital conversion.
The following companies provide high quality USB microphones and analog microphones: Blue, RODE, Shure, and Sennheiser. The following companies provide high quality audio interfaces: Focusrite, PreSonus, and M-Audio.
Depending on your application and needs, a plethora of factors come into play when choosing the proper sample rate and bit depth. When you're recording a track, 24-bit, 48kHz will generally suffice. If you don't have a lot of hard drive space, 16-bit, 44.1kHz will be just fine. The right choice will always differ by application, but we hope this information gives you good insight into sample rates, bit depths, file types, and analog versus USB microphones.
As always, if you have any questions, please don’t hesitate to reach out to us. We’d be more than happy to help you out with any of the information discussed within this post.