This is a topic I covered more than once on Real Home Recording. It’s an important one, and one that even experienced engineers may not fully understand. First, bit depth:
Bit depth is resolution…but that does not mean you can always hear it. The difference between 8-bit and 16-bit audio is very audible. That’s because the digital noise floor of 8-bit audio (which consists of digital quantization noise) sits at around -48 dBFS (decibels full scale). That is worse than the noise floor of a vinyl record!
Every bit of data is worth around 6 decibels (abbreviated dB) of resolution. So, a 16-bit audio file has a digital quantization noise floor at approximately -96 dBFS. This is much better than the best vinyl records and more than adequate for dynamic orchestral recordings. It is also below the noise floor of tape.
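If you want to see where those numbers come from, here is a quick sketch of the roughly-6-dB-per-bit rule (the exact figure is 20·log10(2) ≈ 6.02 dB per bit):

```python
import math

# Each bit is worth 20 * log10(2) ≈ 6.02 dB of dynamic range, so the
# quantization noise floor of an N-bit recording sits near -6.02 * N dBFS.
def noise_floor_dbfs(bits: int) -> float:
    return -20 * math.log10(2) * bits

for bits in (8, 16, 20, 24):
    print(f"{bits}-bit: {noise_floor_dbfs(bits):.1f} dBFS")
# 8-bit: -48.2, 16-bit: -96.3, 20-bit: -120.4, 24-bit: -144.5
```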
So, why did 24-bit audio (and 20-bit audio before it) enter the picture? Production. When you combine a bunch of 16-bit tracks, their quantization noise floors add up and can become audible (see the sketch below). That forced engineers to “record to the top” of 16-bit recording devices to keep signals as far above the noise as possible, which introduced further problems: nasty digital distortion (AKA clipping) and transient-killing analog saturation.
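To put a rough number on that cumulative effect (assuming each track’s quantization noise is uncorrelated with the others, which is the usual approximation), uncorrelated noise sources add in power, lifting the combined floor by about 10·log10(N) dB:

```python
import math

# Uncorrelated noise sources add in power, so N tracks lift the
# combined noise floor by roughly 10 * log10(N) dB.
def summed_floor_dbfs(per_track_floor: float, tracks: int) -> float:
    return per_track_floor + 10 * math.log10(tracks)

for tracks in (1, 8, 24, 48):
    print(f"{tracks:2d} tracks of 16-bit audio: "
          f"{summed_floor_dbfs(-96.0, tracks):.1f} dBFS")
# 48 tracks lift the floor from -96 dBFS to about -79 dBFS.
```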
When 20-bit recorders were released, the noise floor sat at -120 dBFS. This allowed recording engineers to set input signals at around -20 dBFS, which gave them far more headroom: peaks at -20 dBFS over a -120 dBFS floor still leave roughly 100 dB of usable range, more than 16-bit offers even when recorded to the top. Transients were preserved without clipping and with a lower digital noise floor.
Truth be told, 20-bit audio would be the ideal recording format (since it uses less storage space) if not for computers handling 24-bit audio more efficiently; 24 bits is exactly three bytes, which aligns neatly with how computers move data around. Or at least that’s what I’ve been told.
Some DAWs may give you a 32-bit option for recording. Don’t bother: converters only capture 24 bits, so it’s a waste of space with zero benefit at the tracking stage. Only use the 32-bit floating point (FP) format as an intermediate format. Uses include track/buss bounces, mixdowns and master files.
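Here is a minimal sketch of why float makes sense as an intermediate format (numpy used purely for illustration): integer samples hard-clip at full scale, while 32-bit float keeps overshoots intact so a later gain drop recovers them.

```python
import numpy as np

# A buss that peaks 6 dB over full scale (linear value 2.0).
hot_mix = np.array([0.5, 1.0, 2.0], dtype=np.float32)

# Stored as 16-bit integer: anything past full scale hard-clips.
clipped = np.clip(hot_mix, -1.0, 1.0 - 2**-15)
print((clipped * 32768).astype(np.int16))  # the 2.0 sample is gone for good

# Stored as 32-bit float: the overshoot survives, so pulling the
# level down 6 dB later restores the waveform undamaged.
print(hot_mix * 0.5)  # [0.25 0.5  1.  ]
```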
When it comes to delivery formats, 24-bit audio is a waste of space. 16-bit audio more than does the job, but some audiophiles insist on 24-bit anyway. So if they are willing to spend more money on 24-bit files, I say go for it. I went into depth on these topics in the following videos, in case you want to learn more:
Now as for sample rate, this one is a bit more complicated and controversial.
The too-long-didn’t-read version: capture audio at 96 kHz because, of the high sample rates, it has the widest plugin and interface support. Its benefits lie in digital signal processing, not in audible frequencies. Latency is also lower at 96 kHz than at 44.1 kHz. Above 96 kHz is a waste of space.
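The latency point is simple arithmetic: a buffer holds a fixed number of samples, and those samples pass in less time at a higher rate (256 samples here is just a typical interface setting):

```python
# Latency per buffer = buffer size in samples / sample rate.
buffer_samples = 256  # a typical interface setting
for rate in (44_100, 96_000):
    print(f"{rate} Hz: {1000 * buffer_samples / rate:.1f} ms per buffer")
# 44100 Hz: 5.8 ms per buffer
# 96000 Hz: 2.7 ms per buffer
```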
The more in-depth explanation is that recording at 44.1 kHz usually isn’t good enough, especially on cheaper audio interfaces. The transition band (the region where the anti-aliasing filter rolls off the top frequencies) sits too close to Nyquist, and many plugins require oversampling to function properly. It is best to use 88.2 kHz or 96 kHz for more overhead. There won’t be a direct audible difference upon capture, but once plugins begin crunching the ones and zeros they will have more data to work with.
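To put numbers on that, taking roughly 20 kHz as the top of the audible band:

```python
# Nyquist is half the sample rate; the anti-aliasing filter has to fit
# between the top of the audible band (~20 kHz) and Nyquist.
for rate in (44_100, 88_200, 96_000):
    nyquist = rate / 2
    room = (nyquist - 20_000) / 1000
    print(f"{rate} Hz: Nyquist {nyquist / 1000:.2f} kHz, {room:.2f} kHz of room")
# 44100 Hz: Nyquist 22.05 kHz, 2.05 kHz of room
# 88200 Hz: Nyquist 44.10 kHz, 24.10 kHz of room
# 96000 Hz: Nyquist 48.00 kHz, 28.00 kHz of room
```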
To me, mixing at 96 kHz simply sounds “sweeter” or more “analog”. Plugins that do pitch/time manipulation (Auto-Tune/Melodyne), time-domain processors such as compressors (including de-essers and dynamic equalizers), guitar amp simulators and certain equalizers (particularly in the upper frequency ranges) benefit the most. If you use Acustica Audio plugins, they are natively sampled at 96 kHz.
It’s the year 2017. Processors are very fast (Intel’s i7-7700K is a beast at just $325) and hard drive space is cheap. There is little reason NOT to use 96 kHz while recording. It’s a shame that more virtual instrument libraries aren’t sampled at 96 kHz.
The old “avoid sample rate conversion at all costs” argument is moot these days because modern converters are very good. I personally recommend Voxengo r8brain (free) or r8brain pro in minimum phase mode. The SoX sample rate converter, found in Audacity or as a command line program (if you are good with old school software), is also very nice.
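As a hedged example, here is what a SoX conversion might look like, wrapped in Python for consistency with the other sketches. The flags follow the SoX manual’s rate effect (-v for very high quality, -M for minimum phase, matching the r8brain recommendation above); the file names are placeholders, so check against your SoX version:

```python
import subprocess

# Resample a 96 kHz master to 48 kHz with SoX's rate effect:
#   -v  very high quality, -M  minimum phase response.
# The global -D flag disables SoX's automatic dither, since
# dithering is done as a separate step afterward.
subprocess.run(
    ["sox", "-D", "master_96k.wav", "master_48k.wav",
     "rate", "-v", "-M", "48000"],
    check=True,
)
```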
As for delivery sample rates? My suggestion is to go with 48 kHz. I explain how I reached this conclusion in this video:
I go into more detail about production formats vs. delivery formats in this video:
So, the bottom line:
• Record at 24-bit integer / 96 kHz, as Broadcast Wave or AIFF.
• Mix at 24-bit/96 kHz.
• Mix down, bounce and master to 32-bit floating point 96 kHz.
• Deliver at 16-bit integer / 48 kHz FLAC*. Sample rate convert with Voxengo r8brain or SoX. Then, AFTER sample rate conversion, dither with either iZotope MBit+ (built into their Ozone plugin) or Airwindows Not Just Another Dither/CD (a generic sketch of the dither step follows below).
*The only exception is when creating MP3/AAC lossy codec files. In this case, changing bit depth isn’t necessary; lossy encoders do their math internally at high resolution, so there is nothing to be gained by dithering first. For sample rate, use 44.1 kHz. For the Opus codec (which hopefully becomes the standard one day), use 48 kHz because that is the only sample rate it currently supports.
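For the curious, here is a minimal sketch of what that dither step actually does, with plain TPDF dither standing in as a generic substitute for the fancier noise-shaped algorithms in MBit+ and Not Just Another Dither:

```python
import numpy as np

# Plain TPDF dither + 16-bit quantization, a generic stand-in for
# dedicated tools like MBit+ or Not Just Another Dither.
def to_16bit_tpdf(samples: np.ndarray) -> np.ndarray:
    lsb = 1.0 / 32768
    # Two uniform sources sum to triangular noise of +/- 1 LSB, which
    # decorrelates the quantization error from the signal.
    tpdf = (np.random.uniform(-0.5, 0.5, samples.shape)
            + np.random.uniform(-0.5, 0.5, samples.shape)) * lsb
    dithered = np.clip(samples + tpdf, -1.0, 1.0 - lsb)
    return np.round(dithered * 32768).astype(np.int16)

mix = np.sin(np.linspace(0, 2 * np.pi, 48)).astype(np.float32)  # dummy signal
print(to_16bit_tpdf(mix)[:6])
```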