Digital audio

By Nick Kovarik 

An analog recording resembles the waveform of the original sound; the two are analogous, in other words. Audiotape, vinyl, and wax cylinders each hold a continuous representation of the recorded waveform in the medium itself.

A digital recording, however, is captured as a numerical representation of the original signal. To do this, the signal's amplitude is measured at regular intervals by an analog-to-digital converter (ADC), such as the one built into a modern computer or its audio interface. To reproduce that signal, a digital-to-analog converter (DAC) is employed. A modern computer or phone has both, but in the high-end audio world, external devices are still sometimes used.

In digital audio, sampling is the process of capturing the waveform at regular points in time, which determines the frequencies that can be represented, and quantization is the process of capturing the amplitude at each of those snapshots.

  • Sampling
    • The process of taking periodic samples (voltages) of the original signal at fixed intervals, the rate of which is known as the sampling rate
    • Think of sampling as a video camera's frame rate. The higher the sample rate, the more frames of definition we have in the product.
    • The most common sampling rates are 44.1 kHz (CD quality, and typical for .mp3 files) and 48 kHz (DVD quality). However, rates that double those (such as 88.2 kHz and 96 kHz) are also quite common. There are also ultra-high definition sample rates, such as 192 kHz, which is about as close to an analog signal as is necessary in everyday use.
      • Sample rates other than these and their simple fractions (such as 22.05 kHz) are very rare, as they are not easily converted to other sample rates.
    • The Nyquist Theorem states that to be successfully encoded, a signal has to be sampled at a rate more than twice its highest frequency. Consider capturing a 10 Hz frequency. If you were to sample that wave at 10 Hz, every sample would land on the same point of the cycle, so there would be no variation (compression / rarefaction) in the capture and it would produce no sound. A sample rate of 20 Hz can capture the peak and trough of your waveform, but only if the samples happen to line up with them, which is why the rate needs to be higher than twice the frequency in practice.
  • Quantization
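    • The process of rounding the amplitude of each sample to the nearest available level. The number of bits stored per sample, which sets how many levels are available, is more commonly referred to as Bit Depth.
    • The Bit Depth affects the dynamic range of a recording, the underlying noise, and the distortion present in the recording.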
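
The 10 Hz example above can be tried out in a few lines of Python. This is only a minimal sketch: the function name `sample_sine` and the starting phases are assumptions chosen for illustration, and the wave is an ideal sine rather than a real recording.

```python
import math

def sample_sine(signal_hz, sample_rate_hz, num_samples, phase=0.0):
    """Take num_samples readings of a sine wave at fixed intervals."""
    return [
        math.sin(2 * math.pi * signal_hz * n / sample_rate_hz + phase)
        for n in range(num_samples)
    ]

# 10 Hz tone sampled at 10 Hz: every sample lands on the same point of the cycle.
same_rate = sample_sine(10, 10, 8, phase=math.pi / 4)

# Sampled at exactly 20 Hz, starting on a peak: samples alternate peak and trough.
double_rate = sample_sine(10, 20, 8, phase=math.pi / 2)

# Sampled at exactly 20 Hz, starting on a zero crossing: every sample is (almost) zero.
unlucky = sample_sine(10, 20, 8, phase=0.0)

# Sampled at 48 Hz, comfortably more than twice the frequency: the shape is visible.
well_above = sample_sine(10, 48, 8)

for name, samples in [
    ("10 Hz rate", same_rate),
    ("20 Hz rate, on peaks", double_rate),
    ("20 Hz rate, on zero crossings", unlucky),
    ("48 Hz rate", well_above),
]:
    print(name, [round(s, 3) for s in samples])
```

The two 20 Hz cases show why sampling at exactly twice the frequency is a borderline case: whether anything useful is captured depends on where the samples happen to fall.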

Note: a 16-bit system has 65,536 quantizing levels, and a 24-bit system has about 16.8 million (16,777,216) quantizing levels.
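
Those level counts come from raising 2 to the power of the bit depth. The sketch below shows that arithmetic, along with one simple way of rounding a sample to the nearest level. The function names are assumptions made for this example, samples are assumed to be normalized between -1.0 and 1.0, and the decibel figure is the commonly quoted theoretical estimate (roughly 6 dB per bit), not a measurement of any real converter.

```python
import math

def quantizing_levels(bit_depth):
    """Number of distinct amplitude values a given bit depth can store."""
    return 2 ** bit_depth

def quantize(sample, bit_depth):
    """Round a sample in [-1.0, 1.0] to the nearest level (signed integer codes)."""
    max_code = 2 ** (bit_depth - 1) - 1
    code = round(sample * max_code)
    return code / max_code

def theoretical_dynamic_range_db(bit_depth):
    """Ratio of the largest to the smallest step, expressed in decibels."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (16, 24):
    print(bits, "bit:", quantizing_levels(bits), "levels,",
          round(theoretical_dynamic_range_db(bits), 1), "dB")

# The rounding error shrinks as the bit depth grows.
for bits in (8, 16, 24):
    print(bits, "bit error:", abs(quantize(0.3, bits) - 0.3))
```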

  • Bit Depth affects the Dynamic Range of a recording.
    • The dynamic range is the relationship between the quietest and loudest sounds inside a recording.
    • The Signal to Noise Ratio (S/N) is the relationship between the level of the recorded signal and the underlying noise present. The level of that underlying noise is called the Noise Floor.
    • Even in a recording space with no noise whatsoever (not usually possible on a budget), the system adds noise at every stage, from mics to cables to audio interface, usually at very low, manageable levels.
    • Note: in audio, noise is a broad term that refers to any unwanted sounds, such as background sounds and some types of distortion.
    • The Headroom is the relationship between the loudest sounds in a recording and the maximum amplitude your system can handle.
  • Clipping occurs when the amplitude of your signal exceeds the maximum the recording system can handle, typically indicated by a red light on the audio channel. In digital recordings, this produces digital distortion and should be avoided wherever possible.
    • Some types of distortion are wanted in recordings. For example, the sound we associate with electric guitars, sometimes called overdrive, is created by increasing the amplitude of a guitar's signal to a level that early tube amplifiers could not handle. This is a form of harmonic distortion, and it sounds much better than digital distortion.
    • In brief, the Bit Depth of a recording is the resolution with which the amplitude of a signal is captured. While it does not guarantee an optimal noise floor or useful headroom, a higher Bit Depth leaves more resolution to work with when correcting minimal headroom or a poor S/N.
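
Headroom, signal to noise ratio, and clipping can all be expressed with a little arithmetic. The following is a rough sketch rather than how any particular audio interface works: it assumes samples are normalized so that the system maximum is 1.0, and the function names and example levels are made up for illustration.

```python
import math

FULL_SCALE = 1.0  # assumption: the loudest value the system can store

def to_db(ratio):
    """Express an amplitude ratio in decibels."""
    return 20 * math.log10(ratio)

def headroom_db(samples):
    """Distance between the loudest sample and the system maximum (non-silent input)."""
    peak = max(abs(s) for s in samples)
    return to_db(FULL_SCALE / peak)

def signal_to_noise_db(signal_level, noise_level):
    """Distance between the signal level and the noise floor."""
    return to_db(signal_level / noise_level)

def hard_clip(samples):
    """Flatten anything beyond the system maximum, as digital clipping does."""
    return [max(-FULL_SCALE, min(FULL_SCALE, s)) for s in samples]

recording = [0.1, 0.5, -0.4, 0.25]
print("headroom:", round(headroom_db(recording), 2), "dB")
print("S/N:", round(signal_to_noise_db(0.5, 0.0005), 1), "dB")

# Turning the gain up by 3x pushes the peaks past the maximum, so they are cut off.
too_loud = [s * 3 for s in recording]
print("clipped:", hard_clip(too_loud))
```

Once a peak has been flattened this way, the original shape of the waveform above the maximum is gone, which is why clipping is best avoided at the recording stage.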

It is important to remember that for the end product of any media, the quality of your product can only ever be as good as the initial recording (and the sources recorded). While aspects of poor quality recordings can be fixed or masked in post production, there is no way to increase resolution after a recording has finished. It is always possible in digital to reduce the output resolution when your destination requires smaller file sizes, and a high quality recording that has been reduced will usually sound better than a low quality recording that has remained the same.

Additionally, do not forget that your source is the most important stage of any production. There are songs that were recorded in garages that we still listen to today because they were amazing songs.

Typical sample rates and bit depths

To summarize, common combinations of sample rate and bit depth range from very high to very low quality.

  • Very high quality: 96,000 Hz sampling rate, 32-bit depth, usually used in studio recording. There are tradeoffs at this rate: for example, faster computer processors and more storage space are needed (see the articles below).
  • Default recording quality: 44,100 Hz sampling rate, 32-bit depth, stereo. An audio file takes up about 20 MB of space per minute of audio.
  • CD quality: 44,100 Hz sampling rate, 16-bit depth, stereo. Takes up about 10 MB per minute.
  • Voice recording, low quality: 22,050 Hz sampling rate, 8-bit depth, mono. Takes up about 1.25 MB per minute. This is generally acceptable for speech recordings from lower quality sources.
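
The storage figures above can be estimated by multiplying sample rate, bytes per sample, channels, and seconds. The sketch below assumes uncompressed audio, bit depths that are whole bytes, and megabytes counted as 1,048,576 bytes (which appears to match the figures in the list); the function name is an assumption for this example, and real files also carry a small amount of header data.

```python
BYTES_PER_MEGABYTE = 1024 * 1024  # assumption: binary megabytes

def megabytes_per_minute(sample_rate_hz, bit_depth, channels):
    """Approximate size of one minute of uncompressed audio."""
    bytes_per_sample = bit_depth // 8
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    return bytes_per_second * 60 / BYTES_PER_MEGABYTE

settings = [
    ("Very high quality", 96_000, 32, 2),
    ("Default recording quality", 44_100, 32, 2),
    ("CD quality", 44_100, 16, 2),
    ("Voice recording, low quality", 22_050, 8, 1),
]

for name, rate, bits, channels in settings:
    size = megabytes_per_minute(rate, bits, channels)
    print(f"{name}: about {size:.2f} MB per minute")
```

Under these assumptions, the very high quality setting comes out at roughly 44 MB per minute in stereo, a little over four times the CD quality figure.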

More

Digital Audio Basics. iZotope

Should I record at high sample rates? Sweetwater, 2022