Digital Audio

 

LDS Music World



The invention of the phonograph changed everything! Suddenly, musicians could reach an audience beyond their local region, and fans could listen to music in their own homes.

The CD brought recorded music into the digital realm, with clear benefits to sound quality. Signal-to-noise ratio increased dramatically with the introduction of the CD. The ability to separate the archive and display functions also provided a clear benefit - while the CD contained the information itself, the conversion of bits into audio occurred within the electronics of the player. Digital media did not degrade with time or number of uses. With the exception of physical damage, such as scratches, a CD can be played a thousand times and still provide the same quality as it did the first time - a vast departure from LPs and cassettes, which are analog formats that wear with each play.

Downloadable audio was born from the combination of two technologies - the internet and audio compression. With internet access reaching more and more homes over the past decade, only one element was missing to satisfy the obvious desire to get music from websites - audio files which could be downloaded in a reasonable amount of time. CD-quality audio (44.1 kHz, 16 bit, stereo) occupies about 10 MB per minute in storage. A typical song of 30-40 MB would take more than 2 hours to download using conventional modems, making internet distribution impractical. Enter MP3.

While many audio compression methods exist, MPEG-1 Audio Layer 3 (or mp3) was the breakthrough everyone was waiting for. Utilizing the concept of masking, this compression technology provided CD-like quality at a fraction of the file size - that 30 MB song, once compressed, took up only 3 MB. Suddenly, music was downloadable at 15-20 minutes per song.

MP3 took the music industry by storm, and whatever your views of Napster or its clones, digital downloadable music is here to stay. Music distribution is in a state of transformation the likes of which we have not seen in generations. As broadband connections to the internet become commonplace, and as the internet reaches more homes, people will choose to get their music over the internet. Musicians should embrace this technology and this method of delivery.


Digital Audio

Background:
Digital audio is often described as 16-bit, 44.1 kHz (commonly rounded to 44 kHz), otherwise known as CD-quality audio. Digital audio differs from analog audio in its waveform structure. While analog audio is represented by a continuous waveform, digital audio is a series of discrete values, which plots as a step function. We hear music as an analog waveform. Our ears perceive sound as small changes in air pressure, convert those changes to mechanical motion, and then to electrical stimuli which proceed to the brain. All digital audio must be converted to analog audio through a D/A converter before we hear it. This is done within the CD player or computer sound card before the signal is sent to the speakers.

The purpose of digital audio is to store sound in a format which is non-destructive, which means it doesn’t get degraded when copied. Analog sound gets worse with every copy made, as with a cassette for instance. Just as digital is converted to analog through a D/A converter, analog is converted to digital through an A/D converter. The fundamental element in this process is called sampling.

Sample Rate:
Analog-to-digital conversion takes place when the audio waveform is sampled at a fixed interval of time and represented as a series of data values. The sample rate is the number of times per second the waveform is sampled, and is expressed in units of kHz (kilohertz, or one thousand hertz) – one Hz equals one cycle per second. Both sample rate and audio frequency are expressed in units of kHz. Simply stated, frequency corresponds with pitch. For example, the A below middle C has a frequency of 220 Hz, and the A one octave higher has a frequency of 440 Hz.
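As a rough illustration of sampling, the Python sketch below evaluates a 440 Hz sine wave at a fixed interval of 1/44,100 second. The function name sample_sine and the use of a pure sine tone are assumptions made for this example; a real A/D converter measures an incoming voltage rather than computing a formula.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, num_samples):
    """Sample a unit-amplitude sine wave at a fixed time interval."""
    interval = 1.0 / sample_rate_hz  # seconds between samples
    return [math.sin(2 * math.pi * freq_hz * n * interval)
            for n in range(num_samples)]

# One second of the A at 440 Hz, sampled at the CD rate.
samples = sample_sine(440.0, 44100, 44100)

# Number of samples that describe each cycle of the tone (roughly 100).
samples_per_cycle = 44100 / 440.0
```

The lower the pitch, the more samples describe each cycle. A tone near 20 kHz gets only a little more than two samples per cycle at this rate.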

To produce CD-quality digital audio, the analog audio must be sampled at 44 kHz or higher. The human ear can detect frequencies up to about 20 kHz. According to the Nyquist principle, the sampling frequency must be at least two times the highest frequency that you want represented. Otherwise, aliasing will occur, which means that frequencies above 1/2 the sample rate will be misrepresented as lower frequencies, because they were undersampled.
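To make aliasing concrete, here is a small sketch that predicts where a pure tone lands after sampling. It assumes an ideal sampler with no anti-aliasing filter in front of it, and the function name alias_frequency is invented for this example.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Return the frequency at which a sampled pure tone will appear."""
    # Sampling cannot tell a tone apart from one shifted by a whole
    # multiple of the sample rate, so fold it back toward zero.
    nearest_multiple = round(freq_hz / sample_rate_hz) * sample_rate_hz
    return abs(freq_hz - nearest_multiple)

# Below half the sample rate the tone is represented correctly.
in_band = alias_frequency(10_000, 44_100)

# Above half the sample rate the tone folds down to a lower frequency.
folded = alias_frequency(30_000, 44_100)
```

With a 44.1 kHz sample rate, the 10 kHz tone stays at 10 kHz, while the 30 kHz tone shows up at 14.1 kHz, well inside the audible range.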

This is the reason for the 44 kHz – it is roughly twice the 20 kHz that we can hear. Some commercial audio recording software boasts sampling rates as high as 96 kHz, but unless you want your dog to enjoy those higher frequencies which you can’t hear, there’s no physical reason to go above 44 kHz.

Good temporal resolution can be achieved with commercial analog-to-digital conversion technology. A useful analogy is to compare temporal resolution (sound) with spatial resolution (imaging) – audio frequency is expressed in cycles per second, while spatial frequency is expressed in line pairs per millimeter; the greater the frequency, the better the detail. When converting an analog image into a digital image (as with a digital camera), the spatial resolution or detail of the resulting image depends on how many samples you take of the image over its area, or how many pixels you have. Commercial digital cameras have improved in spatial resolution in recent years, but they remain a far cry from the resolution of conventional film, which is on the order of a few microns.

The goal is to sample the image, or the audio waveform, at a sample rate which exceeds two times the highest frequency that can be resolved by the viewer, or listener. In audio, that frequency is 20 kHz, and the sample rate must therefore be above 40 kHz (44 kHz is the current standard for CD-quality audio).

Bit Depth:
The bit depth for CD-quality audio is 16 bits. Think of bit depth as step size. The greater bit depth you use, the finer your steps will be, or the greater number of steps you will have throughout your range. Continuing the analogy with imaging, bit depth describes the number of gray levels in a black and white image. For example, if you have 8 bits, then you have 256 different shades of gray to represent changes in the darkness/lightness of your image (2^8 = 256). If you have 10 bits, then you have 1024 shades of gray. In 16-bit audio, you have 65,536 numbers (2^16) available to represent the incoming voltage level of the signal.
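The number of available levels is simply two raised to the bit depth. A short check in Python (the function name quantization_levels is an illustrative choice):

```python
def quantization_levels(bits):
    """Number of distinct values that can be stored in the given bit depth."""
    return 2 ** bits

for bits in (8, 10, 16):
    print(bits, "bits ->", quantization_levels(bits), "levels")
```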

In practice, using a greater bit depth reduces noise in your audio, as well as quantization distortion. Quantization refers to the practice of assigning whole numbers to an input voltage level. Since rounding always occurs, the smaller the step size the better – a greater number of bits achieves this. CD-quality audio uses 16 bits. Sometimes bit depth is referred to as resolution, not to be confused with temporal resolution, which refers to the sample rate.
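The effect of step size on rounding error can be sketched as follows. This is a simplified model that assumes the signal is scaled to the range -1.0 to 1.0 and ignores clipping at the extremes; the function names are invented for the example.

```python
def quantize(value, bits):
    """Round a value in the range -1.0 to 1.0 to the nearest quantization step."""
    step = 2.0 / (2 ** bits)  # full range of 2.0 divided into 2**bits steps
    return round(value / step) * step

def worst_case_error(bits):
    """Rounding to the nearest step is never off by more than half a step."""
    return (2.0 / (2 ** bits)) / 2

original = 0.3
error_8 = abs(quantize(original, 8) - original)    # coarse steps
error_16 = abs(quantize(original, 16) - original)  # fine steps
```

In this model the 16-bit error is a small fraction of the 8-bit error, which is why a greater bit depth means less quantization noise.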

Dynamic Range and Signal-to-Noise Ratio:
Greater bit depth increases the dynamic range. The dynamic range is the range between the lowest and highest level that can be reproduced by a system. A system with 16-bit resolution has a dynamic range of 96 dB, where dB refers to the decibel. The decibel is one tenth of a Bel (named after Alexander Graham Bell). This is a logarithmic scale and is relative. A 3 dB change is often cited as the minimum readily perceptible change, and a 10 dB change represents a sound that is roughly twice as loud. Log scales are used frequently in both sound and imaging (as with optical density) to condense the dynamic range. Log scales also more closely approximate how our ears and eyes perceive sound and images, which justifies their use.
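The 96 dB figure follows from the number of levels. A minimal sketch, assuming dynamic range is taken as the ratio of the full range to a single step, expressed as 20 times the base-10 logarithm:

```python
import math

def dynamic_range_db(bits):
    """Ratio of the full range to one quantization step, in decibels."""
    return 20 * math.log10(2 ** bits)

range_16 = dynamic_range_db(16)  # about 96.3 dB
range_8 = dynamic_range_db(8)    # about 48.2 dB
```

Each additional bit doubles the number of levels and so adds about 6 dB of dynamic range.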

Signal-to-noise ratio is simply the level of the music divided by the level of the noise. All sound has a noise component. Noise refers to random fluctuations in sound which are not associated with the music. The goal is to reduce this noise to acceptable levels, or to separate the signal as far from the noise as possible. A large signal-to-noise ratio represents better reduction of noise. CDs typically achieve a signal-to-noise ratio of about 90 dB, which is enviable when compared with the SNR in medical imaging technologies, for instance.
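For a sense of scale, the sketch below converts between a level ratio and decibels. It assumes the levels are amplitudes (such as voltages), for which the factor is 20; for power ratios the factor would be 10. The function names are illustrative.

```python
import math

def snr_db(signal_level, noise_level):
    """Signal-to-noise ratio in decibels, from two amplitude levels."""
    return 20 * math.log10(signal_level / noise_level)

def amplitude_ratio(db):
    """How many times larger the signal is than the noise for a given dB value."""
    return 10 ** (db / 20)

ratio_at_90_db = amplitude_ratio(90)  # roughly 31,600 to 1
```

Under these assumptions, a 90 dB signal-to-noise ratio means the signal amplitude is more than thirty thousand times that of the noise.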

File Size Reduction:
The size of a digital audio file can be computed as the product of sample rate, bit-depth, number of channels, and seconds, divided by the number of bits per byte, or 8. As an example, consider a 60 second stereo sound clip. Because it’s stereo, it has two channels. If this is recorded with a 44.1 kHz sample rate at 16 bit resolution, then the file size is 44,100*16*2*60/8 = 10,584,000 bytes or 10.6 MB.
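The same calculation written as a small Python function (the name audio_file_bytes is an illustrative choice, and the result ignores the few dozen bytes of header that real file formats add):

```python
def audio_file_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed audio data in bytes."""
    total_bits = sample_rate_hz * bit_depth * channels * seconds
    return total_bits // 8  # 8 bits per byte

# 60 seconds of stereo audio at 44.1 kHz, 16 bits.
one_minute = audio_file_bytes(44_100, 16, 2, 60)
```

This reproduces the 10,584,000 bytes worked out above.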

The size of an uncompressed audio file is about 10 MB per minute of audio. This is no longer a concern from a storage standpoint, with multi-GB drives, but for web-based audio this size is excessive. If you were to download a four minute song using a 28.8 kbps modem, it would take more than 3 hours. Without compression/decompression algorithms (codecs), the only way to reduce the size of an audio file is to manually degrade its sound quality in one of three ways.
1- Convert a stereo file to a mono file:
Stereo has two tracks, while mono has only one. You can cut the file size in half by making this conversion. If you do not expect users to listen to your audio files on good sound systems, this is the method to use.
This should be your first choice in reducing file size, since the dual-channel nature of the music is lost but no actual degradation of audio quality takes place.
2- Reduce the sample rate:
When you reduce the sample rate from 44 kHz to 22 kHz, you will cut your file size in half as well, and you will lose frequencies above 11 kHz. This will be noticeable, but the extent of the degradation will depend on the nature of the music. If the music has a great deal of high-frequency content, the loss will be more noticeable. For speech, of course, you should consider going as low as 8 kHz, since high frequencies can be more easily ignored, as they are across our telephone lines.
This should be your second choice in reducing file size. Higher frequencies are sacrificed for smaller file size.
3- Reduce the bit depth:
If you reduce your bit depth from 16 bits to 8 bits, this again will cut the file size in half. The practical result of reducing bit depth is to introduce noise into the recording. In most cases, the result will be unacceptable.
This method should be your last choice in reducing file size.

If all three of these methods are used, you can reduce your file size by a factor of eight. A 31 MB song now takes up only 3.9 MB, which is an acceptable size for downloading over the internet. Unfortunately, the audio quality has been degraded beyond what is acceptable to listeners. This is where compression technology becomes important.
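The combined effect of the three reductions, and what it means for download time, can be checked with a short sketch. It assumes a four minute song and a modem that sustains its full 28.8 kbps rating, which real connections rarely do; the function names are invented for the example.

```python
def audio_file_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed audio data in bytes."""
    return sample_rate_hz * bit_depth * channels * seconds // 8

def download_seconds(num_bytes, bits_per_second):
    """Transfer time at a steady connection speed, ignoring overhead."""
    return num_bytes * 8 / bits_per_second

SONG_SECONDS = 4 * 60
MODEM_BPS = 28_800

full = audio_file_bytes(44_100, 16, 2, SONG_SECONDS)    # stereo, 16 bit
reduced = audio_file_bytes(22_050, 8, 1, SONG_SECONDS)  # mono, 8 bit, half rate

reduction_factor = full / reduced
full_hours = download_seconds(full, MODEM_BPS) / 3600
reduced_minutes = download_seconds(reduced, MODEM_BPS) / 60
```

Under these assumptions the file shrinks by a factor of eight, and the download drops from a little over three hours to under 25 minutes, at the cost of the quality losses described above.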


Compression Technology and MP3

Introduction:
If we could all receive data at a few megabits per second, perhaps we wouldn’t concern ourselves so much with compression technology. But normally, when we want to send data to somebody else, we zip it or stuff it or do whatever we can to make it smaller than it really is. For pictures, we compress them to jpeg or gif format. For audio, we use mpeg, or more specifically, mpeg layer 3 – otherwise known as MP3.

In reality, there are many audio compression formats, but MP3 is the format that has taken the music industry by storm in recent years, and is rapidly changing the landscape in the distribution of music over the internet. If you haven’t heard of MP3, then you probably just returned from your mission – or you’d better use that as your excuse, anyway.

Background:
In 1992 the Moving Picture Experts Group (MPEG) approved a compression/decompression algorithm (or codec) which was called MPEG-1 Audio Layer 3, or MP3. But it wasn’t until 56K modems became commonplace that MP3 reached a wide audience and began to transform the way music is heard and distributed. With this combination of compressed music and faster connection speeds, internet users could download an entire song in 15-20 minutes – certainly not instant gratification, but reasonable enough for people to get the music they wanted.

The proliferation of digital music on the internet has since been very rapid. The most commonly downloaded material from the internet is now music. The most commonly requested word in search engines is "mp3". Digital music and the internet are a perfect fit. Why would anyone go to a store, purchase a CD, bring it home and put it in their player, when they can point and click instead? Why would you walk up and down the aisles in a music store and sift through the thousands of titles to find the CD you want (and maybe never find it), when you can type in a few keywords and find the title in a matter of seconds?

The future of MP3 and music distribution holds many unanswered questions regarding formats, copyright infringement, and distribution channels. But one thing is clear – digital music distribution over the internet is here to stay and will continue to expand rapidly as broadband access becomes commonplace. When users can download entire CDs in a matter of seconds, there will be little reason to acquire music in any other way.

Audio Compression:
MP3 achieves roughly a 10:1 compression ratio with only minimal loss of audio quality. Unlike the file size reduction methods mentioned, which degrade the audio quality by reducing sample rates or bit depths, the codec used for mp3 utilizes a masking method. The algorithm removes sounds in the audio which are masked by other sounds. The idea behind this is that because that sound is being masked, you aren’t going to hear it anyway. So if it’s removed, you won’t miss it. The audio quality is in reality degraded, because the codec uses lossy compression, but most users find little noticeable degradation.
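The 10:1 figure can be related to bit rates with a little arithmetic. The sketch below does not implement masking or any part of the MP3 codec; it only compares data rates. The 128 kbps encoding rate is an assumption chosen for illustration and is not stated above.

```python
CD_BITS_PER_SECOND = 44_100 * 16 * 2  # stereo CD audio: 1,411,200 bits per second

def compression_ratio(encoded_bits_per_second):
    """How many times smaller the encoded stream is than CD audio."""
    return CD_BITS_PER_SECOND / encoded_bits_per_second

def encoded_bytes(seconds, encoded_bits_per_second):
    """File size for audio encoded at a constant bit rate."""
    return seconds * encoded_bits_per_second // 8

ASSUMED_MP3_BPS = 128_000  # assumed encoding rate, for illustration only

ratio = compression_ratio(ASSUMED_MP3_BPS)
four_minute_song = encoded_bytes(4 * 60, ASSUMED_MP3_BPS)
```

At the assumed rate the ratio comes out at about 11:1, and a four minute song occupies a little under 4 MB.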


Audio Formats

Downloadable Audio:
For downloadable audio, there is little reason to look any further than MP3. While many other formats exist, many of these are uncompressed or do not offer the level of compression that MP3 does. The MP3 format is now so widely used that nearly all web users have capabilities either within their browser or through external software for playback of MP3 files. Here’s a brief description of audio formats:
- AIFF (.aif, .aiff): Audio Interchange File Format. This is an uncompressed format, and is not used on the web, since file size is approximately 10 MB per minute of audio. This is the default audio format for computers running the Mac OS. It uses the PCM codec.
- AU (.au, .snd): Sun Audio format. This is a moderately compressed format. It was used frequently in the early days of the web, but is no longer practical due to large file size. This is the default audio format for computers running Unix. It uses the u-law codec.
- Wave (.wav): Microsoft’s Wave format should also be avoided for web use, since it is an uncompressed format. This is the default audio format for computers running Windows. It uses the PCM codec.
- Quicktime (.mov): Quicktime is more than an audio format. It is an architecture to store, edit and play multimedia content, such as synchronized graphics, sound, video, and text. Apple’s quicktime software actually supports all of the major audio formats for playback. It uses a proprietary format from Apple Computer.
- MP3 (.mp3): MPEG-1 Audio Layer 3 format (the related .mp2 extension denotes MPEG-1 Audio Layer 2). MP3 has become the undisputed format of choice for downloadable audio. It provides good quality digital audio at a compression ratio of about 10:1. The significance of MP3 cannot be overstated. Currently, there is no conceivable reason to use any format other than MP3 for the delivery of downloadable audio.
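To see what an uncompressed PCM file in the list above contains, the sketch below uses the wave module from Python's standard library to build a very short Wave file in memory and read its settings back. The helper names make_wav_bytes and describe_wav, the 440 Hz test tone, and the 441-frame length are all choices made for this example.

```python
import io
import math
import struct
import wave

def make_wav_bytes(freq_hz=440.0, num_frames=441, sample_rate=44_100):
    """Build a tiny mono, 16-bit PCM Wave file in memory and return its bytes."""
    frames = b"".join(
        struct.pack("<h", int(32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate)))
        for n in range(num_frames)
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)  # samples per second
        wav_file.writeframes(frames)
    return buffer.getvalue()

def describe_wav(data):
    """Return (channels, bit depth, sample rate, frame count) of a Wave file."""
    with wave.open(io.BytesIO(data), "rb") as wav_file:
        return (wav_file.getnchannels(), wav_file.getsampwidth() * 8,
                wav_file.getframerate(), wav_file.getnframes())

wav_bytes = make_wav_bytes()
info = describe_wav(wav_bytes)
```

The file is little more than a short header followed by the raw sample values, which is why its size grows in direct proportion to sample rate, bit depth, channels, and duration.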

Streaming Audio and Encryption:
Streaming audio differs from downloadable audio in that it begins playback almost immediately after being requested. Instead of waiting until the entire song has been downloaded, the audio is "streamed" to the user’s computer, and it continues streaming during playback. Since delivery time is quicker, audio quality is normally poorer with streaming audio. The purpose of streaming audio is generally two-fold – to deliver the audio to the listener with minimal delay, and to prevent the user from obtaining an actual copy of the music.

Encryption technology is a method of preventing the user from making copies of the music they download. The recording industry is currently developing a standard, termed the Secure Digital Music Initiative (SDMI), which will likely be a part of digital music in the coming years. Some encryption formats exist already. Below is a description of some streaming formats and those which use encryption:
- Real Audio (.ra, .ram, .rm): Real Networks pioneered streaming audio with its introduction of Real Audio several years ago. RealPlayer now supports many streaming formats besides RealAudio. This is by far the most popular format for streaming audio, controlling roughly 80% of the market.
- Shockwave Audio (.swa): Shockwave is Macromedia's contribution to web-based audio. It is a streaming audio format which allows you to choose the level of quality for playback, depending on the modem speed of your audience. Shockwave streams a low bit-rate MP3 file with a different file header. Many players can handle Shockwave audio.
- Windows Media Audio (.wma): Windows Media Audio uses a proprietary compression format, and is a relatively late entry into this realm. It is a streaming format and is aimed squarely at RealNetworks’ RealPlayer.
- Liquid Audio: Liquid Audio is a streaming format which utilizes licensed technology from Dolby Labs. But it's more than a streaming format. The goal of Liquid Audio is to allow users to preview music and then purchase it one song at a time. Liquid Audio uses a tracking system to make sure the record company, the publisher and the artist get paid. It is meant to be a one-stop solution for digital downloads over the internet.

MIDI:
MIDI (.mid, .midi): Musical Instrument Digital Interface. MIDI is different from the other formats mentioned, because it really isn’t an audio format. MIDI is a language for computers and musical instruments to talk to each other. A MIDI file does not contain music. It contains instructions for a musical instrument to play a song.
You must have a musical instrument to get music from a MIDI file. Fortunately, many people have a musical instrument built right into their computers – Apple’s Quicktime Musical Instruments is one such example. Other software-based synthesizers exist which are more advanced.
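To show what such instructions look like, here is a sketch of the two most basic MIDI messages, note-on and note-off, as raw bytes. The helper names are invented for this example. A complete MIDI file also contains headers and timing information, which are left out here.

```python
def note_on(channel, note, velocity):
    """Three-byte message telling an instrument to start playing a note.

    channel is 0-15; note and velocity are 0-127.
    """
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """Three-byte message telling an instrument to stop playing a note."""
    return bytes([0x80 | channel, note, 0])

def note_frequency_hz(note):
    """Equal-tempered pitch of a MIDI note number, with note 69 tuned to 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

# Play the A at 440 Hz (note number 69) on the first channel.
start = note_on(0, 69, 64)
stop = note_off(0, 69)
```

Six bytes are enough to start and stop a note, compared with tens of thousands of bytes for even a fraction of a second of sampled audio, and nothing in those bytes says what the instrument should sound like.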

Since MIDI files are so small (a few kB), their use is well-suited for web-based delivery of audio. However, it’s important to recognize that the music may sound different to each user, since it is played on the computer’s synthesizer, which may vary among users. MIDI is best reserved for instrumental works, and only when the selection of the instrument which plays the music is not essential, since you cannot control this. If you want to deliver music which sounds exactly the way you hear it, MIDI should be avoided.

Conversion Between Formats:
Many tools exist for converting between different audio formats. High-end audio editing programs such as Sound Forge and Peak offer the most extensive options. However, less expensive alternatives exist, such as Quicktime, SoundJam, Jukebox and WinAmp. A good resource for keeping up with mp3 music, players, news, and much more is the Lycos MP3 site.


The Future of Web-Based Audio
The distribution of music over the internet will continue to proliferate as broadband access becomes more commonplace - this is a given. The unanswered questions have to do with which format will emerge as the standard and how the recording industry will modify its business model to generate profits from web-based distribution.

The widespread acceptance of MP3 can be credited to its open architecture and grass roots support. MP3 is not really a standard, since no company or organization has branded it as such. While consumers have embraced MP3, the recording industry has taken every opportunity to curtail its use, as demonstrated by its numerous lawsuits, first against Diamond Multimedia when the Rio (portable MP3 player) was introduced, and more recently against mp3.com and Napster. The Recording Industry Association of America (RIAA) lost its lawsuit against Diamond Multimedia, but has had more success in the Napster case, where the court ruled in favor of the RIAA. This decision was appealed, and Napster was allowed to continue operations pending an appeals court ruling.

While the RIAA and large record companies have unanimously opposed web-based distribution of MP3 music, many musicians have found MP3 to be an opportunity, especially independent musicians and unsigned artists. Before the internet and MP3, artists had a difficult time getting their music heard and distributed, unless they had a major record deal. Now, artists can place music on their own web pages and music distribution websites which promote their music. Opportunities exist for musicians on mp3.com, iuma.com, and for LDS musicians on ldsmusician.com. Artists are now finding an audience for their music without the help of big record companies. And even major artists are utilizing these new distribution methods.

While the future of web-based audio holds many unanswered questions, the next three to five years will be exciting to watch as developments take place. In the meantime, musicians should take every opportunity that exists to get exposure for their music, using the internet for such purposes.


 

 

 

 

 




 

© 1999-2006 LDS Music World | Owner, Jefferson Fairbanks, PhD