
Streaming protocols: Ogg, MP3, AAC

2023-06-17

An audio format is a file format for storing digital audio data on a computer. An encoder encodes the audio data according to the file format specification. The data in the file may be compressed or uncompressed.

Audio formats fall into three categories:

Uncompressed audio format: These audio file formats use no compression and are referred to as PCM (Pulse Code Modulation) formats. The file size is large and the bit rate is high, about 1.4 Mb/s for CD audio, which preserves the original sound quality. WAV and AIFF are file formats that support it.
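The 1.4 Mb/s figure can be reproduced with a little arithmetic. The sketch below assumes the usual CD audio parameters (44.1 kHz sample rate, 16 bits per sample, 2 channels), which the text above does not spell out; the function name is only illustrative.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Return the raw PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# Assumed CD audio parameters: 44.1 kHz, 16-bit samples, stereo.
cd_bps = pcm_bit_rate(44_100, 16, 2)
print(cd_bps)              # 1411200
print(cd_bps / 1_000_000)  # 1.4112
```

The result, 1,411,200 bits per second, rounds to the "about 1.4 Mb/s" quoted for CD audio.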

Lossless compression: The audio file is compressed to a smaller size than the original to free disk space and store more music files. Lossless compression does not discard any data, i.e., nothing is lost during compression. It is similar to a zip file, where all the data is restored after decompressing the file. It reduces the file size by about 30 to 50% compared to the original CD audio. It does not degrade the audio quality and reproduces the same audio as the original CD. Some lossless compression formats are FLAC, WavPack (WV), TTA, ATRAC Advanced Lossless, and Apple Lossless (m4a). Lossless formats can be converted to other lossless formats in software without any degradation in audio quality.
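The zip analogy can be tried directly. The sketch below is not an audio codec such as FLAC; it uses Python's general-purpose zlib module on a made-up test signal (one second of a 440 Hz tone) purely to show that a lossless round trip restores every byte.

```python
import math
import struct
import zlib

# Hypothetical test signal: one second of a 440 Hz sine tone, 16-bit mono PCM.
SAMPLE_RATE = 44_100
samples = [
    int(20_000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
    for n in range(SAMPLE_RATE)
]
pcm = struct.pack(f"<{len(samples)}h", *samples)

# Compress at the highest zlib level, then decompress again.
compressed = zlib.compress(pcm, 9)
restored = zlib.decompress(compressed)

print(len(pcm), len(compressed))  # original size vs. compressed size in bytes
print(restored == pcm)            # True: no data was lost
```

Dedicated lossless audio codecs model the waveform itself, so they generally do better on real music than a generic byte compressor would.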

Lossy compression: Lossy compression reduces the file size much further than lossless compression, by about 75-95% of the original uncompressed file size. It applies psychoacoustic compression to remove information that the human ear cannot perceive. Data is lost during compression, which results in a drop in audio quality. Ogg Vorbis, MP3, AAC, and WMA are common lossy formats. The most commonly used is MP3, which at high bit rates sounds close to the original file. The data lost in lossy compression cannot be recovered, so converting between lossy formats is discouraged because it degrades the audio quality further.
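To see where the 75-95% range comes from, the following sketch compares a few lossy bit rates against uncompressed CD audio. The CD bit rate of 1411.2 kbps (44.1 kHz, 16-bit, stereo), the chosen bit rates, and the 4-minute track length are assumptions made for illustration.

```python
CD_BIT_RATE_KBPS = 1411.2  # assumed: 44.1 kHz x 16 bit x 2 channels


def size_reduction_percent(lossy_kbps: float,
                           source_kbps: float = CD_BIT_RATE_KBPS) -> float:
    """Percentage by which the file shrinks relative to the source bit rate."""
    return 100.0 * (1.0 - lossy_kbps / source_kbps)


def file_size_mb(bit_rate_kbps: float, duration_s: float) -> float:
    """Approximate file size in megabytes (10^6 bytes), ignoring container overhead."""
    return bit_rate_kbps * 1000 * duration_s / 8 / 1_000_000


# Hypothetical 4-minute track at a few common lossy bit rates.
for kbps in (64, 128, 192, 320):
    reduction = size_reduction_percent(kbps)
    size = file_size_mb(kbps, 240)
    print(f"{kbps} kbps: {reduction:.1f}% smaller, about {size:.2f} MB")
```

At 128 kbps the file is roughly 91% smaller than the uncompressed original, which sits inside the range quoted above.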

Ogg is a free, open source, unpatented stream container format, i.e., it holds digitally encoded data in a file. Ogg is developed and maintained by the Xiph.Org Foundation and is designed for audio bit streaming and manipulation. Vorbis is the lossy audio codec most often carried in Ogg, and .ogg is the file extension used for the Ogg container format. Ogg Vorbis supports Internet streaming using an Icecast server. libogg, libvorbis, and vorbis-tools are the standard libraries and tools for encoding Ogg Vorbis files, and they support most operating systems. Vorbis can encode at a constant bit rate (CBR) or a variable bit rate (VBR). These encoding methods produce small files with good audio quality compared to other lossy formats: at 110 kbps, and with a smaller file size, Ogg Vorbis is reported to sound better than MP3 at 128 kbps. In spite of these advantages, Ogg is not as widely used as MP3 because fewer devices and players support it natively. VLC media player and XMMS (X Multimedia System) on Linux are players that support .ogg, and because it is an open source format it is also used by video game developers.

MP3 is the most commonly used lossy format for downloading and listening to music. It is supported by Winamp, QuickTime, Windows Media Player, etc. MP3 stands for MPEG-1 Audio Layer III; it uses perceptual audio coding to discard data the listener is unlikely to hear. This lossy format reduces the file size to about 1/11 to 1/14 of the uncompressed format. LAME is a free and open source MP3 encoder that works well at high bit rates. At 190 kbps/44.1 kHz VBR, MP3 sounds similar to the uncompressed format; 128 kbps/44.1 kHz VBR is typical for most MP3 files. MP3 can be streamed live using a SHOUTcast or Icecast server.

AAC (Advanced Audio Coding) is Apple's default lossy format. It is standardized by ISO/IEC as part of the MPEG-2 and MPEG-4 specifications. AAC uses a wideband coding algorithm that provides high voice quality. It is the successor to MP3, with these improvements: more sampling frequencies (8 kHz to 96 kHz), support for up to 48 channels, an efficient MDCT (Modified Discrete Cosine Transform) filter bank for coding, and flexible joint stereo (combining channels that carry similar information to obtain higher quality and smaller files at low bit rates). It provides good audio quality at low bit rates (less than 128 kbps); 96 kbps stereo is found to be satisfactory for quality audio. Compared with MP3, AAC provides better audio quality at low bit rates with smaller files: 128 kbps AAC sounds equivalent to 160 kbps MP3. AAC is the default audio format for Apple's iPhone, iPod, and iTunes, and it is also used by the Sony PlayStation, HDTV, and mobile phones such as those from Sony Ericsson. AAC can be played in KMPlayer, ffdshow, Winamp, VLC, iTunes, Windows Media Player, etc. AAC takes a modular approach to encoding: profiles are defined according to the complexity of the bit stream to be encoded and the performance desired for a particular application. Some audio profiles defined in MPEG-4 are given below.

MPEG-4 AAC - aac

AAC-HE - aach

AAC-LD - aacl

AAC-ELD - aace

AAC-HE v1 - aacp

AAC-HE v2 - aacplus v2

AAC-Spatial - aacs

AAC-ELD-SBR - aacf
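The MDCT filter bank mentioned earlier as one of AAC's improvements can be illustrated with a direct, unoptimized implementation of the textbook formula. This is only a sketch of the transform itself: it assumes no windowing and no fast algorithm, both of which real encoders use, and the function name is illustrative.

```python
import math
from typing import List, Sequence


def mdct(block: Sequence[float]) -> List[float]:
    """Direct MDCT: maps a block of 2N samples to N coefficients.

    X[k] = sum over n of x[n] * cos(pi/N * (n + 1/2 + N/2) * (k + 1/2))
    """
    length = len(block)
    if length == 0 or length % 2:
        raise ValueError("block length must be a positive even number")
    half = length // 2  # N, the number of output coefficients
    coeffs = []
    for k in range(half):
        acc = 0.0
        for n, x in enumerate(block):
            acc += x * math.cos(math.pi / half * (n + 0.5 + half / 2) * (k + 0.5))
        coeffs.append(acc)
    return coeffs


# Eight input samples produce four coefficients.
print(len(mdct([0.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.0])))  # 4
```

Because each block of 2N samples yields only N coefficients, consecutive blocks are overlapped by half in practice, so the transform does not increase the amount of data to be coded.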

AAC-LC (Low Complexity) is the most widely used AAC profile and is part of the MPEG-4 audio profiles. It can use perceptual noise substitution (PNS) to code noise-like parts of the signal compactly. It gives better results than MP3, provides good audio quality at 80 kbps mono and 128 kbps stereo, supports multichannel audio, and is used in applications such as Digital Audio Broadcasting, portable audio systems, and Internet streaming.

HE-AAC (High-Efficiency Advanced Audio Coding) is defined as an MPEG-4 audio profile in ISO/IEC 14496-3. It extends AAC-LC to produce good-quality digital audio at low bit rates. HE-AAC has two versions: (i) HE-AAC v1, also referred to as aacPlus, aacp, or AAC+, and (ii) HE-AAC v2, referred to as aacPlus v2 or eAAC+.

HE-AAC v1 uses spectral band replication (SBR) along with AAC-LC to provide efficient compression at low bit rates. With SBR, the core codec carries the lower frequencies and the decoder reconstructs the high-frequency components from them using a small amount of side information, which allows high quality at low bit rates. AAC+ bit rates range from 32 to 128 kbps; 56 kbps is used for live radio applications. AAC+ is used in Internet, mobile, and broadcast systems such as HD Radio, Digital Radio Mondiale, and XM Satellite Radio. On the tooling side, faac is an open source AAC encoder library, and Darwin Streaming Server can be used for streaming audio.

HE-AAC v2 extends aacp: it uses parametric stereo (PS) along with SBR and AAC-LC to provide efficient audio quality at even lower bit rates, about 16-48 kbps stereo. Parametric stereo is used in MPEG-4 audio to further improve coding efficiency for stereo content at low bandwidth. HE-AAC v2 is described as about 50% more efficient than aacp and provides good-quality audio at bit rates as low as 32 kbps stereo. The libaacplus library can be used as an HE-AAC v2 encoder for efficient audio streaming.
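The bit rate ranges quoted for the three AAC variants can be turned into a small helper for picking a profile from a target stereo bit rate. This is only a sketch: the cut-off points (48 kbps and 128 kbps) are one way to read the overlapping ranges given in this article, not a rule from the MPEG-4 standard, and the function name is hypothetical.

```python
def suggest_aac_profile(stereo_kbps: float) -> str:
    """Suggest an AAC variant for a target stereo bit rate in kbps.

    Thresholds are assumptions derived from the ranges quoted in the text:
    HE-AAC v2 at about 16-48 kbps, HE-AAC v1 at 32-128 kbps, AAC-LC at 128 kbps.
    """
    if stereo_kbps <= 0:
        raise ValueError("bit rate must be positive")
    if stereo_kbps <= 48:
        return "HE-AAC v2"  # parametric stereo + SBR + AAC-LC
    if stereo_kbps < 128:
        return "HE-AAC v1"  # SBR + AAC-LC
    return "AAC-LC"


for kbps in (32, 56, 128):
    print(kbps, suggest_aac_profile(kbps))
```

For the 32-48 kbps region, where the quoted ranges overlap, the sketch prefers HE-AAC v2 because the text describes it as the more efficient of the two at low bit rates.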