MP3 File Structure

A file structure is the layout that makes up a file. A file is the basic unit of a filesystem, and it can hardly be called a “file” if it has no structure. The structure matters because it is what makes the file readable: if a file is not structured properly, I bet the system won’t be able to read it. Each file type also has a standardized structure, so a PDF file named A has the same structure as PDF files B, C, D, and every other PDF. Different file types have different structures: an .avi file is laid out differently from a .doc file.
In this post I’ll try to explain the structure of an MP3 file. 🙂

Who doesn’t know MP3? MP3, or more specifically MPEG-1 or MPEG-2 Audio Layer III, is a patented digital audio encoding format that uses a form of lossy data compression. It is one of the most widely used digital audio formats in the world. When we talk about MP3, we can’t separate it from music. I think most of us listen to music every day, and as you probably know, a lot of that music is encoded in this format.

The MP3 audio format was designed by the Moving Picture Experts Group (MPEG) as part of its MPEG-1 standard and later extended in the MPEG-2 standard. Its lossy compression algorithm is designed to greatly reduce the amount of data needed to represent an audio recording while still sounding like a faithful reproduction of the original uncompressed audio to most listeners.

An MP3 file is made up of multiple MP3 frames, each of which consists of a header and a data block. This sequence of frames is called an elementary stream. The audio data of the frames is not fully independent: because of the Layer III “bit reservoir”, the data for one frame can start in the data block of an earlier frame, so frames cannot be cleanly extracted at arbitrary frame boundaries. The MP3 data blocks contain the compressed audio information, such as frequencies and amplitudes.

From the diagram above we can see that the MP3 header consists of a sync word, which is used to identify the beginning of a valid frame. This is followed by a bit indicating that this is the MPEG standard and two bits indicating that Layer III is used. After this, the values differ depending on the MP3 file. Most MP3 files today contain ID3 metadata, which precedes or follows the MP3 frames, as shown in the diagram.
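
To make the idea of a sync word a bit more concrete, here is a minimal sketch in Python that scans a byte string for the first position that looks like the start of a frame. The function name find_sync is my own, and the check assumes the common convention of treating the first 11 bits (0xFF followed by a byte whose top three bits are set) as the sync pattern. The same bit pattern can also show up by accident inside audio data or a tag, so a real parser would validate the rest of the header as well.

```python
def find_sync(data: bytes, start: int = 0) -> int:
    """Return the offset of the first candidate frame sync, or -1 if none.

    Assumption: a sync is 0xFF followed by a byte whose top 3 bits are 1
    (11 set bits in total). This only finds candidates, not verified frames.
    """
    for i in range(start, len(data) - 1):
        if data[i] == 0xFF and (data[i + 1] & 0xE0) == 0xE0:
            return i
    return -1


if __name__ == "__main__":
    sample = b"\x00\x01\xff\xfb\x90\x00"
    print(find_sync(sample))
```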

The MP3 file structure can also be expressed with this scheme:
[TAG v2] Frame1 Frame2 Frame3… [TAG v1]
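
The two optional tags in the scheme above can be detected with a few byte comparisons. The sketch below is only an illustration, with function names I made up: it assumes an ID3v2 tag starts with the ASCII bytes "ID3" followed by a 10-byte header whose last four bytes hold the tag size as a 28-bit “syncsafe” integer (7 bits per byte), and that an ID3v1 tag is the last 128 bytes of the file and starts with "TAG". The optional ID3v2 footer is ignored to keep it short.

```python
def id3v2_size(data: bytes) -> int:
    """Return the size in bytes of a leading ID3v2 tag (header included), or 0.

    Assumption: 10-byte header = "ID3", 2 version bytes, 1 flags byte,
    4 syncsafe size bytes. A footer, if present, is not counted here.
    """
    if len(data) < 10 or data[:3] != b"ID3":
        return 0
    size = 0
    for b in data[6:10]:
        size = (size << 7) | (b & 0x7F)  # only the low 7 bits of each byte count
    return 10 + size


def has_id3v1(data: bytes) -> bool:
    """Return True if the last 128 bytes look like an ID3v1 tag."""
    return len(data) >= 128 and data[-128:-125] == b"TAG"


if __name__ == "__main__":
    fake = b"ID3\x03\x00\x00\x00\x00\x02\x01" + b"\x00" * 300
    print(id3v2_size(fake), has_id3v1(fake))
```

The first frame would then be searched for starting at the offset returned by id3v2_size.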

An MP3 file is divided into small blocks called frames. Each frame holds a fixed number of samples (1152 for MPEG-1 Layer III), so at a 44.1 kHz sampling rate every frame lasts about 0.026 s. The size of a frame in bytes, however, varies with its bitrate. For example, at 44.1 kHz a frame of a 128 kbps song is normally 417 bytes and a frame of a 192 kbps song is 626 bytes (one byte more when the padding bit is set). The first 4 bytes of each frame are the frame header and the rest is the audio data. TAG is the name for the data space in an MP3 file where text information such as the song name, artist, genre, album, and so on can be stored.
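
The numbers above can be reproduced with a little arithmetic. This is a small sketch, assuming MPEG-1 Layer III (1152 samples per frame) and the commonly quoted frame length formula floor(144 × bitrate / sample rate) + padding; the function names are just for illustration.

```python
SAMPLES_PER_FRAME = 1152  # assumption: MPEG-1 Layer III


def frame_length_bytes(bitrate_bps: int, sample_rate_hz: int, padding: int = 0) -> int:
    """Frame length in bytes, header included, for MPEG-1 Layer III."""
    return 144 * bitrate_bps // sample_rate_hz + padding


def frame_duration_seconds(sample_rate_hz: int) -> float:
    """Playback time covered by one frame."""
    return SAMPLES_PER_FRAME / sample_rate_hz


if __name__ == "__main__":
    print(frame_length_bytes(128_000, 44_100))  # 128 kbps at 44.1 kHz
    print(frame_length_bytes(192_000, 44_100))  # 192 kbps at 44.1 kHz
    print(round(frame_duration_seconds(44_100), 3))
```

The factor 144 is simply 1152 samples divided by 8 bits per byte, which is why the frame duration stays fixed while the byte size follows the bitrate.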

The frame header contains information about the frame (bitrate, stereo mode, and so on). Because every frame carries its own header, each frame can have its own characteristics. This is what makes Variable Bitrate (VBR) files possible, where each frame can have a different bitrate. This is the structure of the frame header. Each letter stands for one bit.

AAAAAAAA AAABBCCD EEEEFFGH IIJJKLMM


A = Frame Synchronizer
B = MPEG version ID
C = Layer
D = CRC Protection
E = Bitrate Index
F = Sampling rate frequency index
G = Padding
H = Private bit
I = Channel
J = Mode Extension
K = Copyright
L = Original
M = Emphasis
You can find more detail here.
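
As a rough illustration of how these fields are packed, here is a sketch that pulls them out of the 4 header bytes with shifts and masks. It assumes the commonly documented 32-bit layout in which A takes 11 bits, E takes 4 bits, B, C, F, I, J and M take 2 bits each, and D, G, H, K and L take 1 bit each. The function name and the dictionary keys are my own, and the raw index values still have to be looked up in the bitrate and sampling rate tables to get real numbers.

```python
import struct


def parse_frame_header(data: bytes) -> dict:
    """Split the first 4 bytes of a frame into the raw header fields A..M.

    Assumption: big-endian 32-bit header, fields packed from the most
    significant bit down in the order of the list above.
    """
    if len(data) < 4:
        raise ValueError("need at least 4 bytes")
    (h,) = struct.unpack(">I", data[:4])
    fields = {
        "sync": (h >> 21) & 0x7FF,          # A, 11 bits
        "version_id": (h >> 19) & 0x3,      # B, 2 bits
        "layer": (h >> 17) & 0x3,           # C, 2 bits
        "protection": (h >> 16) & 0x1,      # D, 1 bit
        "bitrate_index": (h >> 12) & 0xF,   # E, 4 bits
        "sampling_index": (h >> 10) & 0x3,  # F, 2 bits
        "padding": (h >> 9) & 0x1,          # G, 1 bit
        "private": (h >> 8) & 0x1,          # H, 1 bit
        "channel_mode": (h >> 6) & 0x3,     # I, 2 bits
        "mode_extension": (h >> 4) & 0x3,   # J, 2 bits
        "copyright": (h >> 3) & 0x1,        # K, 1 bit
        "original": (h >> 2) & 0x1,         # L, 1 bit
        "emphasis": h & 0x3,                # M, 2 bits
    }
    if fields["sync"] != 0x7FF:
        raise ValueError("no frame sync at this position")
    return fields


if __name__ == "__main__":
    print(parse_frame_header(bytes.fromhex("fffb9000")))
```

For the example bytes FF FB 90 00, the raw values come out as version_id 3, layer 1 and bitrate_index 9, which in the usual lookup tables correspond to MPEG-1, Layer III and 128 kbps.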
