JPEG

YP_bP_r

The first notable thing about JPEG compression is the way in which the colour of each pixel is stored. Each pixel of the image is given 3 bytes to define its colour. Each of the three bytes can have any value from 0 to 255, and every possible combination of the three bytes stands for a different colour. In most file formats, the RGB format is used for defining the colour. RGB stands for Red Green Blue. It is named this way because the first of the three bytes tells you how much red there is in the pixel's colour, the second byte tells you how much green there is, and the third byte tells you how much blue there is. The higher the value of the first byte, the more red the pixel looks.

JPEG also uses three bytes for each pixel, but it uses the YP_bP_r format instead (for digital images, this format is more precisely called YC_bC_r). Here, the colour is split into three separate "channels". The first channel, known as the Y channel, tells us how bright a pixel is. The second channel (P_b) tells us how far the colour leans towards blue, and the third channel (P_r) tells us how far it leans towards red. In this format, the brightness information is stored separately from the colour information. This is useful because the human eye is better at seeing differences in brightness than differences in colour, so we can apply greater compression to the colour channels (the P_b channel and the P_r channel) while keeping acceptable quality.

Images are most often stored in the RGB format, so the first step of JPEG compression is usually to convert the image to the YP_bP_r format.
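The conversion described above can be sketched for a single pixel. This is a minimal example, assuming the full-range conversion formulas commonly used with JPEG files (JFIF); the function name `rgb_to_ycbcr` is made up for illustration.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr), each in 0..255.

    Assumption: full-range JFIF-style weights. Cb and Cr are centred
    on 128, so a grey pixel has Cb = Cr = 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp so that every channel still fits in one byte.
    return tuple(max(0, min(255, round(v))) for v in (y, cb, cr))


print(rgb_to_ycbcr(255, 255, 255))  # white: full brightness, neutral colour
print(rgb_to_ycbcr(255, 0, 0))      # pure red: high value in the red channel
```

Notice that white, grey and black pixels differ only in the Y channel, which is why the two colour channels can be treated separately in the later steps.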

Downsampling

Our eyes can detect brightness more accurately than they can detect colour, so in this stage we discard some of the data from the P_b and P_r channels, for example by storing only one colour value for each small group of neighbouring pixels. This makes these channels smaller, so they take up less space in the file. However, the discarded data can never be recovered, so the quality of the image is reduced.
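One simple way to do this can be sketched as follows. The example assumes that each 2 × 2 group of samples is replaced by its average, which halves both the width and the height of a colour channel; real encoders may use other group sizes or filters, and the name `downsample_2x2` is hypothetical.

```python
def downsample_2x2(channel):
    """Halve width and height by averaging each 2x2 group of samples.

    Assumption: `channel` is a list of rows with even width and height.
    """
    height, width = len(channel), len(channel[0])
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            group = [channel[y + dy][x + dx] for dy in (0, 1) for dx in (0, 1)]
            row.append(round(sum(group) / 4))
        out.append(row)
    return out


pb = [
    [100, 102, 50, 50],
    [98, 100, 50, 50],
]
print(downsample_2x2(pb))  # 8 samples become 2
```

The Y channel is left at full size, so the fine detail in brightness is kept.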

Discrete Cosine Transform

JPEG uses cosine functions to represent an image. Therefore, we are going to talk a little bit about cosine functions. A cosine function is a smooth wave that starts at its highest value, falls to its lowest value, and then rises back to its highest value.

To have the cosine function represent the colour of a pixel, we say that the higher the value of the cosine function, the brighter the pixel. If we had a set of pixels that went bright-dark-bright, we could use the function above to define them.

The function could also have a higher frequency. This means that it rises and falls more often over the same distance.

But here's where it gets interesting. We can also create different functions by combining different cosine functions. If we add the two functions above together, we get a new wave with a shape that is different from both of them.
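The idea of turning cosine waves into pixels can be tried out with a short sketch. It assumes a row of 8 pixels, and the helper names `cosine_wave` and `to_brightness` are made up for this example.

```python
import math


def cosine_wave(frequency, n_samples=8):
    """Sample a cosine wave at n_samples evenly spaced points."""
    return [math.cos(2 * math.pi * frequency * i / n_samples)
            for i in range(n_samples)]


def to_brightness(wave, amplitude=1.0):
    """Map values in [-amplitude, amplitude] to brightness 0..255."""
    return [round((v + amplitude) / (2 * amplitude) * 255) for v in wave]


low = cosine_wave(1)   # one slow swing: bright, dark, bright again
high = cosine_wave(3)  # a higher frequency: swings three times as often
combined = [a + b for a, b in zip(low, high)]  # the two waves added together

print(to_brightness(low))
print(to_brightness(high))
print(to_brightness(combined, amplitude=2.0))
```

Each printed list is one row of pixel brightness values. The third row does not look like either of the first two, even though it is built only from them.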

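In JPEG, each channel is split into blocks of 8 × 8 pixels, and a Discrete Cosine Transform (DCT) is applied to each block. The DCT re-represents the block as a combination of cosine waves: instead of 64 pixel values, we get 64 numbers that say how much of each cosine wave is in the block.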
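A direct (and slow) version of this transform can be sketched as below. It is an illustration of the two-dimensional DCT-II with the usual orthonormal scaling, not the optimised code a real encoder would use; the name `dct_2d` is hypothetical, and the input is assumed to have had 128 subtracted from every sample first, as JPEG does.

```python
import math

N = 8  # JPEG block size


def dct_2d(block):
    """Return the 8x8 DCT-II coefficients of an 8x8 block of samples."""
    def scale(k):
        # Orthonormal scaling: the k = 0 wave is constant, so it is
        # weighted differently from the others.
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


flat = [[10] * N for _ in range(N)]  # a block where every sample is the same
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # only the zero-frequency coefficient is non-zero
```

For a block with no detail, all the information ends up in a single number. Blocks from real photographs behave in a similar way: most of the large numbers belong to the low frequencies.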

Quantisation

In this step, we compress the image by removing information our eyes cannot easily see. For every block of 8 × 8 pixels, each cosine wave is stored with more or less precision, depending on how clearly our eyes can see it. Our eyes are best at seeing low-frequency data, so we store these frequencies with more precision. Our eyes are less sensitive to high-frequency data, so we can store it with less precision. However, because this step destroys data, the quality of the decompressed image is reduced.
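In practice, "less precision" means dividing each number by a step size and rounding the result. The sketch below assumes a made-up table in which the step size grows with the frequency; real JPEG files carry their own tables, and the names `make_table`, `quantise` and `dequantise` are hypothetical.

```python
def make_table(size=8, base=8, growth=4):
    """Hypothetical step sizes: small for low frequencies, larger for high."""
    return [[base + growth * (u + v) for v in range(size)] for u in range(size)]


def quantise(coeffs, table):
    """Divide each coefficient by its step size and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def dequantise(levels, table):
    """Approximate the original coefficients (what a decoder can recover)."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]


# A tiny 2x2 corner of a block, to keep the output readable.
coeffs = [[80.0, 13.0], [-7.0, 3.0]]
table = make_table(size=2)
levels = quantise(coeffs, table)
print(levels)
print(dequantise(levels, table))
```

The recovered values are close to the originals but not equal to them, and the smallest high-frequency value has become 0. This rounding is where the data is lost.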

Many values will now be 0, which means they can be compressed very easily. This is done using Huffman coding, which is the last step of JPEG compression. It is also the only step in which the data is made smaller without losing any information.
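Huffman coding gives short bit patterns to values that appear often and longer ones to values that are rare. The sketch below builds such a code for a list of raw values. This is a simplification: real JPEG files apply Huffman coding to a more elaborate set of symbols, and the name `huffman_code` is made up for this example.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: bits_so_far}).
    # The unique tie_breaker stops Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups, prefixing their codes.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + bits for s, bits in left.items()}
        merged.update({s: "1" + bits for s, bits in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


values = [0] * 12 + [1, 1, -1, 10]  # mostly zeros, as after quantisation
code = huffman_code(values)
total_bits = sum(len(code[v]) for v in values)
print(code)
print(total_bits, "bits instead of", 8 * len(values))
```

Because every original value can be read back exactly from the bit patterns, nothing is lost in this step.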
