JPEG learning notes 5 (with complete code)

Time: 2021-2-23

The fourth step of JPEG compression is Huffman coding. These notes cover which symbols JPEG Huffman-codes and how to generate the Huffman tables.

The figures are taken from "Compressed Image File Formats: JPEG, PNG, GIF, XBM, BMP" by John Miano [1].

1. Huffman symbols for AC data

For AC data, each symbol to be encoded is one byte. The high four bits give the number of zeros that precede the coefficient, and the low four bits give the magnitude of the coefficient, that is, the number of bits needed to represent its absolute value.

AC data is encoded in zigzag order.

Take the following figure as an example. The first nonzero value is 1: it has 0 zeros in front of it and a magnitude of 1 (one bit is enough to represent 1), so the symbol to encode is 0x01.

The next nonzero value is 3: it has 5 zeros in front of it and a magnitude of 2 (3 is 11 in binary), so the symbol to encode is 0x52.

The remaining coefficients are handled in the same way.
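The two examples above can be reproduced with a few lines of C++. This is a minimal sketch: magnitudeOf and makeAcSymbol are hypothetical names used only for illustration, and magnitudeOf is my assumption of what the getBinaryLengthByValue helper in the code later in these notes computes (its implementation is not shown here).

```cpp
#include <cstdint>
#include <cstdlib>

// Number of bits needed to represent |value| (the "magnitude" in the text).
// 0 -> 0, +-1 -> 1, +-2..+-3 -> 2, +-4..+-7 -> 3, and so on.
inline std::uint8_t magnitudeOf(int value) {
    unsigned int v = static_cast<unsigned int>(std::abs(value));
    std::uint8_t bits = 0;
    while (v != 0) {
        bits++;
        v >>= 1;
    }
    return bits;
}

// High four bits: number of zeros in front of the coefficient (0..15).
// Low four bits: magnitude of the coefficient.
inline std::uint8_t makeAcSymbol(unsigned int zeroRun, int value) {
    return static_cast<std::uint8_t>((zeroRun << 4) | magnitudeOf(value));
}
```

With these helpers, makeAcSymbol(0, 1) gives 0x01 and makeAcSymbol(5, 3) gives 0x52, matching the two symbols worked out above.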

There are only two special cases:

0x00 (end of block) means that all the remaining coefficients in the block are 0.

0xF0 represents a run of 16 zeros.
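To see how the ordinary symbols and the two special cases fit together, here is a sketch that turns one block into its list of AC symbols. It assumes the 64 coefficients are already in zigzag order with the DC value at index 0; acSymbols is a hypothetical name and not part of the code in these notes.

```cpp
#include <array>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Assumption: zz holds one block in zigzag order, zz[0] is the DC value.
inline std::vector<std::uint8_t> acSymbols(const std::array<int, 64>& zz) {
    std::vector<std::uint8_t> symbols;

    // Find the last nonzero AC coefficient (0 means all AC values are zero).
    int last = 63;
    while (last > 0 && zz[last] == 0) {
        last--;
    }

    unsigned int zeroRun = 0;
    for (int k = 1; k <= last; k++) {
        if (zz[k] == 0) {
            zeroRun++;
            if (zeroRun == 16) {
                symbols.push_back(0xF0);  // a run of 16 zeros
                zeroRun = 0;
            }
            continue;
        }
        // Magnitude: number of bits needed for the absolute value.
        unsigned int v = static_cast<unsigned int>(std::abs(zz[k]));
        unsigned int magnitude = 0;
        while (v != 0) {
            magnitude++;
            v >>= 1;
        }
        symbols.push_back(static_cast<std::uint8_t>((zeroRun << 4) | magnitude));
        zeroRun = 0;
    }

    // Everything after the last nonzero coefficient is zero: end of block.
    if (last < 63) {
        symbols.push_back(0x00);
    }
    return symbols;
}
```

For a block whose only nonzero AC value is a 1 at zigzag index 20, this yields 0xF0 (16 zeros), 0x31 (3 more zeros, then a value of magnitude 1) and 0x00. Note that the sketch emits 0x00 whenever the block ends in zeros, even if fewer than 16 of them remain, which is how I understand the end-of-block rule.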

 

 

The total number of symbols = 16 (possible zero-run lengths, 0 to 15) * 10 (possible magnitudes, 1 to 10) + 2 (special cases) = 162.
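The count can be checked by simply enumerating the symbols. A small sketch (allAcSymbols is a hypothetical name):

```cpp
#include <cstdint>
#include <vector>

// Lists every AC symbol: 16 zero-run lengths x 10 magnitudes, plus 0x00 and 0xF0.
inline std::vector<std::uint8_t> allAcSymbols() {
    std::vector<std::uint8_t> symbols;
    for (unsigned int run = 0; run < 16; run++) {
        for (unsigned int magnitude = 1; magnitude <= 10; magnitude++) {
            symbols.push_back(static_cast<std::uint8_t>((run << 4) | magnitude));
        }
    }
    symbols.push_back(0x00);  // all remaining coefficients are 0
    symbols.push_back(0xF0);  // 16 zeros
    return symbols;
}
```

The returned vector has 162 entries.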

2. Huffman symbol of DC data

DC data is stored as a difference: the DC value of the current block minus the DC value of the previous block.

There are 12 DC symbols in total. The symbol is the magnitude (bit length) of the difference, which ranges from 0 to 11.
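As a rough illustration, the DC symbols for a sequence of blocks can be computed as follows. This is a sketch under two assumptions: the DC values of consecutive blocks of one component are given in coding order, and the "previous" value starts at 0. dcSymbols is a hypothetical name.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

// Assumption: dcValues are the DC values of consecutive blocks of one component.
inline std::vector<std::uint8_t> dcSymbols(const std::vector<int>& dcValues) {
    std::vector<std::uint8_t> symbols;
    int previous = 0;  // the previous DC value is taken as 0 for the first block
    for (int dc : dcValues) {
        int difference = dc - previous;
        previous = dc;
        // The symbol is the number of bits needed for |difference|.
        unsigned int v = static_cast<unsigned int>(std::abs(difference));
        std::uint8_t magnitude = 0;
        while (v != 0) {
            magnitude++;
            v >>= 1;
        }
        symbols.push_back(magnitude);
    }
    return symbols;
}
```

For the DC values 10, 12, 12, -5 the differences are 10, 2, 0, -17, so the symbols are 4, 2, 0, 5.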

 

3. JPEG default Huffman encoding

JPEG provides default Huffman tables, which work reasonably well for most images [2].

The tables are quoted from https://www.impulseadventure.com/photo/optimized-jpeg.html.
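A JPEG Huffman table is stored as 16 counts (how many codes there are of each length from 1 to 16) followed by the symbols in order of increasing code length. The sketch below rebuilds the codes from that form. The names HuffmanCode, buildCodes, kLumaDcBits and kLumaDcValues are hypothetical, and the numbers are the default luminance DC table as I know it from Annex K of the JPEG standard; please check them against [2] before relying on them.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

struct HuffmanCode {
    std::uint8_t symbol;
    std::uint8_t length;  // code length in bits
    std::uint16_t code;   // the code, right-aligned
};

// Default luminance DC table (assumed values, see the note above).
static const std::array<std::uint8_t, 16> kLumaDcBits =
    {0, 1, 5, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0};
static const std::vector<std::uint8_t> kLumaDcValues =
    {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};

// bits[n - 1] is the number of codes of length n; values lists the symbols
// in order of increasing code length.
inline std::vector<HuffmanCode> buildCodes(const std::array<std::uint8_t, 16>& bits,
                                           const std::vector<std::uint8_t>& values) {
    std::vector<HuffmanCode> codes;
    std::uint16_t code = 0;
    std::size_t next = 0;
    for (unsigned int length = 1; length <= 16; length++) {
        for (unsigned int n = 0; n < bits[length - 1] && next < values.size(); n++) {
            codes.push_back({values[next], static_cast<std::uint8_t>(length), code});
            next++;
            code++;
        }
        code <<= 1;  // moving to the next length appends a 0 bit
    }
    return codes;
}
```

Calling buildCodes(kLumaDcBits, kLumaDcValues) gives 12 codes: symbol 0 gets the 2-bit code 00, symbols 1 to 5 get the 3-bit codes 010 to 110, and symbol 11 gets the 9-bit code 111111110. No symbol receives a code consisting only of 1 bits.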

You can also generate the Huffman tables from the statistics of the image itself. The code is as follows:

void JPG::huffmanCoding() {
    /*****************************************Create YDC table*********************************************/
    int lastYDC = 0;
    uint componentID = 1;
    //Count the DC symbols of the Y component
    for (uint i = 0; i < mcuHeight; i++) {
        for (uint j = 0; j < mcuWidth; j++) {
            MCU& currentMCU = data[i * mcuWidth + j];
            //Traverse every Y block in the MCU
            for(uint ii = 0; ii < getVerticalSamplingFrequency(componentID); ii++) {
                for(uint jj = 0; jj < getHorizontalSamplingFrequency(componentID); jj++) {
                    Block& currentBlock = currentMCU[componentID][ii * getHorizontalSamplingFrequency(componentID) + jj];
                    int difference = currentBlock[0] - lastYDC; // the DC component is encoded as a difference
                    lastYDC = currentBlock[0];
                    byte symbol = getBinaryLengthByValue(difference); // the binary length of the difference is the symbol
                    yDC.countOfSymbol[symbol]++;
                }
            }
        }
    }
    yDC.generateHuffmanCode(); 
    /*****************************************Create YAC table*********************************************/
    for (uint i = 0; i < mcuHeight; i++) {
        for (uint j = 0; j < mcuWidth; j++) {
            MCU& currentMCU = data[i * mcuWidth + j];
            //Traverse every Y block in the MCU
            for(uint ii = 0; ii < getVerticalSamplingFrequency(componentID); ii++) {
                for(uint jj = 0; jj < getHorizontalSamplingFrequency(componentID); jj++) {
                    Block& currentBlock = currentMCU[componentID][ii * getHorizontalSamplingFrequency(componentID) + jj];
                    uint numZero = 0;
                    for(uint k = 1; k < 64; k++) {
                        if(currentBlock[ZIG_ZAG[k]] == 0) {
                            numZero++;
                            if(numZero == 16) {
                                if(isRemainingAllZero(currentBlock, k + 1)) {
                                    yAC.countOfSymbol[0x00]++;
                                    break;
                                } else {
                                    yAC.countOfSymbol[0xF0]++; // 16 zeros
                                    numZero = 0;
                                }
                            }
                        } else {
                            byte lengthOfCoefficient = getBinaryLengthByValue(currentBlock[ZIG_ZAG[k]]);
                            byte symbol = (numZero << 4) + lengthOfCoefficient;
                            yAC.countOfSymbol[symbol]++;
                            numZero = 0;
                        }
                    }
                }
            }
        }
    }
    yAC.generateHuffmanCode();
    /*****************************************Create chromaDC table*********************************************/

    for(uint componentID = 2; componentID <=3; componentID++) {
        int lastChromaDC = 0; //each component has its own previous DC value, which starts at 0
        for (uint i = 0; i < mcuHeight; i++) {
            for (uint j = 0; j < mcuWidth; j++) {
                MCU& currentMCU = data[i * mcuWidth + j];
                //Traverse every block of the current chroma component (Cb or Cr) in the MCU
                for(uint ii = 0; ii < getVerticalSamplingFrequency(componentID); ii++) {
                    for(uint jj = 0; jj < getHorizontalSamplingFrequency(componentID); jj++) {
                        Block& currentBlock = currentMCU[componentID][ii * getHorizontalSamplingFrequency(componentID) + jj];
                        int difference = currentBlock[0] - lastChromaDC; // the DC component is encoded as a difference
                        lastChromaDC = currentBlock[0];
                        byte symbol = getBinaryLengthByValue(difference); // the binary length of the difference is the symbol
                        chromaDC.countOfSymbol[symbol]++;
                    }
                }
            }
        }
    }
    chromaDC.generateHuffmanCode();
    /*****************************************Create chromaAC table*********************************************/
    for(uint componentID = 2; componentID <=3; componentID++) {
        for (uint i = 0; i < mcuHeight; i++) {
            for (uint j = 0; j < mcuWidth; j++) {
                MCU& currentMCU = data[i * mcuWidth + j];
                //Traverse every block of the current chroma component (Cb or Cr) in the MCU
                for(uint ii = 0; ii < getVerticalSamplingFrequency(componentID); ii++) {
                    for(uint jj = 0; jj < getHorizontalSamplingFrequency(componentID); jj++) {
                        Block& currentBlock = currentMCU[componentID][ii * getHorizontalSamplingFrequency(componentID) + jj];
                        uint numZero = 0;
                        for(uint k = 1; k < 64; k++) {
                            if(currentBlock[ZIG_ZAG[k]] == 0) {
                                numZero++;
                                if(numZero == 16) {
                                    if(isRemainingAllZero(currentBlock, k + 1)) {
                                        chromaAC.countOfSymbol[0x00]++;
                                        break;
                                    } else {
                                        chromaAC.countOfSymbol[0xF0]++; // 16 zeros
                                        numZero = 0;
                                    }
                                }
                            } else {
                                byte lengthOfCoefficient = getBinaryLengthByValue(currentBlock[ZIG_ZAG[k]]);
                                byte symbol = (numZero << 4) + lengthOfCoefficient;
                                chromaAC.countOfSymbol[symbol]++;
                                numZero = 0;
                            }
                        }
                    }
                }
            }
        }
    }
    chromaAC.generateHuffmanCode();

}
void generateHuffmanCode() {
        std::vector<LinkedSymbol> symbols;
        //Add every symbol that occurs at least once to the vector
        for(uint symbol = 0; symbol < 256; symbol++) {
            if(countOfSymbol[symbol] == 0) 
                continue;
            Symbol* s = new Symbol(symbol, countOfSymbol[symbol], 0, nullptr);
            LinkedSymbol linkedSymbol;
            linkedSymbol.symbol = s;
            linkedSymbol.weight = s->weight;
            symbols.push_back(linkedSymbol);
        }
        
        
        //0xFF never occurs as a real symbol, so it is used as a dummy symbol. Adding it prevents any real symbol from getting a code made up entirely of 1 bits (such as 11111), which reduces the chance of 0xFF bytes appearing in the compressed data
        Symbol* dummySymbol = new Symbol(0xFF, 1, 0, nullptr); 
        LinkedSymbol dummyLinkedSymbol;
        dummyLinkedSymbol.symbol = dummySymbol;
        dummyLinkedSymbol.weight = dummySymbol->weight;
        symbols.push_back(dummyLinkedSymbol);
        
        

        //The process of merging
        while(symbols.size() != 1) {
            
            //leastWeight
            LinkedSymbol least = getLeastWeightLinkedSymbol(symbols);
            //second Least Weight
            LinkedSymbol second = getLeastWeightLinkedSymbol(symbols);
            //add two weights
            least.weight = least.weight + second.weight;

            //Link the two symbol lists: append least's list to the end of second's list
            Symbol* temp = second.symbol;
            while(temp->nextSymbol != nullptr)
                temp = temp->nextSymbol;
            temp->nextSymbol = least.symbol;
            least.symbol = second.symbol;
            //Add 1 to the code length of every symbol in the merged list
            for(auto i = least.symbol; i != nullptr; i = i->nextSymbol) {
                i->codeLength++;
            }
            symbols.push_back(least);
        }

        //Copy the symbols into sortedSymbol
        for(Symbol* i = symbols[0].symbol; i != nullptr; i = i->nextSymbol) {
            sortedSymbol.push_back(*i);
        }

        //Sort, and put dummy symbol at the end;
        std::sort(sortedSymbol.begin(), sortedSymbol.end(), comp);

        //Free memory
        Symbol* temp = symbols[0].symbol;
        while(temp != nullptr) {
            auto t = temp->nextSymbol;
            delete temp;
            temp = t;
        }

        //Count how many codes there are of each length
        for (auto it = sortedSymbol.cbegin(); it != sortedSymbol.cend(); it++) {
            codeCountOfLength[it->codeLength]++;
        }

        //Code lengths must not be greater than 16; this applies the method from the book [1]
        for(uint ii = 32; ii >= 17; ii--) {
            while(codeCountOfLength[ii] != 0) {
                uint jj = ii - 2;
                while(codeCountOfLength[jj] == 0)
                    jj--;
                codeCountOfLength[ii] = codeCountOfLength[ii] - 2;
                codeCountOfLength[ii - 1] = codeCountOfLength[ii - 1] + 1;
                codeCountOfLength[jj + 1] = codeCountOfLength[jj + 1] + 2;
                codeCountOfLength[jj] = codeCountOfLength[jj] - 1;
            }
        }

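        uint index = 1; // current code length
        //Reassign code lengths to the sorted symbols from the adjusted counts
        for (auto it = sortedSymbol.begin(); it != sortedSymbol.end(); ) {
            if(codeCountOfLength[index] != 0) {
                it->codeLength = index;
                codeCountOfLength[index]--;
                it++;
            } else {
                index++; // no codes of this length are left, try the next length
            }
        }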

        
        //Generate the Huffman code for each symbol
        uint huffmanCode = 0;
        uint currentLength = 1;
        for (auto it = sortedSymbol.begin(); it != sortedSymbol.end(); ) {
            if(currentLength == it->codeLength) {
                it->code = huffmanCode++;
                codeOfSymbol[it->symbol] = it->code;
                codeLengthOfSymbol[it->symbol] = it->codeLength;
                it++;
            } else {
                huffmanCode = huffmanCode << 1; // move to the next code length
                currentLength++;
            }
        }
    }
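
The step in generateHuffmanCode that keeps code lengths at 16 or below is the hardest part to follow, so here it is once more as a standalone sketch. It mirrors the loop above but works on a plain array; limitCodeLengths is a hypothetical name. The idea, as I understand it: the two longest codes of length i share a prefix of length i - 1. One symbol takes over that prefix, and the other one is moved next to a symbol with a shorter code of length j, whose code is extended by one bit to make two codes of length j + 1.

```cpp
#include <array>

// counts[n] is the number of codes of length n (n = 1..32, index 0 is unused).
// Assumption: the counts come from a valid Huffman tree, so codes of the
// greatest length come in pairs and some shorter code always exists.
inline void limitCodeLengths(std::array<unsigned int, 33>& counts) {
    for (unsigned int i = 32; i > 16; i--) {
        while (counts[i] > 0) {
            // Find the longest code that is at least two bits shorter.
            unsigned int j = i - 2;
            while (counts[j] == 0) {
                j--;
            }
            counts[i] -= 2;      // remove the pair of length i
            counts[i - 1] += 1;  // one of them takes the shared prefix
            counts[j + 1] += 2;  // the code of length j is split in two
            counts[j] -= 1;
        }
    }
}
```

For example, one code of length 15 plus two codes of length 17 become three codes of length 16. The number of symbols stays the same and no code is longer than 16 bits afterwards.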

The complete code is at https://github.com/Cheemion/JPEG_COMPRESS/tree/main/Day5

end

 Thanks for reading.


References

[1] John Miano, Compressed Image File Formats: JPEG, PNG, GIF, XBM, BMP. https://github.com/Cheemion/JPEG_COMPRESS/blob/main/resource/Compressed%20Image%20File%20Formats%20JPEG%2C%20PNG%2C%20GIF%2C%20XBM%2C%20BMP%20-%20John%20Miano.pdf

[2] https://www.impulseadventure.com/photo/optimized-jpeg.html