A Real Fix for Garbled Chinese Text When Scanning Codes with ZXing on Android

Time:2022-8-6

I ran into garbled Chinese text when scanning codes with ZXing (3.2.1) on Android once before, searched the Internet, and fixed it. Today the problem came back: the text was still garbled.

After digging into it, my conclusions are as follows:

ZXing lets you put a default character set into the decode hints (DecodeHintType.CHARACTER_SET). This character set is used to interpret the raw byte data.

1. If the QR code itself does not specify a character set, the default from the hints is used.

2. If the QR code does specify a character set, that character set is used.

Specifying a character set in the QR code is not mandatory, and Chinese content is mainly encoded in one of two ways: GBK or UTF-8.

Note that many posts on the Internet recommend the "ISO-8859-1" character set. It is a simple single-byte character set, one character per byte, and it cannot represent Chinese.

In some cases you can exploit this single-byte property to convert between bytes and characters without loss.

But because of rule 2 above, using it to get the bytes back is unreliable. If the QR code specifies UTF-8, the returned string has already been decoded correctly. Calling getBytes("ISO-8859-1") on it garbles the text, because ISO-8859-1 cannot represent the Chinese characters and each one is replaced with a ? sign.
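The two cases can be reproduced in plain Java without ZXing. The following is a minimal sketch; the class name, method names, and sample text are illustrative and not part of any library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTrip {
    static final Charset GBK = Charset.forName("GBK");

    // Case A: the raw bytes were decoded as ISO-8859-1, so every byte became
    // exactly one char and getBytes(ISO_8859_1) restores the original bytes.
    static boolean survivesWhenDecodedAsLatin1(String text) {
        byte[] original = text.getBytes(GBK);
        String latin1 = new String(original, StandardCharsets.ISO_8859_1);
        return Arrays.equals(original, latin1.getBytes(StandardCharsets.ISO_8859_1));
    }

    // Case B: the text was already decoded correctly (rule 2). ISO-8859-1 has
    // no Chinese characters, so each one is encoded as the byte for '?'.
    static String latin1BytesOfDecodedText(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587"; // the two characters of "Chinese" (zhong wen)
        System.out.println(survivesWhenDecodedAsLatin1(chinese)); // true
        System.out.println(latin1BytesOfDecodedText(chinese));    // ??
    }
}
```

The ISO-8859-1 trick only works in case A, and the caller cannot tell from the String alone which case applies.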

QR code generators differ: some specify a character set and some do not, the content itself may be UTF-8 or GBK, and binary information can be lost when the bytes are converted to a String.

So the right approach is to get hold of the original byte array, work out which character set it is in, and decode it with that character set. Recovering the bytes through a String conversion should be avoided.
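To see why the String route is unsafe, here is a small sketch (plain Java, hypothetical class and method names) that decodes bytes with an assumed character set and checks whether the original bytes can still be recovered from the resulting String.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LossyDecodeDemo {
    // True if decoding with `assumed` and encoding again gives back the same bytes.
    static boolean roundTripPreservesBytes(byte[] raw, Charset assumed) {
        String decoded = new String(raw, assumed);
        return Arrays.equals(raw, decoded.getBytes(assumed));
    }

    // True if decoding produced U+FFFD, the replacement for malformed input.
    static boolean containsReplacementChar(byte[] raw, Charset assumed) {
        return new String(raw, assumed).indexOf('\uFFFD') >= 0;
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        byte[] gbkBytes = "\u4e2d\u6587".getBytes(gbk); // sample Chinese text in GBK

        // Correct guess: nothing is lost.
        System.out.println(roundTripPreservesBytes(gbkBytes, gbk));                    // true
        // Wrong guess: malformed sequences are replaced and the bytes are gone.
        System.out.println(roundTripPreservesBytes(gbkBytes, StandardCharsets.UTF_8)); // false
        System.out.println(containsReplacementChar(gbkBytes, StandardCharsets.UTF_8)); // true
    }
}
```

Once the replacement characters are in the String, no later conversion can bring the original bytes back.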

The solution is also relatively simple. After reading the source code, I found that the scanning result actually contains this information.

// In the source code, the QRCodeReader method
//     public final Result decode(BinaryBitmap image, Map hints)
// puts some extra information into the Result metadata, which is a Map.
……
result.putMetadata(ResultMetadataType.BYTE_SEGMENTS,byteSegments);
……

This result is the scan result itself. In the sample project, it is passed to the following callback in MipcaActivityCapture:

public void handleDecode(Result result, Bitmap barcode)

byteSegments holds the original binary data, so the encoding can be determined from the bytes directly. Note that it may be null: for example, if the scanner also supports 1D barcodes, their results carry no such metadata.

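    // Requires: java.nio.charset.Charset, java.util.List, java.util.Map
    Map<ResultMetadataType, Object> metadata = result.getResultMetadata();
    @SuppressWarnings("unchecked")
    List<byte[]> byteSegments = metadata == null ? null
            : (List<byte[]>) metadata.get(ResultMetadataType.BYTE_SEGMENTS);
    if (byteSegments == null || byteSegments.isEmpty()) {
        // No raw bytes (e.g. a 1D barcode): fall back to the text ZXing decoded
        resultString = result.getText();
    } else {
        StringBuilder decoded = new StringBuilder();
        for (byte[] segment : byteSegments) {
            // Guess the encoding: UTF-8 if the bytes look like UTF-8, otherwise GBK
            Charset charset = Charset.forName(isUtf8(segment) ? "UTF-8" : "GBK");
            decoded.append(new String(segment, charset));
        }
        resultString = decoded.toString();
    }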

isUtf8 is a helper function copied from the Internet:


    // Heuristic: returns true if every byte sequence in the buffer has the shape of valid UTF-8.
    public static boolean isUtf8(byte[] buffer) {
        boolean isUtf8 = true;
        int end = buffer.length;
        for (int i = 0; i < end; i++) {
            byte temp = buffer[i];
            if ((temp & 0x80) == 0) {// 0xxxxxxx
                continue;
            } else if ((temp & 0xC0) == 0xC0 && (temp & 0x20) == 0) {// 110xxxxx 10xxxxxx
                if (i + 1 < end && (buffer[i + 1] & 0x80) == 0x80 && (buffer[i + 1] & 0x40) == 0) {
                    i = i + 1;
                    continue;
                }
            } else if ((temp & 0xE0) == 0xE0 && (temp & 0x10) == 0) {// 1110xxxx 10xxxxxx 10xxxxxx
                if (i + 2 < end && (buffer[i + 1] & 0x80) == 0x80 && (buffer[i + 1] & 0x40) == 0
                        && (buffer[i + 2] & 0x80) == 0x80 && (buffer[i + 2] & 0x40) == 0) {
                    i = i + 2;
                    continue;
                }
            } else if ((temp & 0xF0) == 0xF0 && (temp & 0x08) == 0) {// 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
                if (i + 3 < end && (buffer[i + 1] & 0x80) == 0x80 && (buffer[i + 1] & 0x40) == 0
                        && (buffer[i + 2] & 0x80) == 0x80 && (buffer[i + 2] & 0x40) == 0
                        && (buffer[i + 3] & 0x80) == 0x80 && (buffer[i + 3] & 0x40) == 0) {
                    i = i + 3;
                    continue;
                }
            }
            isUtf8 = false;
            break;
        }
        return isUtf8;
    }
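As an alternative to the hand-written check, the JDK's own UTF-8 decoder can be asked to report malformed input instead of replacing it. The sketch below is not the method used in this post; the class and method names are hypothetical, and it keeps the same assumption that anything which is not valid UTF-8 is GBK.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.List;

class SegmentDecoder {
    // Returns the decoded text if the bytes are well-formed UTF-8, otherwise null.
    static String tryUtf8(byte[] bytes) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    // Decodes each byte segment as UTF-8 when possible, falling back to GBK
    // (an assumption carried over from the loop above).
    static String decodeSegments(List<byte[]> segments) {
        Charset gbk = Charset.forName("GBK");
        StringBuilder sb = new StringBuilder();
        for (byte[] segment : segments) {
            String utf8 = tryUtf8(segment);
            sb.append(utf8 != null ? utf8 : new String(segment, gbk));
        }
        return sb.toString();
    }
}
```

A list of segments taken from BYTE_SEGMENTS could be passed straight to decodeSegments. The JDK decoder should also be somewhat stricter than the bit-pattern check, for instance about overlong encodings, which may reduce false positives on GBK data.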


Problem solved.