The trouble caused by translation: the checksum of IP, ICMP, TCP and UDP packets

Time:2021-8-10

1、 Network protocol checksum definition

RFC 1071 defines the checksum as follows:

(1) Adjacent octets to be checksummed are paired to form 16-bit integers, and the 1’s complement sum of these 16-bit integers is formed.
(2) To generate a checksum, the checksum field itself is cleared, the 16-bit 1’s complement sum is computed over the octets concerned, and the 1’s complement of this sum is placed in the checksum field.
(3) To check a checksum, the 1’s complement sum is computed over the same set of octets, including the checksum field. If the result is all 1 bits (-0 in 1’s complement arithmetic), the check succeeds.

The IP checksum is the 16-bit ones’ complement of the ones’ complement sum of all 16-bit words in the header.
A question many people may ask is “what is a 1’s complement sum?”, because nearly all computers today use two’s complement rather than ones’ complement. The following is a brief introduction.

So what are ones’ complement and two’s complement?

1.1 What is ones’ complement (inverse code, 1’s complement)?

Ones’ complement is also known as the inverse code. It is called the “inverse code” because the representation of a negative number can be obtained by inverting, bit by bit, the representation of the corresponding positive number (replace every 0 with 1 and every 1 with 0). For example, the decimal number 6 is 00000110 as an 8-bit binary number, and -6 is 11111001.

The name “inverse code” directly reflects how the representation looks and is easy to understand. The name “1’s complement”, however, is confusing, and it is not obvious what it means. The translation is to blame. The English name of the inverse code is “ones’ complement” (note: not “one’s complement”), meaning the complement with respect to “(plural) ones”. Because Chinese nouns do not mark singular and plural through inflection, the meaning of the name is lost in translation. Taking the example above, if we add the ones’ complement representations of +6 and -6 using binary addition, we get “a string of ones”:
00000110 + 11111001 = 11111111

On reflection, the sum of the ones’ complement representations of any pair of opposite numbers is a string of ones, so the n-bit ones’ complement of a negative number -x can be defined as:
(2^n - 1) - x, that is, 11...1 (n ones) minus x
This is why the inverse code is called the complement of (plural) ones.

1.2 What is two’s complement (complement code, 2’s complement)?

By analogy, would “two’s complement” then be the complement with respect to a string of twos?

No! The digit 2 does not appear in binary numbers. In fact, the English name is “two’s complement” (not “twos’ complement”), so it is the complement with respect to “(a single) two”, which is not parallel to “ones’ complement”. The n-bit two’s complement of a negative number -x is defined as:
2^n - x
“Two” here refers to the 2 in the formula above.
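
The two definitions can be checked with a few lines of Python. This is only a minimal sketch: the function names `ones_complement` and `twos_complement` are made up for illustration, and `x` is assumed to be a non-negative integer that fits in n bits.

```python
def ones_complement(x: int, n: int = 8) -> int:
    """n-bit ones' complement encoding of -x: a string of n ones minus x."""
    return (2**n - 1) - x


def twos_complement(x: int, n: int = 8) -> int:
    """n-bit two's complement encoding of -x: 2**n minus x."""
    return (2**n - x) % 2**n  # the modulo maps -0 back to 0


print(format(ones_complement(6), "08b"))  # 11111001
print(format(twos_complement(6), "08b"))  # 11111010
```

Note that the ones’ complement encoding of -0 is a string of ones (FF for 8 bits), while two’s complement has only a single zero.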

Summary:

“1’s complement” means “taking the complement with a string of ones (2^n - 1) as the modulus”.
“2’s complement” means “taking the complement with the nth power of two (2^n) as the modulus”.

1.3 An example

2’s complement fixed-point integers (8-bit)

Binary Decimal Hex
00000000 0 00
00000001 1 01
00000010 2 02
00000011 3 03
11111111 -1 FF
11111110 -2 FE
11111101 -3 FD

Add two integers:
-3 + 5 = 2
FD + 05 = 01 02
Discarding the carry (01) gives the correct result.

1’s complement fixed-point integers (8-bit)

Binary Decimal Hex
00000000 0 00
00000001 1 01
00000010 2 02
00000011 3 03
11111111 -0 FF
11111110 -1 FE
11111101 -2 FD
11111100 -3 FC

Add the same two numbers:
-3 + 5 = 2
FC + 05 = 01 01
Adding the carry (01) back into the low-order byte (01) gives the correct result:
01 + 01 = 02

Therefore, the 1’s complement sum is computed by adding the numbers and then adding any carry (there may be more than one) back into the result.
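
The “add the carry back in” rule (often called an end-around carry) can be sketched in Python as follows. The function name `add_ones_complement` and the `bits` parameter are illustrative choices, not something from the original article.

```python
def add_ones_complement(a: int, b: int, bits: int = 8) -> int:
    """Add two `bits`-wide values, folding any carry back into the low bits."""
    mask = (1 << bits) - 1
    total = a + b
    while total > mask:
        # split into carry and low part, then add the carry back in
        total = (total & mask) + (total >> bits)
    return total


# -3 + 5 in 8-bit ones' complement: FC + 05 = 01 01, then 01 + 01 = 02
print(format(add_ones_complement(0xFC, 0x05), "02X"))  # 02
```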

A simple example

Suppose we have an 8-bit machine using 2’s complement that sends the packet:
FE 05 00
where 00 is the checksum field.

Let’s compute and verify the network checksum. Ordinary addition gives:
FE + 05  =  01 03

The 1’s complement sum requires adding the carry (01) back into the result:
03 + 01 = 04 

Therefore, the 1’s complement sum of FE + 05 is 04.

The 1’s complement of this 1’s complement sum is:
~04  = FB

The packet will be:
FE 05 FB 

Now, at the receiving end, we add up the received bytes, including the checksum:
FE + 05 + FB  = 01 FE 

Adding the carry back in gives the 1’s complement sum:
FE + 01 = FF = -0

A result of -0 indicates that the checksum is valid.
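
The 8-bit toy example above can be reproduced with a short Python sketch. The 8-bit word size follows the example rather than any real protocol, and the helper names `ones_complement_sum8` and `checksum8` are invented here.

```python
def ones_complement_sum8(data: bytes) -> int:
    """8-bit ones' complement sum: add bytes, folding each carry back in."""
    total = 0
    for byte in data:
        total += byte
        total = (total & 0xFF) + (total >> 8)  # end-around carry
    return total


def checksum8(data: bytes) -> int:
    """Ones' complement (bitwise NOT) of the 8-bit ones' complement sum."""
    return ~ones_complement_sum8(data) & 0xFF


payload = bytes([0xFE, 0x05])
packet = payload + bytes([checksum8(payload)])
print(packet.hex(" ").upper())                      # FE 05 FB
print(format(ones_complement_sum8(packet), "02X"))  # FF, i.e. -0: valid
```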

Another more complex example (32-bit machine):

Pseudo packet:
01 00 F2 03 F4 F5 F6 F7 00 00
(00 00 is the checksum field)

Group by 16 bits:
0100 F203 F4F5 F6F7

Compute the sum:
0100 + F203 + F4F5 + F6F7 = 0002 DEEF (held in a 32-bit integer)

Add the carry (0002) to the result to obtain the 1’s complement sum:
DEEF + 0002 = DEF1

Calculate 1’s complement of the 1’s complement sum:
~DEF1 = 210E

The packet including the checksum (21 0E):
01 00 F2 03 F4 F5 F6 F7 21 0E

At the receiving end:
0100 + F203 + F4F5 + F6F7 + 210E = 0002 FFFD
FFFD + 0002 = FFFF

The result is FFFF, so the verification passes.
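
The 16-bit procedure can be sketched in Python as below. This is a minimal illustration of the RFC 1071 steps, not a tuned implementation; the function name `internet_checksum` is an assumption, and words are read in network (big-endian) order.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit Internet checksum of `data` (checksum field must be zeroed)."""
    data = bytes(data)
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input to a 16-bit boundary
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


packet = bytearray.fromhex("01 00 F2 03 F4 F5 F6 F7 00 00")
packet[8:10] = internet_checksum(packet).to_bytes(2, "big")
print(packet.hex(" ").upper())         # 01 00 F2 03 F4 F5 F6 F7 21 0E
print(internet_checksum(packet) == 0)  # True: the sum was FFFF
```

At the receiver, complementing a sum of FFFF gives 0, which is why the check here compares the result against 0.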

1.4 Summary

Using 1’s complement addition on a 2’s complement machine may seem awkward. However, this method has its own advantages.
Perhaps most importantly, it is independent of byte order. Little-endian computers (such as Intel processors) store the least significant byte first, while big-endian computers (such as IBM mainframes) store the most significant byte first. Because the carry out of the high byte is added back into the low byte to form the 1’s complement sum (see the examples), the carry wraps around in the same way whichever byte is treated as the low one: it does not matter whether we add 03 + 01 or 01 + 03, the result is the same. Summing byte-swapped words therefore simply yields the byte-swapped sum.
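
The byte-order property can be checked with a small Python experiment: the same bytes are read once as big-endian words and once as little-endian words, and the two sums turn out to be byte swaps of each other. The helper names `ones_sum16` and `swap16` are illustrative only.

```python
def ones_sum16(words) -> int:
    """16-bit ones' complement sum of an iterable of 16-bit words."""
    total = sum(words)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def swap16(word: int) -> int:
    """Swap the two bytes of a 16-bit word."""
    return ((word & 0xFF) << 8) | (word >> 8)


data = bytes.fromhex("01 00 F2 03 F4 F5 F6 F7")
pairs = [data[i:i + 2] for i in range(0, len(data), 2)]
sum_big = ones_sum16(int.from_bytes(p, "big") for p in pairs)
sum_little = ones_sum16(int.from_bytes(p, "little") for p in pairs)

print(format(sum_big, "04X"), format(sum_little, "04X"))  # DEF1 F1DE
print(swap16(sum_little) == sum_big)                      # True
```

When each machine writes its result back to memory in its own byte order, the same two bytes end up in the packet.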

1. LSB: least significant bit
The LSB is the smallest unit of a binary number and reflects the smallest changes in its value. In other words, the LSB is bit 0 (the lowest bit) of a binary number, with a weight of 2^0, and can be used to test whether a number is odd or even.
2. MSB: most significant bit
The MSB is bit n-1 of an n-bit binary number and has the highest weight, 2^(n-1). For signed binary numbers, negative numbers are stored in ones’ complement or two’s complement form, and the MSB then indicates the sign: 1 means negative and 0 means positive.

Other benefits include easy verification at the receiver (the same summation is used to generate and to check the checksum) and several ways to speed up the computation, such as updating the checksum incrementally when only some fields of the IP header change.
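
One known way to do such an incremental update is the formula HC' = ~(~HC + ~m + m') from RFC 1624, where HC is the old checksum, m the old 16-bit word and m' the new one. RFC 1624 is not cited in the original article, so the sketch below is an illustration under that assumption, with made-up function names.

```python
def fold16(total: int) -> int:
    """Fold carries above bit 15 back into the low 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum16(words) -> int:
    """Full recomputation: complement of the ones' complement sum."""
    return ~fold16(sum(words)) & 0xFFFF


def update_checksum(old_checksum: int, old_word: int, new_word: int) -> int:
    """Incremental update: HC' = ~(~HC + ~m + m'), all in 16-bit arithmetic."""
    total = (~old_checksum & 0xFFFF) + (~old_word & 0xFFFF) + new_word
    return ~fold16(total) & 0xFFFF


words = [0x0100, 0xF203, 0xF4F5, 0xF6F7]
old = checksum16(words)  # 0x210E, as in the 16-bit example
words[1] = 0xF204        # change a single 16-bit word
print(format(update_checksum(old, 0xF203, 0xF204), "04X"))  # 210D
print(format(checksum16(words), "04X"))                     # 210D
```

The incremental version touches three values regardless of header length, which is why it is attractive when only one field changes.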

2、 ICMP Protocol and checksum

2.1 ICMP Protocol

Wiki introduction to ICMP Protocol

3、 ICMP implementation [trust version]

Unfinished; to be continued.

References:
1、《Short description of the Internet checksum》
2、Inverse code and complement code: why are they also called “ones’ complement” and “two’s complement”?
3、RFC 1071