Using shell to simulate TCP protocol

Time:2021-6-14

Background of the problem

The company has a message push system (hereinafter GCM), and I took over its client side after a staffing change. The documentation devotes only a few pages of brief notes to the communication protocol, and the code is too large and messy to make sense of quickly. Because messages are pushed from the backend to the client, a long-lived TCP connection is used to keep delivery timely, so the usual HTTP-based analysis tools (such as Postman) are of no use. I therefore decided to write a small tool that simulates the communication protocol over TCP, as a warm-up before getting familiar with the code.

The solution of the problem

At first I wanted to write this tool in C++, but the thought of the classic series of tedious socket calls (socket / bind / connect / send / recv…) put me off. I had written several small tools in shell before, which was very comfortable, but they all used the curl command to speak HTTP. curl is of no help for a raw TCP protocol, because the connection is dropped as soon as the command finishes, so a long-lived connection cannot be simulated. Does that mean shell is out of the question? No.

Connection establishment and disconnection

It suddenly occurred to me that the shell (bash) itself can open a TCP connection as if it were a file:

exec N<>/dev/tcp/host/port

This opens a TCP connection to host:port on file descriptor N, readable and writable in both directions. There must be no space between N and <>, otherwise the shell treats N as a command name. So I tried it in msys2:

exec 3<>/dev/tcp/$gcm_host/$gcm_port
ret=$?
echo "open tcp $ret"
if [ "$ret" -ne 0 ]; then
    echo "connect to gcmserver failed"
    exit 1
fi

echo "connect with server"

 

Here the script simply uses descriptor 3, the first one after standard input (0), standard output (1) and standard error (2), as the connection handle. After running it, nothing seemed to have happened.

 

Fortunately, Windows has the procexp tool (Process Explorer), which can show all the TCP connections created by a process:

 

It seems that the connection was established successfully. Of course, you can also use the netstat command on windows to view:

C:\Users\yunh>netstat -no

Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    10.2.56.38:1993        10.100.200.2:10003     ESTABLISHED     10320
  TCP    10.2.56.38:2346        175.27.0.15:80         ESTABLISHED     14808
  TCP    10.2.56.38:2474        121.51.139.161:8080    ESTABLISHED     15092
  TCP    10.2.56.38:3147        10.2.56.13:7680        ESTABLISHED     8816
  TCP    10.2.56.38:3576        47.97.243.182:80       ESTABLISHED     11292
  TCP    10.2.56.38:3602        10.0.24.13:28888       ESTABLISHED     16224
  TCP    10.2.56.38:3720        113.96.233.143:443     ESTABLISHED     15252
  TCP    10.2.56.38:5006        10.2.61.20:7680        ESTABLISHED     8816
  TCP    10.2.56.38:5022        10.2.25.16:7680        ESTABLISHED     8816
  TCP    10.2.56.38:5303        49.232.126.211:443     ESTABLISHED     11292
  TCP    10.2.56.38:6182        10.0.109.249:443       ESTABLISHED     16168
  TCP    10.2.56.38:6183        10.0.109.249:443       ESTABLISHED     16168
  TCP    10.2.56.38:6357        52.11.109.209:443      ESTABLISHED     11292
  TCP    10.2.56.38:6697        40.90.189.152:443      ESTABLISHED     5268
  TCP    10.2.56.38:7065        117.18.237.29:80       CLOSE_WAIT      4724
  TCP    10.2.56.38:7100        220.170.53.122:443     TIME_WAIT       0
  TCP    10.2.56.38:7113        220.181.174.166:443    TIME_WAIT       0
  TCP    10.2.56.38:7117        180.163.150.166:443    ESTABLISHED     11292
  TCP    10.2.56.38:7135        140.143.52.226:443     TIME_WAIT       0
  TCP    10.2.56.38:7141        10.0.24.13:8888        CLOSE_WAIT      16224
  TCP    10.2.56.38:7143        101.201.169.146:443    TIME_WAIT       0
  TCP    10.2.56.38:7144        103.15.99.107:443      TIME_WAIT       0
  TCP    10.2.56.38:7148        203.119.214.115:443    TIME_WAIT       0
  TCP    10.2.56.38:7149        61.151.167.89:443      TIME_WAIT       0
  TCP    10.2.56.38:7150        203.119.169.141:443    TIME_WAIT       0
  TCP    10.2.56.38:7151        203.119.144.59:443     TIME_WAIT       0
  TCP    10.2.56.38:7159        114.55.187.58:443      ESTABLISHED     11292
  TCP    10.2.56.38:7160        42.121.254.191:443     TIME_WAIT       0
  TCP    10.2.56.38:7162        118.178.109.187:443    TIME_WAIT       0
  TCP    10.2.56.38:7165        47.110.223.99:443      TIME_WAIT       0
  TCP    10.2.56.38:7166        116.62.93.118:443      TIME_WAIT       0
  TCP    10.2.56.38:7195        123.150.76.171:80      CLOSE_WAIT      10772
  TCP    10.2.56.38:6974        ##################     ESTABLISHED     10984
  TCP    10.2.56.38:7215        192.168.0.9:80         CLOSE_WAIT      4700
  TCP    10.2.56.38:7218        10.2.100.217:7680      SYN_SENT        8816
  TCP    10.2.56.38:7219        192.168.56.1:7680      SYN_SENT        8816
  TCP    10.2.56.38:7680        10.2.102.27:53199      ESTABLISHED     8816
  TCP    10.2.56.38:9763        192.168.23.23:49156    ESTABLISHED     4600
  TCP    10.2.56.38:10267       125.39.132.161:80      ESTABLISHED     10772
  TCP    10.2.56.38:10816       60.205.204.27:80       ESTABLISHED     10872
  TCP    127.0.0.1:443          127.0.0.1:7216         ESTABLISHED     8108
  TCP    127.0.0.1:2002         127.0.0.1:2003         ESTABLISHED     11292
  TCP    127.0.0.1:2003         127.0.0.1:2002         ESTABLISHED     11292
  TCP    127.0.0.1:2013         127.0.0.1:2014         ESTABLISHED     9600
  TCP    127.0.0.1:2014         127.0.0.1:2013         ESTABLISHED     9600
  TCP    127.0.0.1:2015         127.0.0.1:2016         ESTABLISHED     12948
  TCP    127.0.0.1:2016         127.0.0.1:2015         ESTABLISHED     12948
  TCP    127.0.0.1:2040         127.0.0.1:2041         ESTABLISHED     13960
  TCP    127.0.0.1:2041         127.0.0.1:2040         ESTABLISHED     13960
  TCP    127.0.0.1:2109         127.0.0.1:2110         ESTABLISHED     15092
  TCP    127.0.0.1:2110         127.0.0.1:2109         ESTABLISHED     15092
  TCP    127.0.0.1:2349         127.0.0.1:50051        ESTABLISHED     6308
  TCP    127.0.0.1:2566         127.0.0.1:30031        ESTABLISHED     10624
  TCP    127.0.0.1:3032         127.0.0.1:3033         ESTABLISHED     20276
  TCP    127.0.0.1:3033         127.0.0.1:3032         ESTABLISHED     20276
  TCP    127.0.0.1:3517         127.0.0.1:3518         ESTABLISHED     18200
  TCP    127.0.0.1:3518         127.0.0.1:3517         ESTABLISHED     18200
  TCP    127.0.0.1:3768         127.0.0.1:3769         ESTABLISHED     14076
  TCP    127.0.0.1:3769         127.0.0.1:3768         ESTABLISHED     14076
  TCP    127.0.0.1:3854         127.0.0.1:3855         ESTABLISHED     17380
  TCP    127.0.0.1:3855         127.0.0.1:3854         ESTABLISHED     17380
  TCP    127.0.0.1:4895         127.0.0.1:4896         ESTABLISHED     15524
  TCP    127.0.0.1:4896         127.0.0.1:4895         ESTABLISHED     15524
  TCP    127.0.0.1:5320         127.0.0.1:5321         ESTABLISHED     16736
  TCP    127.0.0.1:5321         127.0.0.1:5320         ESTABLISHED     16736
  TCP    127.0.0.1:6688         127.0.0.1:10803        ESTABLISHED     10872
  TCP    127.0.0.1:6688         127.0.0.1:10824        ESTABLISHED     10872
  TCP    127.0.0.1:6688         127.0.0.1:10841        ESTABLISHED     10872
  TCP    127.0.0.1:6688         127.0.0.1:10849        ESTABLISHED     10872
  TCP    127.0.0.1:6689         127.0.0.1:10819        ESTABLISHED     10672
  TCP    127.0.0.1:7187         127.0.0.1:443          TIME_WAIT       0
  TCP    127.0.0.1:7216         127.0.0.1:443          ESTABLISHED     10548
  TCP    127.0.0.1:8419         127.0.0.1:8420         ESTABLISHED     14716
  TCP    127.0.0.1:8420         127.0.0.1:8419         ESTABLISHED     14716
  TCP    127.0.0.1:10803        127.0.0.1:6688         ESTABLISHED     2256
  TCP    127.0.0.1:10819        127.0.0.1:6689         ESTABLISHED     13436
  TCP    127.0.0.1:10824        127.0.0.1:6688         ESTABLISHED     10672
  TCP    127.0.0.1:10841        127.0.0.1:6688         ESTABLISHED     15448
  TCP    127.0.0.1:10849        127.0.0.1:6688         ESTABLISHED     9772
  TCP    127.0.0.1:30031        127.0.0.1:2566         ESTABLISHED     10608
  TCP    127.0.0.1:50051        127.0.0.1:2349         ESTABLISHED     10608
  TCP    [::1]:5900             [::1]:5901             ESTABLISHED     10548
  TCP    [::1]:5901             [::1]:5900             ESTABLISHED     10548
  TCP    [::1]:7188             [::1]:8307             FIN_WAIT_2      8108
  TCP    [::1]:7217             [::1]:8307             ESTABLISHED     8108
  TCP    [::1]:8307             [::1]:7188             CLOSE_WAIT      8108
  TCP    [::1]:8307             [::1]:7217             ESTABLISHED     8108

 

The main trick here is to filter by process ID to locate the connection quickly. The connection can also be closed actively, using the following redirection syntax (the same one that closes an ordinary file):

exec N<&-

Here N is the file descriptor that was just opened; < can be replaced with > to the same effect.

 

Finally, procexp or netstat can again be used to check the result. Note that a 0 from echo $? after exec does not seem to confirm that the connection was established: I still got 0 when using exec with a wrong host and port.
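Since the exit status of exec was not a reliable signal in this environment, one option is to probe the endpoint in a throwaway subshell before opening the real connection. The sketch below is an assumption, not the tool's actual code: tcp_probe is a hypothetical helper, and it relies on bash's /dev/tcp support. On Linux bash a refused connection makes the redirection fail with a non-zero status; whether msys2 behaves the same way would have to be checked.

```shell
# tcp_probe HOST PORT: hypothetical helper, not part of the original tool.
# The connection is opened inside a subshell, so descriptor 3 of the calling
# shell is untouched and the socket is closed as soon as the subshell exits.
tcp_probe() {
    ( exec 3<>"/dev/tcp/$1/$2" ) 2>/dev/null
}

# Port 1 on the loopback address is assumed to have no listener.
if tcp_probe 127.0.0.1 1; then
    echo "port is reachable"
else
    echo "connect failed"
fi
```

The probe costs one extra connection, which the server will see as a client that connects and leaves immediately.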

Machine online and offline

After the connection is established, some basic information about the machine has to be reported to the backend. This protocol message is called machine online:

 1 function send_request_100 ()
 2 {
 3     local msg=$(cat protocol/100.gcm)
 4     # do replace
 5     msg=$(echo "$msg" | jq --arg guid "$devid" --arg hwid "$hardid" -c '{ version, msgtype, guid: $guid, devinfo: { hwid: $hwid, devid: $guid, os: .devinfo.os, os_version: .devinfo.os_version, sysbit: .devinfo.sysbit, languageid: .devinfo.languageid } }')
 6     echo "$msg" >&3
 7     local ret=$?
 8     if [ $ret -ne 0 ]; then 
 9         echo "connection break, send failed"
10         exit 3
11     fi
12 }
13 
14 # online myself
15 send_request_100

 

The process of bringing the machine online is encapsulated in the function send_request_100, where 100 is the message number for machine online. Sending the message itself takes a single line of code (line 6). Most of the function's work is assembling the content of the 100 message (lines 3-5). Messages are strings in JSON format. To reduce the coupling between code and protocol, each message is kept in a separate file. For example, the "100.gcm" file above stores the message template for machine online:

{
    "version": "3.1",
    "msgtype": "100",
    "guid": "",
    "devinfo": {
        "hwid": "",
        "devid": "",
        "os": "Windows",
        "os_version": "7",
        "sysbit": "64",
        "languageid": "2052"
    }
}
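As an alternative to rebuilding the whole object field by field, the empty fields of this template could be filled with jq update assignments. This is only a sketch of a different technique, not what send_request_100 does: fill_100 is a hypothetical name, the template is shortened, and the IDs are made-up placeholders.

```shell
# fill_100 GUID HWID: hypothetical helper, reads the template on stdin.
# Fills the empty fields in place and leaves every other field untouched.
fill_100() {
    jq -c --arg guid "$1" --arg hwid "$2" \
        '.guid = $guid | .devinfo.hwid = $hwid | .devinfo.devid = $guid'
}

# Shortened copy of the 100 template; the IDs below are placeholders.
template='{"version":"3.1","msgtype":"100","guid":"","devinfo":{"hwid":"","devid":"","os":"Windows"}}'
printf '%s\n' "$template" | fill_100 "dev-0001" "hw-0001"
```

With this form, a field added to the template later is passed through without touching the script; the explicit object construction used in the tool drops anything it does not list, which can be the safer choice for a strict protocol.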

 

After the template has been read from the file into a local variable, some fields need filling in (guid / hwid / devid…). Here the --arg option of the jq command passes in external parameters, and the JSON string is rebuilt from them. These parameters (devid / hardid) are read from the registry and passed in before the script starts. Once the machine is online, products can be brought online and offline. Correspondingly, when the client stops, the backend should also be told that the machine is going offline:

function send_request_101
{
    local msg=$(cat protocol/101.gcm)
    # fill in the guid
    msg=$(echo "$msg" | jq --arg guid "$devid" -c '{ version, msgtype, guid: $guid }')
    echo "$msg" >&3
    local ret=$?
    echo "send 101 msg to gcm $ret" >> log.txt
    if [ $ret -ne 0 ]; then
        echo "connection break, send failed"
        exit 3
    fi

    # no response for 101 message
    echo "offline success! devid=$devid"
}

# last step: take the machine itself offline
send_request_101

 

This process is encapsulated in the send_request_101 function, where 101 is the message number for machine offline. This message has a template file too:

{
    "version": "3.1",
    "msgtype": "101",
    "guid": ""
}
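The send functions all repeat the same step: write the message to descriptor 3 and check the result. A minimal sketch of how that step could be factored out follows. send_msg is a hypothetical helper that does not exist in the original tool, and the demo writes to a temporary file standing in for the socket.

```shell
# send_msg JSON: hypothetical helper wrapping the write to descriptor 3.
send_msg() {
    # printf writes the message plus a newline, like echo, but without
    # echo's special handling of leading dashes or backslashes.
    if ! printf '%s\n' "$1" >&3; then
        echo "connection break, send failed" >&2
        return 3
    fi
}

# Demo against a temporary file standing in for the socket.
tmp=$(mktemp)
exec 3>"$tmp"
send_msg '{"version":"3.1","msgtype":"101","guid":"dev-0001"}'
exec 3>&-
cat "$tmp"
rm -f "$tmp"
```

Returning 3 instead of calling exit leaves the decision to the caller, which matters once the function is also used from the background receiver.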

 

It’s relatively simple. The protocol subdirectory contains all the message protocol templates:

$ ls -lhrt
total 7.0K
-rw-r--r-- 1 yunh 1049089 312 May 28  2019 102.gcm
-rw-r--r-- 1 yunh 1049089 102 May 28  2019 101.gcm
-rw-r--r-- 1 yunh 1049089 350 May 28  2019 100.gcm
-rw-r--r-- 1 yunh 1049089 141 May 28  2019 412.gcm
-rw-r--r-- 1 yunh 1049089 166 May 28  2019 108.gcm
-rw-r--r-- 1 yunh 1049089 193 May 28  2019 103.gcm
-rw-r--r-- 1 yunh 1049089 478 Jul 26  2019 custom.gcm

 

Taking the machine offline is otherwise handled much like bringing it online, so I won't go into detail. From here on the message contents are not described in detail either, mainly because the protocol is confidential.

Product online and offline

Once the machine is online, each product is brought online when it starts. That way, when the backend has content to push, the corresponding messages can be delivered (nothing is pushed to products that are not online):

# $1: app name
# $2: app version
# $3: user id
# $4: device id
function send_request_102 ()
{
    local guid=$(echo "$4$1$3" | sha1sum | awk '{ print $1 }')
    local msg=$(cat protocol/102.gcm)
    # fill in the template fields
    msg=$(echo "$msg" | jq --arg appname "$1" --arg appver "$2" --arg userid "$3" --arg guid "$guid" -c '{ version, msgtype, guid: $guid, appclientid: $appname, appuserid: $userid, clientinfo: { appversion: $appver, platform: .clientinfo.platform, bits: .clientinfo.bits } }')
    echo "$msg" >&3
    local ret=$?
    echo "send 102 msg to gcm $ret" >> log.txt
    if [ $ret -ne 0 ]; then
        echo "connection break, send failed"
        exit 3
    fi
}

# online GCMPopBox/GUX/GSUP
send_request_102 "GCMPopBox" "2.0.0.0" "$hardid" "$devid"
send_request_102 "GUX" "$version" "$devid" "$devid"
send_request_102 "GSUP" "$version" "$devid" "$devid"
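The guid in the function above is the SHA-1 of the device ID, app name and user ID concatenated in that order. The line can be tried on its own as sketched below. make_guid is a hypothetical wrapper and the IDs are placeholders.

```shell
# make_guid DEVICE_ID APP_NAME USER_ID: hypothetical helper mirroring the
# guid line of send_request_102. echo appends a newline, so the newline is
# part of the hashed input; printf '%s\n' reproduces that explicitly.
make_guid() {
    printf '%s\n' "$1$2$3" | sha1sum | awk '{ print $1 }'
}

# Placeholder IDs, not real ones.
make_guid "dev-0001" "GUX" "user-0001"
```

Because the trailing newline is hashed too, a reimplementation in another language has to append "\n" to the concatenated string to arrive at the same guid.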

 

This process is encapsulated in the send_request_102 function, where 102 is the message number for product online. The function takes four parameters: product ID, product version, user ID and machine ID. After the machine goes online, three fixed products are brought online: GCMPopBox, GUX and GSUP, which are service products of the client. When a product is closed, a product offline message should be sent to the backend:

# $1: app name
# $2: user id
# $3: device id
function send_request_103
{
    local guid=$(echo "$3$1$2" | sha1sum | awk '{print $1}')
    local msg=$(cat protocol/103.gcm)
    # fill in the template fields
    msg=$(echo "$msg" | jq --arg appname "$1" --arg userid "$2" --arg guid "$guid" -c '{ version, msgtype, guid: $guid, appclientid: $appname, appuserid: $userid }')
    echo "$msg" >&3
    local ret=$?
    echo "send 103 msg to gcm $ret" >> log.txt
    if [ $ret -ne 0 ]; then
        echo "connection break, send failed"
        exit 3
    fi

    # no response for 103 message
    echo "$1 offline success! userid=$2"
}

# second-to-last step: take GCMPopBox/GUX/GSUP offline
send_request_103 "GCMPopBox" "$hardid" "$devid"
send_request_103 "GUX" "$devid" "$devid"
send_request_103 "GSUP" "$devid" "$devid"

 

This process is encapsulated in the send_request_103 function, where 103 is the message number for product offline. Compared with the product online message, the product version is not needed; everything else is similar. Before the machine goes offline, the client service products (GCMPopBox / GUX / GSUP) have to be taken offline first. Besides the fixed products, the user can also specify a product to bring online from the command line. When the tool runs, it looks like this:

 

The part in the red box is actually a loop: the user can keep entering products to bring online or take offline. The corresponding code is a while loop:

 1 # online/offline products
 2 while :
 3 do 
 4     echo "-------------------------------------------"
 5     echo -n "product name to operate (exit|quit to quit): "
 6     read product
 7     if [ "$product" == "exit" -o "$product" == "quit" ]; then 
 8         break; 
 9     fi
10 
11     echo -n "operation (online|offline): "
12     read resp
13     online=0
14     case "$resp" in
15       ""|"o"|"O"|"on"|"ON"|"online"|"ONLINE")
16         online=1
17         ;;
18       *)
19         ;;
20     esac
21 
22     if [ $online == 1 ]; then 
23         echo -n "version: "
24         read version
25     fi
26 
27     echo -n "user id: "
28     read userid
29 
30     if [ $online == 1 ]; then 
31         send_request_102 "$product" "$version" "$userid" "$devid"
32     else 
33         send_request_103 "$product" "$userid" "$devid"
34     fi
35 
36     sleep 1
37 done 

Here’s a brief explanation:

  • Lines 4-9: prompt for the name of the product to operate on; if the user enters quit or exit, leave the loop, after which the script ends;
  • Lines 11-20: prompt the user for the operation, online or offline;
  • Lines 22-25: if the operation is online, the user also has to enter the product version;
  • Lines 27-28: prompt the user for the user ID;
  • Lines 30-34: depending on the operation entered, call the functions defined earlier to bring the product online or take it offline.

 

Receive push message

Once a product is online, it can receive "greetings" from the backend. Unlike the earlier request/response exchange, this requires handling data that arrives on the connection asynchronously. My first reaction was to start a thread for it, but the shell has no threads, only child processes. The question is whether the original descriptor (3) still refers to the same connection in a child process. On Linux there is no doubt about it, but Windows has no fork. How can the newly started child process be guaranteed a copy of the parent's user space? With this doubt in mind, I tried the following code:

echo "connect with server"
on_recv &
cpid=$!

 

With the receiving code encapsulated in the on_recv function, appending '&' is enough to run the function in a separate process! As a test, on_recv at first handled only a few simple response messages (100 -> 201, 102 -> 301…):

 1 function on_recv
 2 {
 3     # can not break read !
 4     #trap "echo recv exit signal from parent" INT
 5     while :
 6     do
 7         #read msg 

 

When these response messages are received, their key fields are printed on the screen. Like the requests, the responses are pure JSON, so jq is used to parse them (lines 17-33).

But that is not the hard part. What really troubled me for a long time was reading. Some might ask what is so hard about it: can't you just call read? That is what I did at first (line 7). However, read stayed blocked waiting for data and did not return even after a whole message had arrived. A simple analysis: read waits for a terminator, which by default is the newline '\n'. That is why console input is only accepted once you press Enter. The backend's response messages do not end with a newline, so I tried another approach, using tail -f to read from the connection (line 8), but it was no better.

So I tried a third approach (line 9): giving read a timeout (1s), so that whatever has been read is returned once the time is up. The drawback is that read is interrupted every second. I had also overlooked a more serious problem: when a product has a backlog of undelivered messages, the backend pushes several of them at once when the product comes online. A read with a timeout then often returns several messages together, which breaks the later parsing.

So I arrived at the current approach (line 11): telling read to read until it meets '}', the closing character of a JSON object. This is not completely safe, because JSON may contain nested objects and therefore inner '}' characters. Fortunately the response messages in the existing protocol are simple and essentially never nest one pair of braces inside another, so it works. The effect of the running script was already visible in the earlier figure; this time a different part is highlighted.
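The delimiter-based reading described above can be sketched as follows. This is not the elided on_recv code, only an illustration of the idea with bash's read -d option. split_messages is a hypothetical name, and the sketch assumes flat JSON objects with no nested braces, which is the same limitation noted above.

```shell
# split_messages: hypothetical helper. Reads '}'-terminated messages from
# stdin and prints one per line, so back-to-back messages are separated.
split_messages() {
    local chunk
    while IFS= read -r -d '}' chunk; do
        # read strips the delimiter, so put the closing brace back
        printf '%s}\n' "$chunk"
    done
}

# Two responses arriving back to back with no newline in between.
printf '{"msgtype":"201"}{"msgtype":"301"}' | split_messages
```

In the tool the function would read from the connection instead of a pipe, for example split_messages <&3, with each output line handed to jq.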

 

As you can see, the new child process receives the responses to the machine and product online messages just fine (offline messages get no response), so it does appear to share the connection with the parent. Having verified that msys2 handles this, on to the main event, receiving the messages pushed by the backend:

 1           "105")
 2             local guid=$(echo "$msg" | jq -r '.guid')
 3             local appclientid=$(echo "$msg" | jq -r '.appclientid')
 4             local msgid=$(echo "$msg" | jq -r '.msgid')
 5             local msgbody=$(echo "$msg" | jq -r '.msgbody')
 6             local appuserid=$(echo "$msg" | jq -r '.appuserid')
 7             local dstuserid=$(echo "$msg" | jq -r '.dstuserid')
 8             echo ""
 9             echo "*******************************************"
10             echo "receive customer message "
11             echo "product: $appclientid"
12             echo "userid : $appuserid"
13             echo "msgid  : $msgid"
14             local body_utf8=$(echo "$msgbody" | base64 -d)
15             local body=$(echo "$body_utf8" | iconv -f utf-8 -t gb2312)
16             echo "content: $body"
17             echo "*******************************************"
18             send_request_108 "$guid" "$appclientid" "$appuserid" "$msgid"
19             ;;
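The decoding step in the block above (Base64 first, then a character set conversion) can be tried in isolation. The sketch below uses a hypothetical decode_body helper; the optional second argument is an assumption added here, so that the iconv step can be skipped on a UTF-8 terminal.

```shell
# decode_body BASE64 [CHARSET]: hypothetical helper. Decodes a Base64 body
# and, if CHARSET is given, converts the UTF-8 result to that character set.
decode_body() {
    if [ -n "$2" ]; then
        printf '%s' "$1" | base64 -d | iconv -f utf-8 -t "$2"
    else
        printf '%s' "$1" | base64 -d
    fi
}

# "aGVsbG8=" is the Base64 encoding of "hello".
decode_body "aGVsbG8="
echo
```

On a Chinese-locale Windows console the call would be decode_body "$msgbody" gb2312, matching the conversion in the handler.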

 

Here a case branch is simply added. 105 is a custom message, which the real application handles silently without showing anything to the user. For demonstration purposes, the message content is printed on the screen here (with some Base64 decoding and UTF-8 encoding conversion: lines 14-15). After the message is received, a 108 message is sent back to the backend to confirm successful reception. send_request_108 is similar to the other send functions and is not explained here. The really complicated part is receiving pop-up messages:

          "401")
            local guid=$(echo "$msg" | jq -r '.guid')
            local appclientid=$(echo "$msg" | jq -r '.appclientid')
            local msgid=$(echo "$msg" | jq -r '.msgid')
            local msgbody=$(echo "$msg" | jq -r '.msgbody')
            local appuserid=$(echo "$msg" | jq -r '.appuserid')
            local dstuserid=$(echo "$msg" | jq -r '.dstuserid')
            echo ""
            echo "*******************************************"
            echo "receive popup message "
            echo "product: $appclientid"
            echo "userid : $appuserid"
            echo "msgid  : $msgid"
            echo ""
            local body=$(echo "$msgbody" | base64 -d)
            local title_utf8=$(echo "$body" | jq -r '.title')
            local title=$(echo "$title_utf8" | iconv -f utf-8 -t gb2312)
            local content_utf8=$(echo "$body" | jq -r '.content')
            local content=$(echo "$content_utf8" | iconv -f utf-8 -t gb2312)
            local ctxurl=$(echo "$body" | jq -r '."content-url"') # a key containing '-' must be quoted
            local image=$(echo "$body" | jq -r '.image')
            local imgurl=$(echo "$body" | jq -r '."image-url"') # a key containing '-' must be quoted
            echo "title  : $title"
            echo "content: $content"
            echo "ctxurl : $ctxurl"
            echo "image  : $image"
            echo "imgurl : $imgurl"
            echo "*******************************************"

            # prepare 108 message
            send_request_108 "$guid" "$appclientid" "$appuserid" "$msgid"
            sleep 1
            # prepare 402 message
            send_request_402 "$msg" "$hardid"
            ;;
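On the hyphenated keys in the block above: jq reads an unquoted .content-url as the subtraction .content - url, so the key has to be quoted. A small sketch of the two equivalent spellings follows. The JSON body and URL are made-up placeholders.

```shell
# Made-up pop-up body with a hyphenated key.
body='{"title":"hello","content-url":"http://example.invalid/page"}'

# Quoted key: the form used in the handler above.
printf '%s\n' "$body" | jq -r '."content-url"'

# Bracket form: equivalent to the quoted key.
printf '%s\n' "$body" | jq -r '.["content-url"]'
```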

 

401 is a pop-up message. It is meant to pop up a small window in the lower right corner of the screen; to keep things simple, it is printed on the screen here as well. After receiving a 401 message, the client should first reply with a 108 to confirm reception, and then with a 402 to report the final outcome of the pop-up, such as the user clicking it, closing it, viewing details, and so on. Here "closed by the user" is always returned as a simulation. The following shows push messages being received after a product goes online:

 

Two messages are demonstrated here, a pop-up message and a custom message, and both are parsed and displayed correctly. The backend also counts the push status of these two messages correctly:

 

Finally, when the user leaves the operation loop, the child process has to be reaped promptly:

exec 3>&-
kill -INT $cpid
wait

 

Here the child process is told to leave its receive loop by sending it the INT signal with kill, and wait then waits for it to exit completely. Earlier I tried trapping the INT signal in the child process in order to exit gracefully, but with the trap in place on Windows, read could no longer be interrupted, so I gave that up. The current approach probably just kills the child process outright. It is "violent", but at least it works.
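The start, kill and wait sequence can be tried without a server, as sketched below. The sketch deviates from the code above on purpose: in a non-interactive POSIX shell on Linux, background commands are started with SIGINT ignored, so it sends the default TERM signal instead. receiver and stop_receiver are hypothetical names, and receiver only stands in for on_recv.

```shell
# Stand-in for on_recv: loops until it is killed.
receiver() {
    while :; do
        sleep 1
    done
}

# stop_receiver PID: hypothetical helper. Sends the default TERM signal and
# then reaps the child so that no zombie is left behind.
stop_receiver() {
    kill "$1" 2>/dev/null || true
    wait "$1" 2>/dev/null || true   # a killed child reports a non-zero status
}

receiver &
cpid=$!
stop_receiver "$cpid"
echo "receiver $cpid stopped"
```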

Postscript

Building this gadget even turned up errors and omissions in the protocol documents. But what I am most curious about is how two processes share a connection handle on Windows. To find out, I brought out the big gun, procexp:

 

You can see that the connection appears only in the parent process (20612); the child process (16844) shows no corresponding connection. So how does it read data from the connection? I looked around and found no clue. I wanted to use lsof under msys2 to inspect the process handles, but could not find the command even after searching the installation directory; apparently msys2 has not ported every command. Fortunately, lsof itself is built on the proc file system. Can the proc directory of the process be viewed directly? The answer is yes:

$ ls -l /proc
total 0
dr-xr-xr-x 3 yunh 1049089 0 Nov 24 14:47 16796/
dr-xr-xr-x 3 yunh 1049089 0 Nov 24 13:35 17992/
dr-xr-xr-x 3 yunh 1049089 0 Nov 24 14:47 18468/
dr-xr-xr-x 3 yunh 1049089 0 Nov 25 15:36 20828/
dr-xr-xr-x 3 yunh 1049089 0 Nov 24 13:35 7464/
-r--r--r-- 1 yunh 1049089 0 Nov 25 15:36 cpuinfo
lrwxrwxrwx 1 yunh 1049089 0 Nov 25 15:36 cygdrive -> //
-r--r--r-- 1 yunh 1049089 0 Nov 25 15:36
-r--r--r-- 1 yunh 1049089 0 Nov 25 15:36 filesystems
-r--r--r-- 1 yunh 1049089 0 Nov 25 15:36 loadavg
-r--r--r-- 1 yunh 1049089 0 Nov 25 15:36 meminfo
-r--r--r-- 1 yunh 1049089 0 Nov 25 15:36 misc
lrwxrwxrwx 1 yunh 1049089 0 Nov 25 15:36 mounts -> self/mounts
dr-xr-xr-x 2 yunh 1049089 0 Nov 25 15:36 net/
-r--r--r-- 1 yunh 1049089 0 Nov 25 15:36 partitions
dr-xr-xr-x 8 yunh 1049089 0 Nov 25 15:36 registry/
dr-xr-xr-x 8 yunh 1049089 0 Nov 25 15:36 registry32/
dr-xr-xr-x 8 yunh 1049089 0 Nov 25 15:36 registry64/
lrwxrwxrwx 1 yunh 1049089 0 Nov 25 15:36 self -> 20828/
-r--r--r-- 1 yunh 1049089 0 Nov 25 15:36 stat
-r--r--r-- 1 yunh 1049089 0 Nov 25 15:36 swaps
drwxrwx--- 1 administrators 18 0 Nov 25 15:36 sys/
dr-xr-xr-x 2 yunh 1049089 0 Nov 25 15:36 sysvipc/
-r--r--r-- 1 yunh 1049089 0 Nov 25 15:36 uptime
-r--r--r-- 1 yunh 1049089 0 Nov 25 15:36 version

 

This was printed from another terminal. However, no directory for either of the two process IDs above could be found in it. What about printing from within the script?

ls -lhrt /proc/self/fd/

self refers to the current process. Putting this line in the parent process right after the connection is established, and at the start of the child's on_recv function, gives the following output:

connect with server
total 0
lrwxrwxrwx 1 yunh Domain Users 0 Nov 25 15:31 0 -> /dev/null
lrwxrwxrwx 1 yunh Domain Users 0 Nov 25 15:31 1 -> /dev/cons0
lrwxrwxrwx 1 yunh Domain Users 0 Nov 25 15:31 2 -> /tools/gsupgo/error.txt
lrwxrwxrwx 1 yunh Domain Users 0 Nov 25 15:31 3 -> socket:[1]
lrwxrwxrwx 1 yunh Domain Users 0 Nov 25 15:31 4 -> /proc/8532/fd
total 0
lrwxrwxrwx 1 yunh Domain Users 0 Nov 25 15:31 0 -> /dev/cons0
lrwxrwxrwx 1 yunh Domain Users 0 Nov 25 15:31 1 -> /dev/cons0
lrwxrwxrwx 1 yunh Domain Users 0 Nov 25 15:31 2 -> /tools/gsupgo/error.txt
lrwxrwxrwx 1 yunh Domain Users 0 Nov 25 15:31 3 -> socket:[1]
lrwxrwxrwx 1 yunh Domain Users 0 Nov 25 15:31 4 -> /proc/15580/fd

 

The first listing is the output from the parent process, where descriptor 3 corresponds to a TCP connection. The second is the output from the child process, and it looks the same as the parent's. The most interesting entry is descriptor 4 in each listing, which shows a PID that is clearly different from the Windows process IDs. Maybe that is why I could not find them in the /proc directory before. However, looking at the /proc directory again, these two PIDs are still not there. So this PID may only be valid within the process group (?) rather than globally, which makes it of little value.

The exploration hits a dead end here. If an expert knows how msys2 implements this on Windows, please do not hesitate to enlighten me.
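One possible explanation, offered as an assumption that was not verified on msys2: /proc/self resolves to whichever process opens it, and in a listing produced by ls that process is the short-lived ls child, not the script. The PID shown under descriptor 4 would then belong to ls and be gone by the time /proc is inspected. On Linux the shell's own descriptors can be listed by naming its PID explicitly, as sketched below. list_own_fds is a hypothetical helper, and /dev/null stands in for the TCP connection.

```shell
# list_own_fds: hypothetical helper. Lists the descriptors of the shell that
# calls it. BASHPID is the PID of the current bash (sub)shell; $$ is used as
# a fallback and always names the top-level shell.
list_own_fds() {
    ls -l "/proc/${BASHPID:-$$}/fd"
}

exec 3</dev/null     # stand-in for the TCP connection
list_own_fds
exec 3<&-
```

Inside a background function such as on_recv, BASHPID gives the PID of the child shell, while $$ would still report the parent.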

 

Finally, the gadget is not available for download, because of the confidentiality of the company's internal protocol. But with this much written down, I believe it is not hard to write one of your own~
