Solution to “\d” failing to match digits in grep

Time:2020-8-18

First of all, regular expressions fall into three categories (as man grep shows): basic regexes, extended regexes, and Perl regexes. grep uses basic regexes by default, and “\d” exists only in Perl regexes, so grep '\d' does not match digits. Use grep -P '\d' or a bracket expression such as grep '[0-9]' instead.

Regular expression: in computer science, a single string that describes or matches a set of strings conforming to certain syntactic rules. Many text editors and other tools use regular expressions to find or replace text that fits a pattern. The concept was first popularized by Unix tools such as sed and grep.

1、 Regular expression classification:

1. Basic regular expressions (BREs)

2. Extended regular expressions (EREs)

3. Perl regular expressions (Perl regexes, PREs for short)

Note: only by mastering regular expressions can we fully grasp the common Linux text tools (such as grep, egrep, GNU sed, awk, etc.)

2、 The relationship between common text tools and regular expressions in Linux

Knowing the characteristics of the common Linux text tools helps us use regular expressions better.

Characteristics of grep and egrep regular expressions

1) grep supports BREs, EREs, and PREs

grep uses BREs when no option is given

grep with the “-E” option uses EREs

grep with the “-P” option uses PREs

2) egrep supports EREs and PREs

egrep (equivalent to grep -E) uses EREs when no option is given

egrep with the “-P” option uses PREs (some GNU grep versions reject this combination as conflicting matchers; grep -P is the safer spelling)

3) How grep and egrep match and process files

a. What grep and egrep process: text files

b. How grep and egrep process them: they search the text file for the “keyword” (which can be a regular expression). Each line that contains the keyword is, by default, written to standard output, unless the output is redirected with “>”.

c. grep and egrep process text files line by line
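
The line-by-line behaviour described above can be sketched in Python. This is only an analogy, not how grep is implemented: the helper name grep_lines and the sample text are made up for illustration, and the patterns follow Python's re syntax rather than BRE.

```python
import re

def grep_lines(pattern, text):
    """Return the lines of `text` that contain a match for `pattern`."""
    regex = re.compile(pattern)
    # Like grep, test one line at a time and keep the whole matching line.
    return [line for line in text.splitlines() if regex.search(line)]

# Hypothetical sample input
sample = "alpha\nroom 42\nbeta\n7 dwarfs"

print(grep_lines(r"[0-9]", sample))   # lines that contain a digit
print(grep_lines(r"^[0-9]", sample))  # lines that start with a digit
```

The bracket expression [0-9] is used here because it means the same thing in all of the regex flavours discussed in this article.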

Characteristics of sed regular expressions

1) sed supports BREs and EREs

sed uses BREs by default

sed with the “-r” option uses EREs

2) The function and role of sed

a. What sed processes: text files

b. What sed does: search, replace, delete, and add content in text files

c. sed also processes text files line by line
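
As a rough illustration of the replace and delete operations listed above, here is a minimal Python sketch. The function names sed_substitute and sed_delete and the sample text are hypothetical, and the sketch only imitates the per-line effect of sed 's/.../.../g' and sed '/.../d', not sed itself.

```python
import re

def sed_substitute(pattern, replacement, text):
    """Replace every match on each line, similar to sed 's/pattern/replacement/g'."""
    regex = re.compile(pattern)
    return "\n".join(regex.sub(replacement, line) for line in text.splitlines())

def sed_delete(pattern, text):
    """Drop every line that contains a match, similar to sed '/pattern/d'."""
    regex = re.compile(pattern)
    return "\n".join(line for line in text.splitlines() if not regex.search(line))

# Hypothetical sample input
sample = "port 8080\n# comment\nport 9090"

print(sed_substitute(r"[0-9]+", "N", sample))
print(sed_delete(r"^#", sample))
```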

Characteristics of awk (gawk) regular expressions

1) awk supports EREs

awk uses EREs by default

2) How awk handles text

a. What awk processes: text files

b. What awk does: mainly column (field) operations
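
The column-oriented style of awk can be sketched in Python as follows. The helper awk_print_field and the sample data are assumptions made for illustration; the sketch imitates awk '/pattern/ { print $N }' with the default whitespace field splitting and is not a full awk equivalent.

```python
import re

def awk_print_field(pattern, index, text):
    """Rough analogue of: awk '/pattern/ { print $index }' (fields are 1-based)."""
    regex = re.compile(pattern)
    out = []
    for line in text.splitlines():
        if regex.search(line):
            parts = line.split()  # split on runs of whitespace, like awk's default
            # awk prints an empty string when the field does not exist
            out.append(parts[index - 1] if len(parts) >= index else "")
    return out

# Hypothetical sample input: name, age, city
sample = "alice 30 paris\nbob 25 rome\ncarol 41 oslo"

print(awk_print_field(r"^(alice|carol)", 2, sample))
```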

3、 Comparison of the three common types of regular expressions

Character | Explanation | Basic RegEx | Extended RegEx | Python RegEx | Perl RegEx
\ | Escape character | \ | \ | \ | \
^ | Match the beginning of a line, e.g. '^dog' matches lines that begin with the string dog (Note: in awk, '^' matches the beginning of the string) | ^ | ^ | ^ | ^
$ | Match the end of a line, e.g. 'dog$' matches lines that end with the string dog (Note: in awk, '$' matches the end of the string) | $ | $ | $ | $
^$ | Match blank lines | ^$ | ^$ | ^$ | ^$
^string$ | Match a whole line, e.g. '^dog$' matches lines that contain only the string dog | ^string$ | ^string$ | ^string$ | ^string$
\< | Match the start of a word, e.g. '\<frog' (equivalent to '\bfrog') matches words that start with frog | \< | \< | Not supported | Not supported (but \b can be used to match words, e.g. '\bfrog')
\> | Match the end of a word, e.g. 'frog\>' (equivalent to 'frog\b') matches words that end with frog | \> | \> | Not supported | Not supported (but \b can be used to match words, e.g. 'frog\b')
\<x\> | Match a whole word or a specific character, e.g. '\<frog\>' (equivalent to '\bfrog\b'), '\<g\>' | \<x\> | \<x\> | Not supported | Not supported (but \b can be used to match words, e.g. '\bfrog\b')
() | Group a subexpression, e.g. '(frog)' | Not supported (but \(\) can be used, e.g. \(dog\)) | () | () | ()
\(\) | Group a subexpression, e.g. '\(frog\)' | \(\) | Not supported (use ()) | Not supported (use ()) | Not supported (use ())
? | Match the preceding subexpression 0 or 1 times (equivalent to {0,1}), e.g. 'where(is)?' matches "where" and "whereis" | Not supported (use \?) | ? | ? | ?
\? | Match the preceding subexpression 0 or 1 times (equivalent to '\{0,1\}'), e.g. 'where\(is\)\?' matches "where" and "whereis" | \? | Not supported (use ?) | Not supported (use ?) | Not supported (use ?)
? (after a quantifier) | When this character follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), matching becomes non-greedy. Non-greedy matching consumes as little of the string as possible, while the default greedy matching consumes as much as possible. For example, for the string "oooo", 'o+?' matches a single "o", while 'o+' matches all the 'o's | Not supported | Not supported | ? | ?
. | Match any single character except the newline ('\n') (Note: in awk, the period can also match a newline) | . | . (to match any character including "\n", use '(^$)|(.)') | . | . (to match any character including "\n", use '[\s\S]')
* | Match the preceding subexpression 0 or more times (equivalent to {0,}), e.g. 'zo*' matches "z" and "zoo" | * | * | * | *
\+ | Match the preceding subexpression 1 or more times (equivalent to '\{1,\}'), e.g. 'where\(is\)\+' matches "whereis" and "whereisis" | \+ | Not supported (use +) | Not supported (use +) | Not supported (use +)
+ | Match the preceding subexpression 1 or more times (equivalent to {1,}), e.g. 'zo+' matches "zo" and "zoo", but not "z" | Not supported (use \+) | + | + | +
{n} | n must be 0 or a positive integer; match the preceding subexpression exactly n times, e.g. 'zo{2}' matches "zooz", but not "Bob" | Not supported (use \{n\}) | {n} | {n} | {n}
{n,} | n must be 0 or a positive integer; match the preceding subexpression at least n times, e.g. 'go{2,}' matches "good", but not "god" | Not supported (use \{n,\}) | {n,} | {n,} | {n,}
{n,m} | m and n are non-negative integers with n <= m; match at least n and at most m times, e.g. 'o{1,3}' matches the first three o's in "fooooood" (note: there must be no space between the comma and the two numbers) | Not supported (use \{n,m\}) | {n,m} | {n,m} | {n,m}
x|y | Match x or y, e.g. 'z|(food)' matches "z" or "food"; '(z|f)ood' matches "zood" or "food" | Not supported (use x\|y) | x|y | x|y | x|y
[0-9] | Match any digit from 0 to 9 | [0-9] | [0-9] | [0-9] | [0-9]
[xyz] | Character set; match any one of the enclosed characters, e.g. '[abc]' matches 'a' in 'lay' (Note: metacharacters such as . and * become ordinary characters inside []) | [xyz] | [xyz] | [xyz] | [xyz]
[^xyz] | Negated character set; match any character that is not enclosed (Note: newlines are not included), e.g. '[^abc]' matches 'l' in 'lay' (Note: in awk, [^xyz] also matches the newline) | [^xyz] | [^xyz] | [^xyz] | [^xyz]
[A-Za-z] | Match any uppercase or lowercase letter | [A-Za-z] | [A-Za-z] | [A-Za-z] | [A-Za-z]
[^A-Za-z] | Match any character except uppercase and lowercase letters | [^A-Za-z] | [^A-Za-z] | [^A-Za-z] | [^A-Za-z]
\d | Match any digit from 0 to 9 (equivalent to [0-9]) | Not supported | Not supported | \d | \d
\D | Match any non-digit character (equivalent to [^0-9]) | Not supported | Not supported | \D | \D
\S | Match any non-whitespace character (equivalent to [^ \f\n\r\t\v]) | Not supported | Not supported | \S | \S
\s | Match any whitespace character, including space, tab, form feed, and so on (equivalent to [ \f\n\r\t\v]) | Not supported | Not supported | \s | \s
\W | Match any non-word character (equivalent to [^A-Za-z0-9_]) | \W | \W | \W | \W
\w | Match any word character, including the underscore (equivalent to [A-Za-z0-9_]) | \w | \w | \w | \w
\B | Match a non-word boundary, e.g. 'er\B' matches 'er' in 'verb', but not 'er' in 'never' | \B | \B | \B | \B
\b | Match a word boundary, i.e. the position between a word and a space, e.g. 'er\b' matches 'er' in 'never', but not 'er' in 'verb' | \b | \b | \b | \b
\t | Match a horizontal tab (equivalent to \x09 and \cI) | Not supported | Not supported | \t | \t
\v | Match a vertical tab (equivalent to \x0b and \cK) | Not supported | Not supported | \v | \v
\n | Match a newline (equivalent to \x0a and \cJ) | Not supported | Not supported | \n | \n
\f | Match a form feed (equivalent to \x0c and \cL) | Not supported | Not supported | \f | \f
\r | Match a carriage return (equivalent to \x0d and \cM) | Not supported | Not supported | \r | \r
\\ | Match the escape character '\' itself | \\ | \\ | \\ | \\
\cx | Match the control character indicated by x, e.g. \cM matches Control-M, the carriage return. x must be one of A-Z or a-z; otherwise c is treated as a literal 'c' | Not supported | Not supported | Not supported | \cx
\xn | Match n, where n is a hexadecimal escape value of exactly two digits, e.g. '\x41' matches 'A'; '\x041' is equivalent to '\x04' followed by "1". This allows ASCII codes to be used in regular expressions | Not supported | Not supported | \xn | \xn
\num | Match num, where num is a positive integer; a backreference to a captured match | \num | \num | \num | \num
[:alnum:] | Match any letter or digit ([A-Za-z0-9]), e.g. '[[:alnum:]]' | [:alnum:] | [:alnum:] | Not supported | [:alnum:]
[:alpha:] | Match any letter ([A-Za-z]), e.g. '[[:alpha:]]' | [:alpha:] | [:alpha:] | Not supported | [:alpha:]
[:digit:] | Match any digit ([0-9]), e.g. '[[:digit:]]' | [:digit:] | [:digit:] | Not supported | [:digit:]
[:lower:] | Match any lowercase letter ([a-z]), e.g. '[[:lower:]]' | [:lower:] | [:lower:] | Not supported | [:lower:]
[:upper:] | Match any uppercase letter ([A-Z]) | [:upper:] | [:upper:] | Not supported | [:upper:]
[:space:] | Any whitespace character, including tab and space, e.g. '[[:space:]]' | [:space:] | [:space:] | Not supported | [:space:]
[:blank:] | Space and tab, e.g. '[[:blank:]]' is equivalent to '[ \t]' | [:blank:] | [:blank:] | Not supported | [:blank:]
[:graph:] | Any visible, printable character (Note: spaces and line breaks are not included), e.g. '[[:graph:]]' | [:graph:] | [:graph:] | Not supported | [:graph:]
[:print:] | Any printable character (Note: excludes [:cntrl:], the string terminator '\0', and the EOF marker (-1), but includes the space character), e.g. '[[:print:]]' | [:print:] | [:print:] | Not supported | [:print:]
[:cntrl:] | Any control character (the first 32 characters of the ASCII set, i.e. decimal 0 to 31, such as line feed and tab), e.g. '[[:cntrl:]]' | [:cntrl:] | [:cntrl:] | Not supported | [:cntrl:]
[:punct:] | Any punctuation mark (excluding the character sets [:alnum:], [:cntrl:], [:space:]) | [:punct:] | [:punct:] | Not supported | [:punct:]
[:xdigit:] | Any hexadecimal digit (i.e. 0-9, a-f, A-F) | [:xdigit:] | [:xdigit:] | Not supported | [:xdigit:]
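
A few of the rows above can be checked quickly with Python's re module, which corresponds to the table's Python column. This is a small sketch rather than a full test of every row; the variable names and sample strings are chosen only for illustration. One assumption worth stating: for text patterns Python's \d also accepts non-ASCII digits by default, so the re.ASCII flag is passed to get the [0-9] meaning the table describes.

```python
import re

# \d and \D; re.ASCII restricts \d to [0-9]
digits = re.findall(r"\d+", "room 42, floor 7", flags=re.ASCII)
non_digits = re.findall(r"\D+", "a1b22c", flags=re.ASCII)

# \b and \B around 'er', using the table's 'never' / 'verb' example
er_at_word_end = [w for w in ("never", "verb") if re.search(r"er\b", w)]
er_inside_word = [w for w in ("never", "verb") if re.search(r"er\B", w)]

# Greedy versus non-greedy quantifiers on "oooo"
greedy = re.search(r"o+", "oooo").group()
lazy = re.search(r"o+?", "oooo").group()

# {n,m}: at most three o's are consumed
bounded = re.search(r"o{1,3}", "fooooood").group()

print(digits, non_digits)
print(er_at_word_end, er_inside_word)
print(greedy, lazy, bounded)
```

The same \d pattern works in grep only with the -P option; with basic or extended regexes, [0-9] or [[:digit:]] is the portable choice.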
