C# regular expressions

Time:2022-1-2

1. Introduction

A regular expression is a pattern that is matched against input text. The .NET Framework provides a regular expression engine that performs this matching. A pattern consists of one or more character literals, operators, or constructs. The common characters, operators, and constructs used to define regular expressions are listed below by category.
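As a quick illustration of how the engine is called from code, here is a minimal sketch that uses the static methods of System.Text.RegularExpressions.Regex. The class name RegexIntro, the helper FirstNumber, and the sample sentence are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexIntro
{
    // Hypothetical helper: returns the first run of digits in the input, or "" if there is none.
    public static string FirstNumber(string input)
    {
        Match m = Regex.Match(input, @"\d+");
        return m.Success ? m.Value : string.Empty;
    }

    public static void Main()
    {
        string input = "Order 66 shipped in 3 days";

        // IsMatch only reports whether the pattern occurs anywhere in the input.
        Console.WriteLine(Regex.IsMatch(input, @"\d+"));   // True

        // Match returns the first occurrence.
        Console.WriteLine(FirstNumber(input));             // 66

        // Matches returns every non-overlapping occurrence, from left to right.
        foreach (Match m in Regex.Matches(input, @"\d+"))
        {
            Console.WriteLine(m.Value + " at index " + m.Index);
        }
    }
}
```

IsMatch is enough for validation, while Match and Matches are the ones to use when the matched text itself is needed.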

Character escapes:

The backslash character (\) in a regular expression indicates either that the character following it is a special character, or that it should be interpreted literally.

Escape character | Description | Pattern | Matches
\a | Matches the bell (alarm) character, \u0007. | \a | "\u0007" in "Warning!" + '\u0007'
\b | In a character class, matches a backspace, \u0008. | [\b]{3,} | "\b\b\b\b" in "\b\b\b\b"
\t | Matches a tab, \u0009. | (\w+)\t | "Name\t" and "Addr\t" in "Name\tAddr\t"
\r | Matches a carriage return, \u000D. (\r is not equivalent to the newline character, \n.) | \r\n(\w+) | "\r\nHello" in "\r\nHello\nWorld."
\v | Matches a vertical tab, \u000B. | [\v]{2,} | "\v\v\v" in "\v\v\v"
\f | Matches a form feed, \u000C. | [\f]{2,} | "\f\f\f" in "\f\f\f"
\n | Matches a newline, \u000A. | \r\n(\w+) | "\r\nHello" in "\r\nHello\nWorld."
\e | Matches an escape, \u001B. | \e | "\x001B" in "\x001B"
\nnn | Specifies a character in octal notation (nnn consists of two or three digits). | \w\040\w | "a b" and "c d" in "a bc d"
\xnn | Specifies a character in hexadecimal notation (nn consists of exactly two digits). | \w\x20\w | "a b" and "c d" in "a bc d"
\cX, \cx | Matches the ASCII control character specified by X or x, where X or x is the letter of the control character. | \cC | "\x0003" in "\x0003" (Ctrl-C)
\unnnn | Matches a Unicode character in hexadecimal notation (exactly four digits, represented by nnnn). | \w\u0020\w | "a b" and "c d" in "a bc d"
\ | When followed by a character that is not recognized as an escape, matches that character literally. | \d+[\+-x\*]\d+ | "2+2" and "3*9" in "(2+2) * 3*9"
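The escapes listed above can be tried out with a short program. This is only a sketch: the class EscapeDemo and the helper AllMatches are hypothetical names. Verbatim strings (@"...") are used so that the backslashes reach the regular expression engine unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical helper: collects every match of the pattern into an array of strings.
    public static string[] AllMatches(string input, string pattern)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        string[] result = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            result[i] = matches[i].Value;
        }
        return result;
    }

    public static void Main()
    {
        // \x20 and \u0020 both denote the space character.
        Console.WriteLine(string.Join("|", AllMatches("a bc d", @"\w\x20\w")));   // a b|c d
        Console.WriteLine(string.Join("|", AllMatches("a bc d", @"\w\u0020\w"))); // a b|c d

        // \. matches a literal period, while an unescaped . matches any character except \n.
        Console.WriteLine(Regex.IsMatch("3x14", @"3\.14")); // False
        Console.WriteLine(Regex.IsMatch("3x14", @"3.14"));  // True

        // Regex.Escape adds the backslashes needed to treat a piece of text literally.
        Console.WriteLine(Regex.Escape("1+1=2?")); // 1\+1=2\?
    }
}
```

Regex.Escape is useful when part of a pattern comes from user input and must not be interpreted as operators.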

Character classes:

A character class matches any one of a set of characters.

Character class | Description | Pattern | Matches
[character_group] | Matches any single character in character_group. By default, the match is case-sensitive. | [mn] | "m" in "mat", "m" and "n" in "moon"
[^character_group] | Negation: matches any single character that is not in character_group. By default, characters in character_group are case-sensitive. | [^aei] | "v" and "l" in "avail"
[first-last] | Character range: matches any single character in the range from first to last. | [b-d]irds | "birds", "cirds" and "dirds"
. | Wildcard: matches any single character except \n. To match a literal period character (. or \u002E), precede it with the escape character (\.). | a.e | "ave" in "have" and "ate" in "mate"
\p{name} | Matches any single character in the Unicode general category or named block specified by name. | \p{Lu} | "C" and "L" in "City Lights"
\P{name} | Matches any single character that is not in the Unicode general category or named block specified by name. | \P{Lu} | "i", "t" and "y" in "City"
\w | Matches any word character. | \w | "R", "o", "o", "m" and "1" in "Room#1"
\W | Matches any non-word character. | \W | "#" in "Room#1"
\s | Matches any white-space character. | \w\s | "D " in "ID A1.3"
\S | Matches any non-white-space character. | \s\S | " _" in "int __ctr"
\d | Matches any decimal digit. | \d | "4" in "4 = IV"
\D | Matches any character that is not a decimal digit. | \D | " ", "=", " ", "I" and "V" in "4 = IV"
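A few of the character classes above, run against the same sample strings. This is a small sketch: CharClassDemo and MatchesOf are hypothetical names introduced only for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Hypothetical helper: returns all matches joined with commas.
    public static string MatchesOf(string input, string pattern)
    {
        return string.Join(",", Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value));
    }

    public static void Main()
    {
        Console.WriteLine(MatchesOf("avail", "[^aei]"));        // v,l
        Console.WriteLine(MatchesOf("City Lights", @"\p{Lu}")); // C,L
        Console.WriteLine(MatchesOf("4 = IV", @"\d"));          // 4
        Console.WriteLine(MatchesOf("Room#1", @"\W"));          // #

        // Character groups are case-sensitive unless RegexOptions.IgnoreCase is passed.
        Console.WriteLine(Regex.IsMatch("MAT", "[mn]"));                          // False
        Console.WriteLine(Regex.IsMatch("MAT", "[mn]", RegexOptions.IgnoreCase)); // True
    }
}
```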

Grouping constructs:

Grouping constructs delineate subexpressions of a regular expression and are typically used to capture substrings of the input string.

Grouping construct | Description | Pattern | Matches
(subexpression) | Captures the matched subexpression and assigns it a one-based ordinal number (group 0 is the whole match). | (\w)\1 | "ee" in "deep"
(?<name>subexpression) | Captures the matched subexpression into a named group. | (?<double>\w)\k<double> | "ee" in "deep"
(?<name1-name2>subexpression) | Defines a balancing group definition. | (((?'Open'\()[^\(\)]*)+((?'Close-Open'\))[^\(\)]*)+)*(?(Open)(?!))$ | "((1-3)*(3-1))" in "3+2^((1-3)*(3-1))"
(?:subexpression) | Defines a noncapturing group. | Write(?:Line)? | "WriteLine" in "Console.WriteLine()"
(?imnsx-imnsx:subexpression) | Applies or disables the specified options within subexpression. | A\d{2}(?i:\w+)\b | "A12xl" and "A12XL" in "A12xl A12XL a12xl"
(?=subexpression) | Zero-width positive lookahead assertion. | \w+(?=\.) | "is", "ran" and "out" in "He is. The dog ran. The sun is out."
(?!subexpression) | Zero-width negative lookahead assertion. | \b(?!un)\w+\b | "sure" and "used" in "unsafe sure unit used"
(?<=subexpression) | Zero-width positive lookbehind assertion. | (?<=19)\d{2}\b | "99", "50" and "05" in "1851 1999 1950 1905 2003"
(?<!subexpression) | Zero-width negative lookbehind assertion. | (?<!wo)man\b | "man" in "Hi woman Hi man"
(?>subexpression) | Atomic group: a nonbacktracking (also known as "greedy") subexpression. | [13579](?>A+B+) | "1ABB", "3ABB" and "5AB" in "1ABB 3ABBC 5AB 5AC"
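To show how captured groups are read back in code, here is a minimal sketch. The class GroupDemo, the helper NamedGroup, the group names year, month and day, and the date pattern are all assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Hypothetical helper: returns the text captured by a named group,
    // or "" when the pattern does not match at all.
    public static string NamedGroup(string input, string pattern, string groupName)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups[groupName].Value : string.Empty;
    }

    public static void Main()
    {
        // Numbered groups: Groups[0] is the whole match, Groups[1] the first pair of parentheses.
        Match doubled = Regex.Match("deep", @"(\w)\1");
        Console.WriteLine(doubled.Groups[0].Value); // ee
        Console.WriteLine(doubled.Groups[1].Value); // e

        // Named groups are read by name.
        string datePattern = @"(?<year>\d{4})-(?<month>\d{1,2})-(?<day>\d{1,2})";
        Console.WriteLine(NamedGroup("Time:2022-1-2", datePattern, "year")); // 2022
        Console.WriteLine(NamedGroup("Time:2022-1-2", datePattern, "day"));  // 2

        // A lookahead checks what follows without making it part of the match.
        foreach (Match m in Regex.Matches("He is. The dog ran. The sun is out.", @"\w+(?=\.)"))
        {
            Console.WriteLine(m.Value); // is, ran, out (one per line)
        }
    }
}
```

Named groups tend to be easier to maintain than numbered ones, because inserting another pair of parentheses does not shift their names.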

Quantifiers:

A quantifier specifies how many instances of the previous element (which can be a character, a group, or a character class) must be present in the input string for a match to occur. Quantifiers include the language elements listed in the following table.

Quantifier | Description | Pattern | Matches
* | Matches the previous element zero or more times. | \d*\.\d | ".0", "19.9", "219.9"
+ | Matches the previous element one or more times. | be+ | "bee" in "been" and "be" in "bent"
? | Matches the previous element zero or one time. | rai?n | "ran", "rain"
{n} | Matches the previous element exactly n times. | ,\d{3} | ",043" in "1,043.6"; ",876", ",543" and ",210" in "9,876,543,210"
{n,} | Matches the previous element at least n times. | \d{2,} | "166", "29", "1930"
{n,m} | Matches the previous element at least n times, but no more than m times. | \d{3,5} | "166", "17668", "19302" in "193024"
*? | Matches the previous element zero or more times, but as few times as possible. | \d*?\.\d | ".0", "19.9", "219.9"
+? | Matches the previous element one or more times, but as few times as possible. | be+? | "be" in "been" and "be" in "bent"
?? | Matches the previous element zero or one time, but as few times as possible. | rai??n | "ran", "rain"
{n}? | Matches the previous element exactly n times. | ,\d{3}? | ",043" in "1,043.6"; ",876", ",543" and ",210" in "9,876,543,210"
{n,}? | Matches the previous element at least n times, but as few times as possible. | \d{2,}? | "16" in "166", "29", "19" and "30" in "1930"
{n,m}? | Matches the previous element between n and m times, but as few times as possible. | \d{3,5}? | "166", "176" in "17668", "193" and "024" in "193024"
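The difference between greedy and lazy quantifiers is easiest to see side by side. This is a small sketch: QuantifierDemo and MatchesOf are hypothetical names, and the tag-like sample string is made up.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Hypothetical helper: returns all matches joined with ", ".
    public static string MatchesOf(string input, string pattern)
    {
        return string.Join(", ", Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value));
    }

    public static void Main()
    {
        // Greedy: takes as many digits as allowed (up to 5).
        Console.WriteLine(MatchesOf("193024", @"\d{3,5}"));  // 19302

        // Lazy: stops as soon as the minimum (3) is reached.
        Console.WriteLine(MatchesOf("193024", @"\d{3,5}?")); // 193, 024

        // Greedy .+ runs to the last '>' in the input.
        Console.WriteLine(MatchesOf("<b>one</b>", "<.+>"));  // <b>one</b>

        // Lazy .+? stops at the first '>' it can reach.
        Console.WriteLine(MatchesOf("<b>one</b>", "<.+?>")); // <b>, </b>
    }
}
```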

2. Code

// Requires: using System; using System.Text.RegularExpressions;
static void Main(string[] args)
{
    // Check whether the input contains any of the characters a, b, or c
    string str = Console.ReadLine();               // Store the text entered by the user in str
    string regex = @"[abc]";                       // @"..." is a verbatim string, the usual way to write a pattern in C#
    bool isMatch = Regex.IsMatch(str, regex);      // IsMatch(input, pattern) returns a bool
    Console.WriteLine(isMatch ? "Match [abc]" : "not Match[abc]"); // Print the result
    Console.WriteLine();
}

Check whether the input contains a, b, or c

// Requires: using System; using System.Text; using System.Text.RegularExpressions;
static void Main(string[] args)
{
    StringBuilder s = new StringBuilder("www.baidu.com", 50); // StringBuilder with an initial capacity of 50

    // Regex.Replace(input, pattern, replacement) replaces every position
    // that matches the pattern with the replacement string
    // Insert at the beginning
    string news = Regex.Replace(s.ToString(), "^", "website:"); // ^ matches the start of the string
    Console.WriteLine(news);                                    // website:www.baidu.com

    // Insert at the end
    news = Regex.Replace(s.ToString(), "$", "end"); // $ matches the end of the string
    Console.WriteLine(news);                        // www.baidu.comend
    Console.ReadLine();
}

Insert text at the beginning or end of a string with a regular expression

// Requires: using System; using System.Text.RegularExpressions;
static void Main(string[] args)
{
    string s = Console.ReadLine();            // s receives the user's input
    string regex = @"^\W*$";                  // The whole string consists only of non-word characters (no letters, digits, or underscores)
    bool isMatch = Regex.IsMatch(s, regex);   // Check whether the input satisfies the pattern
    Console.WriteLine(isMatch ? "Satisfied" : "not satisfied"); // Conditional (ternary) operator
    Console.WriteLine();
}

Match strings that consist entirely of characters other than letters, underscores, and digits

// Requires: using System; using System.Text.RegularExpressions;
static void Main(string[] args)
{
    string s = "abcdef";
    string regex = @"[^bde]";                          // [^bde] matches any single character except b, d, and e
    string newReplace = Regex.Replace(s, regex, "1");  // Replace every character in s other than b, d, and e with 1
    Console.WriteLine(newReplace);                     // 1b1de1
    Console.WriteLine();
}

Replace characters

// Requires: using System; using System.Text.RegularExpressions;
static void Main(string[] args)
{
    string qq = Console.ReadLine();        // Wait for user input
    string regex = @"^\d{5,11}$";          // The whole string must be 5 to 11 digits
    bool isQq = Regex.IsMatch(qq, regex);  // Test the input and return a Boolean value
    Console.WriteLine(isQq ? "Is QQ number" : "is not QQ number"); // Conditional (ternary) operator
    Console.WriteLine();
}

Match QQ number
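Two details of the .NET engine are worth knowing when validating digits this way: \d also matches decimal digits from other scripts, and $ also matches just before a final newline. The sketch below compares the pattern above with a stricter variant. The class QqCheck, the method names IsQqLoose and IsQqStrict, and the stricter pattern are assumptions for illustration, not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QqCheck
{
    // The pattern used in the example above.
    public static bool IsQqLoose(string s)
    {
        return Regex.IsMatch(s, @"^\d{5,11}$");
    }

    // Assumed stricter variant: ASCII digits only, and \z anchors at the very end of the string.
    public static bool IsQqStrict(string s)
    {
        return Regex.IsMatch(s, @"^[0-9]{5,11}\z");
    }

    public static void Main()
    {
        string arabicIndic = "\u0661\u0662\u0663\u0664\u0665"; // Arabic-Indic digits 1 to 5

        Console.WriteLine(IsQqLoose("12345"));      // True
        Console.WriteLine(IsQqStrict("12345"));     // True

        Console.WriteLine(IsQqLoose("12345\n"));    // True: $ matches before the final newline
        Console.WriteLine(IsQqStrict("12345\n"));   // False

        Console.WriteLine(IsQqLoose(arabicIndic));  // True: \d matches any Unicode decimal digit
        Console.WriteLine(IsQqStrict(arabicIndic)); // False
    }
}
```

For console input the difference rarely matters, since Console.ReadLine strips the line terminator, but it can matter when the text comes from a file or a web request.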

// Requires: using System; using System.Text.RegularExpressions;
static void Main(string[] args)
{
    // Four dot-separated numbers, each in the range 0 to 255
    string regex = @"^((([1]?\d\d?|2[0-4]\d|25[0-5])\.){3}([1]?\d\d?|2[0-4]\d|25[0-5]))$";
    while (true)
    {
        string s = Console.ReadLine();              // Wait for user input
        bool isMatch = Regex.IsMatch(s, regex);     // Check whether the IP address is valid
        Console.WriteLine(isMatch ? "Is an IP address" : "is not an IP address"); // Conditional (ternary) operator
    }
}

Verify IP address
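To see how the pattern behaves without typing input by hand, it can be wrapped in a method and run against a few fixed strings. This is only a sketch: the class Ipv4Check, the method IsIpv4, and the sample addresses are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Ipv4Check
{
    // The same pattern as in the example above.
    private const string Pattern =
        @"^((([1]?\d\d?|2[0-4]\d|25[0-5])\.){3}([1]?\d\d?|2[0-4]\d|25[0-5]))$";

    // Hypothetical helper: true when the input looks like a dotted IPv4 address.
    public static bool IsIpv4(string s)
    {
        return Regex.IsMatch(s, Pattern);
    }

    public static void Main()
    {
        Console.WriteLine(IsIpv4("192.168.1.1")); // True
        Console.WriteLine(IsIpv4("256.1.1.1"));   // False: 256 is out of range
        Console.WriteLine(IsIpv4("1.2.3"));       // False: only three parts
        Console.WriteLine(IsIpv4("01.2.3.4"));    // True: a leading zero is accepted
    }
}
```

Note that the pattern accepts two-digit parts with a leading zero such as "01". Whether that should count as valid depends on the application.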