Capturing and non-capturing groups in regular expressions

Time:2021-1-14

Capturing groups
Syntax:

(pattern)
Description: Matches the pattern and captures the result. The group number is assigned automatically.
Example: (abc)+d matches "abcd" or "abcabcd".

(?<name>pattern) or (?'name'pattern)
Description: Matches the pattern and captures the result under the group name "name".

\num
Description: A back-reference to a numbered capturing group, where num is a positive integer.
Example: (\w)(\w)\2\1 matches "abba".

\k<name> or \k'name'
Description: A back-reference to a named capturing group, where name is the name of the capturing group.
Example: (?<group>\w)abc\k<group> matches "xabcx".
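
The two back-reference forms can be tried out with a short program. This is a minimal sketch: the class and method names are made up for illustration, and the ^ and $ anchors are added here so that the whole input has to match.

```csharp
using System;
using System.Text.RegularExpressions;

static class BackreferenceDemo
{
    // Numbered back-reference: \2 and \1 must repeat what groups 2 and 1 captured.
    public static bool IsMirrored(string text)
    {
        return Regex.IsMatch(text, @"^(\w)(\w)\2\1$");
    }

    // Named back-reference: \k<group> must repeat what the group named "group" captured.
    public static bool IsWrapped(string text)
    {
        return Regex.IsMatch(text, @"^(?<group>\w)abc\k<group>$");
    }

    static void Main()
    {
        Console.WriteLine(IsMirrored("abba"));  // True
        Console.WriteLine(IsMirrored("abab"));  // False
        Console.WriteLine(IsWrapped("xabcx"));  // True
        Console.WriteLine(IsWrapped("xabcy"));  // False
    }
}
```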

Once parentheses mark a subexpression, the text it matched (that is, the content captured by the group) can be processed further, either later in the same expression or in the calling program. By default, every capturing group is given a group number automatically. Groups are numbered from left to right in the order of their opening parentheses: the first group is 1, the second is 2, and so on.
For example (the digits under the pattern mark where each group opens and closes):
(\d{4})-(\d{2}-(\d{2}))
1     1 2      3     32
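
To check the numbering of the nested groups, the pattern above can be run against a sample date. This is only a sketch; the class name, the method name and the input "2021-01-14" are chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class GroupNumberDemo
{
    // Returns the captured text of every group, indexed by group number.
    public static string[] Capture(string date)
    {
        Match m = Regex.Match(date, @"(\d{4})-(\d{2}-(\d{2}))");
        string[] result = new string[m.Groups.Count];
        for (int i = 0; i < m.Groups.Count; i++)
        {
            result[i] = m.Groups[i].Value;
        }
        return result;
    }

    static void Main()
    {
        string[] groups = Capture("2021-01-14");
        for (int i = 0; i < groups.Length; i++)
        {
            Console.WriteLine(i + " : " + groups[i]);
        }
        // 0 : 2021-01-14
        // 1 : 2021
        // 2 : 01-14
        // 3 : 14
    }
}
```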
The following program processes capturing groups: it parses a URL and displays every capturing group.
The output shows that the group numbers are assigned in order.
Regex.Match method

The code is as follows:
using System;
using System.Text.RegularExpressions;
namespace Wuhong.Test
{
    class Program
    {
        static void Main(string[] args)
        {
            //Target string
            string source = "http://reg-test-server:8080/download/file1.html# ";
            //Regular expression: scheme, server, optional port, path
            string regex = @"(\w+):\/\/([^/:]+)(:\d+)?([^# :]*)";
            Regex regUrl = new Regex(regex);
            //Match the regular expression
            Match m = regUrl.Match(source);
            Console.WriteLine(m.Success);
            if (m.Success)
            {
                //The capturing groups are stored in the Match.Groups collection. Group indexes start at 1; index 0 holds the whole matched string
                //Display in the format "group number : captured content"
                for (int i = 0; i < m.Groups.Count; i++)
                {
                    Console.WriteLine(string.Format("{0} : {1}", i, m.Groups[i]));
                }
            }
            Console.ReadLine();
        }
    }
}


You can also name the group of a subexpression yourself. The name can then be used to refer to the group in the expression or in the program, and the group number still works as well. However, when a regular expression mixes unnamed and named capturing groups, pay special attention to the numbering. In .NET, the unnamed capturing groups are numbered first, from left to right, and the named capturing groups are numbered after them.
For example:
(\d{4})-(?<date>\d{2}-(\d{2}))
1     1 3             2     23
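
The mixed numbering can be confirmed by asking the Regex object itself. The sketch below uses Regex.GetGroupNames and Regex.GroupNumberFromName; the class name, the field name and the sample date are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class MixedNumberingDemo
{
    // One named group ("date") between two unnamed groups.
    public static readonly Regex DatePattern =
        new Regex(@"(\d{4})-(?<date>\d{2}-(\d{2}))");

    static void Main()
    {
        // Unnamed groups come first, the named group is listed last.
        Console.WriteLine(string.Join(", ", DatePattern.GetGroupNames())); // 0, 1, 2, date
        Console.WriteLine(DatePattern.GroupNumberFromName("date"));        // 3

        Match m = DatePattern.Match("2021-01-14");
        Console.WriteLine(m.Groups[1].Value);      // 2021
        Console.WriteLine(m.Groups[2].Value);      // 14
        Console.WriteLine(m.Groups[3].Value);      // 01-14
        Console.WriteLine(m.Groups["date"].Value); // 01-14
    }
}
```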

Next, the program processes named capturing groups: it displays the group numbers produced by the mixed numbering rule and uses the captured content to build a replacement for the source string.
The output shows that the unnamed capturing groups are numbered first and the named capturing groups after them.
Regex.Replace method

The code is as follows:
using System;
using System.Text.RegularExpressions;
namespace Wuhong.Test
{
    class Program
    {
        static void Main(string[] args)
        {
            //Target string
            string source = "http://reg-test-server:8080/download/file1.html# ";
            //Regular expression, with two of the groups named
            string regex = @"(\w+):\/\/(?<server>[^/:]+)(?<port>:\d+)?([^# :]*)";
            Regex regUrl = new Regex(regex);
            //Match the regular expression
            Match m = regUrl.Match(source);
            Console.WriteLine(m.Success);
            if (m.Success)
            {
                //The capturing groups are stored in the Match.Groups collection. Group indexes start at 1; index 0 holds the whole matched string
                //Display in the format "group number : captured content"
                for (int i = 0; i < m.Groups.Count; i++)
                {
                    Console.WriteLine(string.Format("{0} : {1}", i, m.Groups[i]));
                }
            }
            //Replace the string
            //"$number" refers to the content of the capturing group with that number.
            //Here $1 is the scheme and $2 is the path, because the unnamed groups are numbered first.
            //Note that "$number" must not be followed directly by a digit. In that case, use the braced form "${name}" (or "${number}")
            string replacement = string.Format("$1://{0}{1}$2", "new-reg-test-server", "");
            string result = regUrl.Replace(source, replacement);
            Console.WriteLine(result);
            Console.ReadLine();
        }
    }
}
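
The case where a group reference is followed directly by a digit can be sketched as follows. The example reuses the URL pattern from the program above and appends the digit 2 to the server name; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

static class ReplacementDemo
{
    // Same URL pattern as above: groups 1 and 2 are unnamed, "server" and "port" are named.
    static readonly Regex RegUrl =
        new Regex(@"(\w+):\/\/(?<server>[^/:]+)(?<port>:\d+)?([^# :]*)");

    // Appends the digit 2 to the server name.
    // The braces in "${server}" mark where the reference ends,
    // so the following "2" is treated as literal text.
    public static string AppendTwoToServer(string url)
    {
        return RegUrl.Replace(url, "$1://${server}2${port}$2");
    }

    static void Main()
    {
        string source = "http://reg-test-server:8080/download/file1.html# ";
        Console.WriteLine(AppendTwoToServer(source));
        // http://reg-test-server2:8080/download/file1.html#
    }
}
```

The unmatched tail of the input (the "# " at the end) is left as it is, because Regex.Replace only rewrites the matched part.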

Non-capturing groups
Syntax:

(?:pattern)
Description: Matches the pattern but does not capture the result.
Example: industr(?:y|ies) matches "industry" or "industries".

(?=pattern)
Description: Zero-width positive lookahead. The match result is not captured.
Example: Windows (?=95|98|NT|2000) matches "Windows " in "Windows 2000", but not in "Windows 3.1".

(?!pattern)
Description: Zero-width negative lookahead. The match result is not captured.
Example: Windows (?!95|98|NT|2000) matches "Windows " in "Windows 3.1", but not in "Windows 2000".

(?<=pattern)
Description: Zero-width positive lookbehind. The match result is not captured.
Example: (?<=Office|Word|Excel)2000 matches "2000" in "Office2000", but not in "Windows2000".

(?<!pattern)
Description: Zero-width negative lookbehind. The match result is not captured.
Example: (?<!Office|Word|Excel)2000 matches "2000" in "Windows2000", but not in "Office2000".
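
The constructs in this table can be compared side by side in a small program. This is a sketch with made-up helper names; note that a lookbehind is written before the text it guards, as in (?<=Office|Word|Excel)2000.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookaroundDemo
{
    // Returns the matched text, or "(no match)" when the pattern does not match.
    public static string Find(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Value : "(no match)";
    }

    // Number of entries in Match.Groups (index 0 is always the whole match).
    public static int GroupCount(string input, string pattern)
    {
        return Regex.Match(input, pattern).Groups.Count;
    }

    static void Main()
    {
        // (?:...) groups without capturing, so only group 0 exists.
        Console.WriteLine(GroupCount("industries", @"industr(?:y|ies)")); // 1
        Console.WriteLine(GroupCount("industries", @"industr(y|ies)"));   // 2

        // Lookahead: the version number is checked but is not part of the match.
        Console.WriteLine(Find("Windows 2000", @"Windows (?=95|98|NT|2000)")); // "Windows "
        Console.WriteLine(Find("Windows 3.1", @"Windows (?=95|98|NT|2000)"));  // (no match)
        Console.WriteLine(Find("Windows 3.1", @"Windows (?!95|98|NT|2000)"));  // "Windows "

        // Lookbehind: the product name is checked but is not part of the match.
        Console.WriteLine(Find("Office2000", @"(?<=Office|Word|Excel)2000"));  // 2000
        Console.WriteLine(Find("Windows2000", @"(?<=Office|Word|Excel)2000")); // (no match)
        Console.WriteLine(Find("Windows2000", @"(?<!Office|Word|Excel)2000")); // 2000
        Console.WriteLine(Find("Office2000", @"(?<!Office|Word|Excel)2000"));  // (no match)
    }
}
```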

A non-capturing group only takes part in matching: it does not capture the result and is not assigned a group number, so its content cannot be processed further in the expression or in the program.
The only difference between (?:pattern) and (pattern) is that the result is not captured.
The other four constructs are lookaround assertions. They test whether the text before (or after) the current position matches pattern, and the match result does not include the text matched by pattern.
For example:
(?<=<(\w+)>).*(?=<\/\1>) matches the content of a simple HTML tag that has no attributes. For <div>Hello</div>, the match result is Hello, without the prefix <div> and the suffix </div>.
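
A short sketch of the tag example in C#, assuming the pattern (?<=<(\w+)>).*(?=<\/\1>) and the input <div>Hello</div>; the class and method names are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class TagContentDemo
{
    // Returns the text between a simple opening tag and its matching closing tag,
    // or an empty string when no such pair is found.
    public static string InnerText(string html)
    {
        // The lookbehind checks the opening tag and remembers its name in group 1;
        // the lookahead requires a closing tag with the same name (\1).
        Match m = Regex.Match(html, @"(?<=<(\w+)>).*(?=<\/\1>)");
        return m.Success ? m.Value : "";
    }

    static void Main()
    {
        Console.WriteLine(InnerText("<div>Hello</div>")); // Hello
        Console.WriteLine(InnerText("<p>Hi</div>"));      // prints an empty line: the tags do not pair up
    }
}
```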
The following program uses non-capturing constructs to extract zip codes.
The output shows that neither the negative lookbehind nor the negative lookahead is captured.
Regex.Matches method

The code is as follows:
using System;
using System.Text.RegularExpressions;
namespace Wuhong.Test
{
    class Program
    {
        static void Main(string[] args)
        {
            //Target string
            //Assumed split: the separators between the six numbers are missing in the source text
            string source = "There are six groups of numbers: 010001, 100, 21000, 310000, 410011, 510002. Pick out the zip codes.";
            //Regular expression: six digits, the first one non-zero, with no digit directly before or after
            string regex = @"(?<!\d)([1-9]\d{5})(?!\d)";
            Regex regUrl = new Regex(regex);
            //Get all matches
            MatchCollection mList = regUrl.Matches(source);
            for (int j = 0; j < mList.Count; j++)
            {
                //Display every group of each match. Each match has only group 0 (the whole match) and group 1. The negative lookbehind and the negative lookahead are not captured
                for (int i = 0; i < mList[j].Groups.Count; i++)
                {
                    Console.WriteLine(string.Format("{0} : {1} : {2}", j, i, mList[j].Groups[i]));
                }
            }
            Console.ReadLine();
        }
    }
}

Comments
Syntax:

(?#comment)
Description: comment is a comment. It has no effect on how the regular expression is processed.
Example: 2[0-4]\d(?#200-249)|25[0-5](?#250-255)|1?\d\d?(?#0-199) matches the integers 0-255.

This one needs no further explanation.
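
To see that the comments really change nothing, the commented pattern can be compared with the same pattern stripped of its comments. This is a sketch: the class and method names are assumptions, and the ^(?:...)$ wrapper is added here so that the whole input has to be a number from 0 to 255.

```csharp
using System;
using System.Text.RegularExpressions;

static class CommentDemo
{
    const string WithComments = @"2[0-4]\d(?#200-249)|25[0-5](?#250-255)|1?\d\d?(?#0-199)";
    const string WithoutComments = @"2[0-4]\d|25[0-5]|1?\d\d?";

    // True when the whole text is an integer from 0 to 255.
    public static bool IsByte(string text, bool useComments)
    {
        string core = useComments ? WithComments : WithoutComments;
        return Regex.IsMatch(text, "^(?:" + core + ")$");
    }

    static void Main()
    {
        Console.WriteLine(IsByte("255", true));  // True
        Console.WriteLine(IsByte("256", true));  // False

        // Both versions of the pattern agree on every input tried here.
        bool same = true;
        for (int i = 0; i <= 300; i++)
        {
            string s = i.ToString();
            if (IsByte(s, true) != IsByte(s, false)) same = false;
        }
        Console.WriteLine(same);                 // True
    }
}
```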