Specifically, matching with `(?>…)` is no different from a normal match, but once the match moves past this construct (that is, past its closing parenthesis), all the saved states created inside it are discarded and can no longer be backtracked to.
In other words, when an atomic group finishes matching, the text it matched is solidified into a single unit that can only be kept or given up as a whole. Any untried saved states of the subexpression inside the parentheses are no longer available, so backtracking can never select them (at least, not those that are "locked in" when the construct finishes matching).
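A minimal sketch of this behaviour in PHP, using toy patterns that are not part of the example that follows. The alternation inside `(?>…)` commits to the first branch that succeeds, whereas the same alternation in an ordinary non-capturing group can still be backtracked into.

```php
<?php
// Toy patterns chosen for illustration only.
$atomic = '/a(?>bc|b)c/';   // atomic group: untried alternatives are discarded on exit
$normal = '/a(?:bc|b)c/';   // ordinary group: untried alternatives stay available

// For "abc" the group first takes "bc", leaving nothing for the final c.
// The ordinary group can backtrack to the "b" alternative; the atomic group cannot.
$results = [
    'atomic abc'  => preg_match($atomic, 'abc'),   // 0
    'normal abc'  => preg_match($normal, 'abc'),   // 1
    'atomic abcc' => preg_match($atomic, 'abcc'),  // 1
];
print_r($results);
```

The third case shows that the atomic group is not stricter in general: it only refuses to revisit a choice it has already made.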
As an example, suppose we have to process a batch of data whose original format was like 123.456. Later, because of floating-point display problems, some of the values turned into something like 123.4566000000789. We want to keep only two or three digits after the decimal point, and the third digit, if kept, must not be 0. How should the regular expression be written? (Only the digits after the decimal point are considered below.) Once we have the regex, we match it against the data and replace the original text with the captured result.
A first attempt is the pattern `(\.\d\d[1-9]?)\d*`, with a replacement text that refers back to group 1 of the match (`$1`).
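The original listing is not available here, so the following is only a sketch of how the first attempt might look with preg_replace. The sample values and variable names are illustrative assumptions.

```php
<?php
// First attempt: keep two or three decimals in group 1, swallow the rest with \d*.
$pattern = '/(\.\d\d[1-9]?)\d*/';

// Illustrative sample values (assumed, not taken from real data).
$samples = ['123.4566000000789', '123.450000', '123.456'];

$cleaned = [];
$counts  = [];
foreach ($samples as $value) {
    // '$1' refers back to group 1; $count receives the number of replacements made.
    $cleaned[$value] = preg_replace($pattern, '$1', $value, -1, $count);
    $counts[$value]  = $count;
    echo $value, ' => ', $cleaned[$value], ' (replacements: ', $count, ")\n";
}
```

Every sample reports one replacement, including 123.456, whose text comes out unchanged.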
Obviously, for data that is already in the 123.456 format, this pattern does pointless work: nothing follows the three decimals, so `\d*` matches nothing and the text is simply replaced with itself. To improve efficiency we want to skip such values. That seems easy: change the trailing quantifier `*` to `+`, giving `(\.\d\d[1-9]?)\d+`. The intent is that values with only one or two digits after the decimal point, such as 123.45, are not processed at all, while values with more than three digits are processed as before.
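A sketch of the revised call, assuming the same preg_replace setup as before (sample values are illustrative). Printing the result and the replacement count for each value makes it easy to check whether the change behaves as intended.

```php
<?php
// Second attempt: \d+ demands at least one more digit after group 1.
$pattern = '/(\.\d\d[1-9]?)\d+/';

$results = [];
$counts  = [];
foreach (['123.4566000000789', '123.45', '123.456'] as $value) {
    $results[$value] = preg_replace($pattern, '$1', $value, -1, $count);
    $counts[$value]  = $count;
    echo $value, ' => ', $results[$value], ' (replacements: ', $count, ")\n";
}
```

With these inputs, 123.45 is skipped as hoped, but 123.456 is still replaced and comes out as 123.45. The analysis that follows explains why.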
But is this regex really correct? Let's walk through the matching process.
The string is "123.456" and the regular expression is `(\.\d\d[1-9]?)\d+`.
First, the engine moves past the "123" before the decimal point, where the pattern cannot start matching.
Then `\.` matches "." and passes control to the first `\d`, which matches "4"; control passes to the second `\d`, which matches "5". Control then reaches `[1-9]?`. Because the quantifier `?` is greedy (quantifiers try to match first), it attempts the match, but it first leaves a backtracking point for the alternative of matching nothing. `[1-9]` then matches "6" and control passes to `\d+`. `\d+` finds no characters left, so, following the "last in, first out" rule, the engine returns to the most recent backtracking point. `[1-9]?` gives back the "6" it had matched and matches nothing instead, and `\d+` now matches "6". The match is complete, but group 1, `(\.\d\d[1-9]?)`, has captured ".45" rather than the ".456" we wanted, because the "6" was taken by `\d+`. So what should we do? Can we stop `[1-9]?` from backtracking once it has matched? This is where the atomic group described above comes in; the regex engine behind PHP's preg_replace function supports atomic grouping. Written with an atomic group, the pattern becomes `(\.\d\d(?>[1-9]?))\d+`.
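A sketch of the atomic-group version as a preg_replace call. The pattern follows the text; the sample values and variable names are illustrative assumptions.

```php
<?php
// (?>[1-9]?) is an atomic group: once it has matched, its choice cannot be undone.
$pattern = '/(\.\d\d(?>[1-9]?))\d+/';

$results = [];
$counts  = [];
foreach (['123.4566000000789', '123.450000', '123.456'] as $value) {
    $results[$value] = preg_replace($pattern, '$1', $value, -1, $count);
    $counts[$value]  = $count;
    echo $value, ' => ', $results[$value], ' (replacements: ', $count, ")\n";
}
```

Here 123.456 reports zero replacements, while the longer values are still trimmed to 123.456 and 123.45.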
With this pattern, the string "123.456" no longer matches and is left untouched, which is exactly what we require.
So let's take a closer look at `(\.\d\d(?>[1-9]?))\d+`.
Inside the atomic group the quantifier works normally, so if `[1-9]` cannot match, the engine falls back to the saved state left by `?`. The match then leaves the atomic group and moves on to `\d+`. In this case, no saved state has to be discarded when control leaves the atomic group (because none created inside the group remains).
If `[1-9]` can match, the saved state left by `?` still exists when the match leaves the atomic group, but it is discarded because it belongs to the group that has just been closed.
This happens when matching '.625' or '.625000'. In the latter case, discarding that state causes no trouble, because `\d+` matches the trailing "000" of '.625000' and the regex completes its match. For '.625', however, `\d+` cannot match, so the engine needs to backtrack, but it cannot because the saved state no longer exists. With no saved state to return to, the whole match fails, and '.625' is left unprocessed, which is exactly what we want.
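The three situations described above can be checked with a small sketch using preg_match. The helper `describe()` is a hypothetical name introduced here for illustration, and '.620' is an added sample for the case where `[1-9]` cannot match.

```php
<?php
$pattern = '/(\.\d\d(?>[1-9]?))\d+/';

// Hypothetical helper: reports the overall match and what group 1 captured.
function describe(string $pattern, string $subject): string
{
    if (preg_match($pattern, $subject, $m) === 1) {
        return "matched '{$m[0]}', group 1 = '{$m[1]}'";
    }
    return 'no match';
}

foreach (['.625', '.625000', '.620'] as $subject) {
    echo $subject, ': ', describe($pattern, $subject), "\n";
}
// .625: no match
// .625000: matched '.625000', group 1 = '.625'
// .620: matched '.620', group 1 = '.62'
```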