# Regular expression rules and common methods of PHP

Time：2020-12-11

Note: This article is reposted from the Pick up the star blog.

Regular expressions in PHP

```
"^\d+$"  // non-negative integer (positive integer + 0)
"^[0-9]*[1-9][0-9]*$"  // positive integer
"^((-\d+)|(0+))$"  // non-positive integer (negative integer + 0)
"^-[0-9]*[1-9][0-9]*$"  // negative integer
"^-?\d+$"  // integer
"^\d+(\.\d+)?$"  // non-negative floating point number (positive floating point number + 0)
"^(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*))$"  // positive floating point number
"^((-\d+(\.\d+)?)|(0+(\.0+)?))$"  // non-positive floating point number (negative floating point number + 0)
"^(-(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*)))$"  // negative floating point number
"^(-?\d+)(\.\d+)?$"  // floating point number
"^[A-Za-z]+$"  // string of English letters
"^[A-Z]+$"  // string of uppercase English letters
"^[a-z]+$"  // string of lowercase English letters
"^[A-Za-z0-9]+$"  // string of digits and English letters
"^\w+$"  // string of digits, English letters, or underscores
"^[\w-]+(\.[\w-]+)*@[\w-]+(\.[\w-]+)+$"  // email address
"^[a-zA-Z]+://(\w+(-\w+)*)(\.(\w+(-\w+)*))*(\?\S*)?$"  // URL
/^(\d{2}|\d{4})-((0[1-9])|(1[0-2]))-((0[1-9])|([12]\d)|(3[01]))$/  // year-month-day
/^((0[1-9])|(1[0-2]))\/((0[1-9])|([12]\d)|(3[01]))\/(\d{2}|\d{4})$/  // month/day/year
"^([\w.-]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$"  // email
/^((\+?[0-9]{2,4}\-[0-9]{3,4}\-)|([0-9]{3,4}\-))?([0-9]{7,8})(\-[0-9]+)?$/  // phone number
"^(\d{1,2}|1\d\d|2[0-4]\d|25[0-5])\.(\d{1,2}|1\d\d|2[0-4]\d|25[0-5])\.(\d{1,2}|1\d\d|2[0-4]\d|25[0-5])\.(\d{1,2}|1\d\d|2[0-4]\d|25[0-5])$"  // IP address
```
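
The quoted patterns in the list above are bare: PHP's `preg_*` functions also need delimiters around them. The following is a minimal sketch of one way to apply them, not a definitive implementation. The helper name `matches_pattern` is hypothetical, and the sketch assumes the pattern does not itself contain the `~` character used as the delimiter. The `D` modifier is added so that `$` does not match before a trailing newline.

```php
<?php
// Hypothetical helper (not from the original post): wraps a bare pattern
// in "~" delimiters and tests the whole string against it.
// Assumption: the pattern itself contains no "~".
function matches_pattern(string $pattern, string $value): bool
{
    // "D" makes "$" match only at the very end of the subject.
    return preg_match('~' . $pattern . '~D', $value) === 1;
}

$nonNegativeInt = '^\d+$';
$positiveFloat  = '^(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*))$';
$email          = '^[\w-]+(\.[\w-]+)*@[\w-]+(\.[\w-]+)+$';

var_dump(matches_pattern($nonNegativeInt, '2020'));     // bool(true)
var_dump(matches_pattern($nonNegativeInt, '-5'));       // bool(false)
var_dump(matches_pattern($positiveFloat, '0.50'));      // bool(true)
var_dump(matches_pattern($positiveFloat, '0.0'));       // bool(false)
var_dump(matches_pattern($email, 'user@example.com'));  // bool(true)
```

Single-quoted PHP strings are used here so that backslashes such as `\d` reach the regex engine unchanged.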

Regular expression matching Chinese characters: `[\u4e00-\u9fa5]` (PCRE in PHP does not accept `\u`; write `[\x{4e00}-\x{9fa5}]` with the `u` modifier)
Regular expression matching double-byte characters (including Chinese characters): `[^\x00-\xff]`
Regular expression matching empty lines: `\n[\s| ]*\r`
Regular expression matching HTML tags: `/<(.*)>.*<\/\1>|<(.*) \/>/`
Regular expression matching leading and trailing spaces: `(^\s*)|(\s*$)`
Regular expression matching an email address: `\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*`
Regular expression matching a URL: `^[a-zA-Z]+://(\w+(-\w+)*)(\.(\w+(-\w+)*))*(\?\S*)?$`
Check whether an account name is legal (starts with a letter, 5-16 characters, letters, digits, and underscores allowed): `^[a-zA-Z][a-zA-Z0-9_]{4,15}$`
Match a domestic (Chinese) phone number: `(\d{3}-|\d{4}-)?(\d{8}|\d{7})?`
Match a Tencent QQ number: `^[1-9]*[1-9][0-9]*$`
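
As a rough illustration of two of the patterns above, the sketch below validates an account name and uses the leading/trailing-space pattern as a trim. The sample strings are made up for the example, and the `D` modifier is an addition that keeps `$` from matching before a final newline.

```php
<?php
// Account rule from the text: starts with a letter, 5-16 characters in total,
// letters, digits, and underscores allowed.
$accountPattern = '/^[a-zA-Z][a-zA-Z0-9_]{4,15}$/D';

var_dump(preg_match($accountPattern, 'alice_01') === 1); // bool(true)
var_dump(preg_match($accountPattern, '1alice') === 1);   // bool(false): starts with a digit
var_dump(preg_match($accountPattern, 'abc') === 1);      // bool(false): too short

// The leading/trailing whitespace pattern, used with preg_replace() as a trim.
$trimmed = preg_replace('/(^\s*)|(\s*$)/', '', "  hello world \t");
var_dump($trimmed); // string(11) "hello world"
```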

Metacharacters and their behavior in regular expressions:
`\` Marks the next character as a special character, a literal character, a backreference, or an octal escape.
`^` Matches the start of the input string. If multiline mode is set, `^` also matches the position after '\n' or '\r'.
`$` Matches the end of the input string. If multiline mode is set, `$` also matches the position before '\n' or '\r'.
`*` Matches the preceding subexpression zero or more times. Equivalent to `{0,}`.
`+` Matches the preceding subexpression one or more times. Equivalent to `{1,}`.
`?` Matches the preceding subexpression zero or one time. Equivalent to `{0,1}`.
`{n}` n is a non-negative integer. Matches exactly n times.
`{n,}` n is a non-negative integer. Matches at least n times.
`{n,m}` m and n are non-negative integers with n <= m. Matches at least n and at most m times. There must be no space between the comma and the two numbers.
`?` When this character immediately follows any other quantifier (`*`, `+`, `?`, `{n}`, `{n,}`, `{n,m}`), the match is non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much as possible.
`.` Matches any single character except "\n". To match any character including '\n', use a pattern such as `[\s\S]`.
`(pattern)` Matches pattern and captures the match.
`(?:pattern)` Matches pattern but does not capture the match; it is a non-capturing group and nothing is stored for later use.
`(?=pattern)` Positive lookahead: matches at any position where a string matching pattern begins. This is a non-capturing match; the matched text is not stored for later use.
`(?!pattern)` Negative lookahead: the opposite of `(?=pattern)`; matches at any position where a string matching pattern does not begin.
`x|y` Matches x or y.
`[xyz]` Character set. Matches any one of the enclosed characters.
`[^xyz]` Negated character set. Matches any character that is not enclosed.
`[a-z]` Character range. Matches any character in the specified range.
`[^a-z]` Negated character range. Matches any character that is not in the specified range.

`\b` Matches a word boundary, that is, the position between a word character and a non-word character such as a space.
`\B` Matches a non-word boundary.
`\cx` Matches the control character specified by x.
`\d` Matches a digit. Equivalent to `[0-9]`.
`\D` Matches a non-digit character. Equivalent to `[^0-9]`.
`\f` Matches a form feed. Equivalent to `\x0c` and `\cL`.
`\n` Matches a newline character. Equivalent to `\x0a` and `\cJ`.
`\r` Matches a carriage return. Equivalent to `\x0d` and `\cM`.
`\s` Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to `[ \f\n\r\t\v]`.
`\S` Matches any non-whitespace character. Equivalent to `[^ \f\n\r\t\v]`.
`\t` Matches a tab character. Equivalent to `\x09` and `\cI`.
`\v` Matches a vertical tab. Equivalent to `\x0b` and `\cK`.
`\w` Matches any word character, including the underscore. Equivalent to `[A-Za-z0-9_]`.
`\W` Matches any non-word character. Equivalent to `[^A-Za-z0-9_]`.
`\xn` Matches n, where n is a hexadecimal escape value. The hexadecimal escape value must be exactly two digits long.
`\num` Matches num, where num is a positive integer. It is a backreference to a captured match.
`\n` Identifies an octal escape value or a backreference. If `\n` is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape value.
`\nm` Identifies an octal escape value or a backreference. If `\nm` is preceded by at least nm captured subexpressions, nm is a backreference. If `\nm` is preceded by at least n captures, n is a backreference followed by the literal m. If neither condition holds and n and m are both octal digits (0-7), `\nm` matches the octal escape value nm.
`\nml` Matches the octal escape value nml when n is an octal digit (0-3) and m and l are octal digits (0-7).
`\un` Matches n, where n is a Unicode character written as four hexadecimal digits. PCRE in PHP does not accept `\u`; write `\x{n}` with the `u` modifier instead.
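
Three of the behaviors described in the list above are easier to see in running code: greedy versus non-greedy quantifiers, lookahead, and backreferences. The snippet below is a small sketch with made-up sample strings, not part of the original post.

```php
<?php
$html = '<b>one</b> and <b>two</b>';

// Greedy: .* takes as much as possible, so the match runs to the last </b>.
preg_match('/<b>.*<\/b>/', $html, $greedy);
// Non-greedy: .*? takes as little as possible, so the match stops at the first </b>.
preg_match('/<b>.*?<\/b>/', $html, $lazy);

echo $greedy[0], "\n"; // <b>one</b> and <b>two</b>
echo $lazy[0], "\n";   // <b>one</b>

// Positive lookahead: match "Windows" only when followed by "95" or "98";
// the version number is not part of the match.
preg_match('/Windows(?=95|98)/', 'Windows98', $ahead);
echo $ahead[0], "\n";  // Windows

// Negative lookahead: "Windows" NOT followed by "95" or "98".
var_dump(preg_match('/Windows(?!95|98)/', 'Windows98')); // int(0)

// Backreference: \1 must repeat whatever group 1 captured (a doubled word here).
preg_match('/\b(\w+) \1\b/', 'it is is fine', $dup);
echo $dup[0], "\n";    // is is
```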

Regular expression matching an HTTP URL: `http://([\w-]+\.)+[\w-]+(/[\w ./?%&=-]*)?`

Using regular expressions to restrict the input content of text box in web form:

Use regular expression to restrict Chinese input only:

```
onkeyup="value=value.replace(/[^\u4E00-\u9FA5]/g,'')"
onbeforepaste="clipboardData.setData('text',clipboardData.getData('text').replace(/[^\u4E00-\u9FA5]/g,''))"
```

Use regular expressions to restrict the input of full width characters only:

```
onkeyup="value=value.replace(/[^\uFF00-\uFFFF]/g,'')"
onbeforepaste="clipboardData.setData('text',clipboardData.getData('text').replace(/[^\uFF00-\uFFFF]/g,''))"
```

Use a regular expression to allow only digits:

``onkeyup="value=value.replace(/[^\d]/g,'')" onbeforepaste="clipboardData.setData('text',clipboardData.getData('text').replace(/[^\d]/g,''))"``

Use a regular expression to allow only digits and English letters (`\W` also lets the underscore through):

```
onkeyup="value=value.replace(/[\W]/g,'')"
onbeforepaste="clipboardData.setData('text',clipboardData.getData('text').replace(/[\W]/g,''))"
```

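The handlers above run only in the browser, so the same filtering is usually repeated on the server. A minimal PHP sketch of the equivalent replacements follows; the sample inputs are made up, and the file is assumed to be saved as UTF-8.

```php
<?php
// Keep only digits (server-side counterpart of /[^\d]/g).
echo preg_replace('/[^\d]/', '', 'tel: 010-1234'), "\n"; // 0101234

// Keep only letters, digits, and underscore (counterpart of /[\W]/g).
echo preg_replace('/[\W]/', '', 'user name!_01'), "\n"; // username_01

// Keep only Chinese characters. PCRE needs \x{...} and the "u" modifier
// instead of JavaScript's \uXXXX.
echo preg_replace('/[^\x{4e00}-\x{9fa5}]/u', '', 'PHP 中文 regex'), "\n"; // 中文
```
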
Regular expressions in common use
Regular expression matching an IP address: `/(\d+)\.(\d+)\.(\d+)\.(\d+)/g`

SQL statement: `^(select|drop|delete|create|update|insert).*$`
1. Non-negative integer: `^\d+$`
2. Positive integer: `^[0-9]*[1-9][0-9]*$`
3. Non-positive integer: `^((-\d+)|(0+))$`
4. Negative integer: `^-[0-9]*[1-9][0-9]*$`
5. Integer: `^-?\d+$`
6. Non-negative floating point number: `^\d+(\.\d+)?$`
7. Positive floating point number: `^(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*))$`
8. Non-positive floating point number: `^((-\d+(\.\d+)?)|(0+(\.0+)?))$`
9. Negative floating point number: `^(-(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*)))$` (a minus sign followed by the positive floating point pattern)
10. English string: `^[A-Za-z]+$`
11. English uppercase string: `^[A-Z]+$`
12. English lowercase string: `^[a-z]+$`
13. String of English letters and digits: `^[A-Za-z0-9]+$`
14. String of English letters, digits, and underscores: `^\w+$`
15. Email address: `^[\w-]+(\.[\w-]+)*@[\w-]+(\.[\w-]+)+$`
16. URL: `^[a-zA-Z]+://(\w+(-\w+)*)(\.(\w+(-\w+)*))*(\?\S*)?$`
Or:

``^http:\/\/[A-Za-z0-9]+\.[A-Za-z0-9]+[\/=\?%\-&_~`@[\]\':+!]*([^<>\"\"])*$``

17. Postcode: `^[1-9]\d{5}$`
18. Chinese: `^[\u0391-\uFFE5]+$`
19. Telephone number: `^((\(\d{2,3}\))|(\d{3}\-))?(\(0\d{2,3}\)|0\d{2,3}-)?[1-9]\d{6,7}(\-\d{1,4})?$`
20. Mobile phone number: `^((\(\d{2,3}\))|(\d{3}\-))?13\d{9}$`
21. Double-byte characters (including Chinese characters): `[^\x00-\xff]`
22. Match leading and trailing spaces: `(^\s*)|(\s*$)` (like the trim function in VBScript)
23. Match HTML tags: `<(.*)>.*<\/\1>|<(.*) \/>`
24. Match blank lines: `\n[\s| ]*\r`
25. Extract hyperlinks from text: `(h|H)(r|R)(e|E)(f|F) *= *('|")?(\w|\\|\/|\.)+('|"| *|>)?`
26. Extract email addresses from text: `\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*`
27. Extract image links from text: `(s|S)(r|R)(c|C) *= *('|")?(\w|\\|\/|\.)+('|"| *|>)?`
28. Extract IP addresses from text: `(\d+)\.(\d+)\.(\d+)\.(\d+)`
29. Extract Chinese mobile phone numbers from text: `(86)*0*13\d{9}`
30. Extract Chinese fixed-line telephone numbers from text: `(\(\d{3,4}\)|\d{3,4}-|\s)?\d{8}`
31. Extract Chinese phone numbers (mobile and fixed-line) from text: `(\(\d{3,4}\)|\d{3,4}-|\s)?\d{7,14}`
32. Extract Chinese postal codes from text: `[1-9]{1}(\d+){5}`
33. Extract floating point numbers (decimals) from text: `(-?\d*)\.?\d+`
34. Extract any number from text: `(-?\d*)(\.\d+)?`
35. IP: `(\d+)\.(\d+)\.(\d+)\.(\d+)`
36. Area code: `/^0\d{2,3}$/`
37. Tencent QQ number: `^[1-9]*[1-9][0-9]*$`
38. Account name (starts with a letter, 5-16 characters, letters, digits, and underscores allowed): `^[a-zA-Z][a-zA-Z0-9_]{4,15}$`
39. Chinese, English letters, digits, and underscores: `^[\u4e00-\u9fa5_a-zA-Z0-9]+$`
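
To show how the extraction patterns in this list behave in PHP, here is a short sketch using items 17, 28, and 33. The sample text is invented for the illustration; `preg_match_all()` collects every full match in `$matches[0]`.

```php
<?php
$text = 'Servers: 192.168.1.10 and 10.0.0.7, postcode 100083.';

// Item 28: pull every dotted quad out of the text.
preg_match_all('/(\d+)\.(\d+)\.(\d+)\.(\d+)/', $text, $ips);
print_r($ips[0]); // full matches: 192.168.1.10 and 10.0.0.7

// Item 17: validate a six-digit postcode that does not start with 0.
var_dump(preg_match('/^[1-9]\d{5}$/D', '100083') === 1); // bool(true)
var_dump(preg_match('/^[1-9]\d{5}$/D', '012345') === 1); // bool(false)

// Item 33: extract decimal numbers.
preg_match_all('/(-?\d*)\.?\d+/', 'a=-3.5, b=42', $nums);
print_r($nums[0]); // full matches: -3.5 and 42
```

Note that item 28 only checks the shape of an address; it does not limit each part to 0-255.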

40. Chinese characters, English letters, digits, underscores, and hyphens: the extraction differs between UTF-8 and GB2312, as in the following example:

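```
function getChinaEnglishNumStrlen($str, $charset = 'utf8')
{
    if ($charset == 'gb2312') {
        // GB2312 bytes 0xA1-0xFF plus ASCII letters, digits, "_" and "-"
        if (!preg_match_all('/^[' . chr(0xa1) . '-' . chr(0xff) . 'A-Za-z0-9_\-]+/', $str, $match)) {
            return false;
        }
        // $match is a nested array; $match[0] holds the full matches
        return implode('', $match[0]);
    }

    if ($charset == 'utf8') {
        // The "u" modifier is required for \x{...} code points
        if (!preg_match_all('/[\x{4e00}-\x{9fa5}A-Za-z0-9_\-]+/u', $str, $match)) {
            return false;
        }
        return implode('', $match[0]);
    }

    return false;
}
```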

The function above returns a single string made up of the extracted Chinese characters, letters, digits, `_`, and `-`.

41. Filter out special characters and keep only Chinese characters, English letters, digits, underscores, and hyphens
Note: the following function keeps only Chinese characters, English letters, digits, underscores, and hyphens, and filters out every other symbol. If the input string is already UTF-8, no transcoding is needed and the mb_convert_encoding calls inside can be commented out.

```
/**
 * Filter special characters (keep only Chinese characters, English letters,
 * digits, underscores, and hyphens).
 * @desc  Mainly used to strip special symbols inserted into sensitive words in advertisements
 * @param string $str String to process (GBK encoded)
 * @return string
 */
function filter_special_characters($str)
{
    if (empty($str)) return "";

    // Convert GBK to UTF-8
    $str = mb_convert_encoding($str, "utf-8", "gbk");

    // Filtered string
    $new_str = "";

    // Regular expression matching
    if (preg_match_all('/[\x{4e00}-\x{9fa5}A-Za-z0-9_\-]+/u', $str, $match)) {
        // $match[0] holds the full matches; join them in order
        foreach ($match[0] as $val) {
            $new_str .= $val;
        }

        // Convert back to GBK for output
        $new_str = mb_convert_encoding($new_str, "gbk", "utf-8");
    }

    return $new_str;
}

$str = "a dream in the world of mortals + Q [1 ⒐ 6.2.4] [reputation first]";
$new_str = filter_special_characters($str);
print_r($new_str);

// Printout reported by the original post:
// A dream in the world of mortals
```

42. Using preg_match with regular expressions
preg_match() stops searching after the first successful match. To find all matches, use the preg_match_all() function.

``preg_match(pattern, subject, matches)``
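
The difference between the two functions can be sketched as follows. The subject string is made up; the point is that `preg_match()` returns after the first hit, while `preg_match_all()` collects every full match in `$all[0]`.

```php
<?php
$subject = 'cat bat rat';

// preg_match() stops at the first match; $first[0] is the matched text.
$count = preg_match('/\w+at/', $subject, $first);
echo $count, ' ', $first[0], "\n"; // 1 cat

// preg_match_all() keeps going; $all[0] lists every full match.
$total = preg_match_all('/\w+at/', $subject, $all);
echo $total, ' ', implode(',', $all[0]), "\n"; // 3 cat,bat,rat
```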

Example 1 – find letters:

```
<?php
// The "i" after the closing delimiter makes the search case-insensitive
if (preg_match("/hi/i", "Welcome to hi-docs.com.")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
?>

Output:
A match was found.
```

Example 2 – matching URL hyperlinks in strings

```
<?php
$urls = '<h3><a target="_blank" href="/php/preg_match.html"><span class="hl">preg</span>_match()</a></h3><p>[<a href="/php.html">PHP</a>] for regular expression matching<br /><em>applicable version: 5</em></p></dd><dd><h3><a target="_blank" href="/php/preg_match_all.html"><span class="hl">preg</span>_match_all()</a></h3>';
if (preg_match("/<a[^>]*?href=\"([^>]+?)\"[^>]*?>.+?<\/a>/i", $urls, $match)) {
    print_r($match);
} else {
    echo "does not match.";
}
?>

Output:
Array
(
    [0] => <a target="_blank" href="/php/preg_match.html"><span class="hl">preg</span>_match()</a>
    [1] => /php/preg_match.html
)
```

Example 3 – using regular expressions to match Chinese

```
// Sample text; it must contain Chinese characters for the first test to match
// (中文 means "Chinese")
$str = 'preg_match regular matching 中文 123';
// Regular expression matching Chinese (UTF-8 encoding)
if (preg_match('/[\x{4e00}-\x{9fa5}]+/u', $str)) {
    echo 'match';
} else {
    echo 'no match';
}
// GB2312 / GBK encoding
preg_match('/^[' . chr(0xa1) . '-' . chr(0xff) . 'A-Za-z0-9_]+$/', $str);
```

Match the relevant data according to the article number:

```
define('runcode', 1);

// The sample text is shown in translation; the pattern below expects the
// attribute labels (color category, size, ...) to be Chinese characters
$sku = "color category: A72287962 brown; size: XXL; lovers' style: Men's style";

// Regular expression matching for Chinese labels
$pattern = '/[\x{4e00}-\x{9fa5}]+[:|;|；|\s]([A-Za-z0-9_-]+)\s*(.*)?[:|;|；|\s]+[\x{4e00}-\x{9fa5}]+[:|;|；|\s]([A-Za-z0-9]+)/u';
if (preg_match($pattern, $sku, $matches)) {
    print_r($matches);
}
```

Print results:

```
Array
(
    [0] => Color Classification: A72287962 brown; size: XXL
    [1] => A72287962
    [2] => Brown
    [3] => XXL
)
```
