A regular expression is a text pattern made up of ordinary characters (for example, the letters a through z) and special characters (called “metacharacters”).
A regular expression uses a single string to describe and match a set of strings that follow a given syntactic rule.
In other words, it is a special character sequence that lets you easily check whether a string matches a pattern.
Regular expressions can be cumbersome to write, but they are powerful. Once learned, they will improve your efficiency and give you a real sense of achievement.
In Python, the re module provides full regular expression support.
re.match attempts to match a pattern at the start of the string. If the match succeeds, match() returns a match object; otherwise it returns None.
re.match(pattern, string, flags=0)
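>>> import re
>>> print(re.match(r'abc', '123abc'))
None
>>> re.match(r'abc', 'abc')    # returns a match object
>>> re.match(r'a.c', 'abc')    # "." matches any character, so this matches
>>> re.match(r'a\.c', 'a.c')   # "\." matches a literal dot
>>> re.match(r'a\*c', 'a*c')   # "\*" matches a literal asterisk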
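As a minimal sketch of the behavior described above: because re.match returns None on failure, the result is usually tested before it is used. The helper name starts_with_pattern is hypothetical, chosen only for this illustration.

```python
import re

# The pattern occurs only at index 3, so match() (which starts at index 0) fails.
no_match = re.match(r'abc', '123abc')
print(no_match)             # None

# The pattern occurs at index 0, so a match object is returned.
hit = re.match(r'a.c', 'abc')
print(hit.group())          # abc


def starts_with_pattern(pattern, text):
    """Hypothetical helper: True if text begins with a match for pattern."""
    return re.match(pattern, text) is not None


print(starts_with_pattern(r'a\*c', 'a*c'))    # True
print(starts_with_pattern(r'abc', '123abc'))  # False
```

Calling a method such as group() directly on a failed match raises AttributeError, which is why the check against None comes first.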
re.search method
re.search scans the entire string and returns the first successful match.
re.search(pattern, string, flags=0)
>>> re.search(r'abc', 'abc')    # returns a match object
>>> re.search(r'a.c', 'abc')
>>> re.search(r'a\.c', 'a.c')
>>> re.search(r'a\*c', 'a*c')
The difference between re.match and re.search
re.match only matches at the beginning of the string. If the beginning of the string does not conform to the regular expression, the match fails and the function returns None. re.search scans the whole string, position by position, until it finds a match; it returns None only if no position matches.
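The difference can be seen by running both functions on the same input. This is an illustrative sketch; the sample string is the one used in the earlier re.match example.

```python
import re

text = '123abc'

# match() only tries index 0, where the text is '1', so it fails.
at_start = re.match(r'abc', text)
print(at_start)             # None

# search() tries every position and succeeds at index 3.
anywhere = re.search(r'abc', text)
print(anywhere.span())      # (3, 6)

# Anchoring the search with ^ (without re.M) restricts it to the start again.
anchored = re.search(r'^abc', text)
print(anchored)             # None
```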
The compile function compiles a regular expression and produces a regular expression (pattern) object, whose match() and search() methods can then be used.
The syntax is:
re.compile(pattern, flags=0)
pattern: a regular expression in string form
flags: optional; specifies the matching mode, such as ignoring case or multiline mode. The available flags are:
|re.I|Makes matching case-insensitive|
|re.L|Locale-aware matching|
|re.M|Multiline matching; affects ^ and $|
|re.S|Makes . match any character, including line breaks|
|re.U|Interprets characters according to the Unicode character set. This flag affects \w, \W, \b, \B|
|re.X|Allows a more flexible layout (whitespace and comments inside the pattern) so that regular expressions are easier to read|
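Two of the flags in the table are easier to grasp from a small example. The sketch below shows re.S and re.X; the date pattern and sample strings are made up for illustration.

```python
import re

# re.S: "." normally does not match a newline; with re.S it does.
without_s = re.search(r'a.b', 'a\nb')
print(without_s)                    # None
with_s = re.search(r'a.b', 'a\nb', re.S)
print(repr(with_s.group()))         # 'a\nb'

# re.X: whitespace in the pattern is ignored and "#" starts a comment,
# so a pattern can be laid out over several annotated lines.
date = re.compile(r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    -
    (\d{2})   # day
""", re.X)
parts = date.match('2024-01-31').groups()
print(parts)                        # ('2024', '01', '31')
```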
Regular expression functions accept optional flags that control how matching is performed. Multiple flags can be combined with the bitwise OR operator (|). For example, re.I | re.M sets both the I and M flags.
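A short sketch of combining flags with bitwise OR follows. The sample text and pattern are invented for the example; the point is only that each flag contributes its own effect.

```python
import re

text = 'First line\nsecond LINE\nthird line'
pattern = r'^\w+ line$'

# re.I ignores case; re.M lets ^ and $ match at every line boundary.
both = re.findall(pattern, text, re.I | re.M)
print(both)      # ['First line', 'second LINE', 'third line']

# With re.M alone, the upper-case "LINE" no longer matches.
only_m = re.findall(pattern, text, re.M)
print(only_m)    # ['First line', 'third line']

# With re.I alone, ^ and $ only match at the ends of the whole string.
only_i = re.findall(pattern, text, re.I)
print(only_i)    # []
```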
>>> # Find all numbers
>>> regex = re.compile(r'\d+')
>>> regex.findall('12abc56py7hello')
['12', '56', '7']
>>>
>>> # Find words containing "wh", case-insensitive
>>> re.findall(r'\w*wh\w*', 'Who are you, what your name, when you go home', re.I)
['Who', 'what', 'when']
>>> # Equivalent to the above
>>> regex = re.compile(r'\w*wh\w*', re.I)
>>> regex.findall('Who are you, what your name, when you go home')
['Who', 'what', 'when']
Regular expression object
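re.compile() returns a regular expression (pattern) object.
A successful match() or search() call returns a match object, which provides the following methods:
group() returns the string matched by the regular expression.
start() returns the position where the match starts.
end() returns the position where the match ends.
span() returns a tuple (start, end) containing the positions of the match.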
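The four methods listed above can be tried on a single match. This is a small sketch; the sample string is arbitrary.

```python
import re

regex = re.compile(r'\d+')          # compiled pattern object
m = regex.search('abc123def')       # match object for the first run of digits

print(m.group())    # 123
print(m.start())    # 3
print(m.end())      # 6
print(m.span())     # (3, 6)

# start() and end() are ordinary string indices, so slicing gives the same text.
print('abc123def'[m.start():m.end()])   # 123
```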
Regular expression pattern
The pattern string uses a special syntax to represent a regular expression:
Letters and digits stand for themselves: a letter or digit in a pattern matches the same character in the string.
Most letters and digits take on a different meaning when preceded by a backslash.
Punctuation marks that act as metacharacters match themselves only when escaped; otherwise they carry a special meaning.
The backslash itself requires a backslash escape.
Since regular expressions usually contain backslashes, it is best to write them as raw strings. Pattern elements (such as r'\t', equivalent to '\\t') match the corresponding special characters.
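The escaping rules above are easy to get wrong, so here is a brief sketch of how raw strings and backslashes interact. The sample strings are invented, and re.escape is a standard-library helper that the text above does not mention; it is included only as one possible way to escape metacharacters automatically.

```python
import re

# A raw string keeps the backslash: r'\t' is two characters, '\t' is one (a tab).
print(r'\t' == '\\t')            # True
print(len(r'\t'), len('\t'))     # 2 1

# The regex engine itself understands \t, so the raw pattern still matches a tab.
tab = re.search(r'\t', 'a\tb')
print(tab.span())                # (1, 2)

# Matching one literal backslash: two in a raw pattern, four in a normal string.
path = 'C:\\temp'                # the text C:\temp
raw_pos = re.search(r'\\', path).start()
plain_pos = re.search('\\\\', path).start()
print(raw_pos, plain_pos)        # 2 2

# re.escape() backslash-escapes metacharacters so the text matches literally.
literal = re.escape('a*c')
star = re.match(literal, 'a*c')
print(star.group())              # a*c
```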
The following table lists the special elements in the regular expression pattern syntax. If you use a pattern and provide optional flag parameters, the meaning of some pattern elements will change.