Python learning note 9: regular expressions

Time:2021-11-19

Regular expression is a text pattern that includes ordinary characters (for example, letters between a and z) and special characters (called “metacharacters”).
Regular expressions use a single string to describe and match a series of strings that match a syntactic rule.
Regular expression is a special character sequence, which can help you easily check whether a string matches a pattern.
Regular expression is cumbersome, but it is powerful. The application after learning will not only improve your efficiency, but also bring you an absolute sense of achievement.

In ptrhon, the re module enables the python language to have all the regular expression functions.

Re.match function
Re.match attempts to match a pattern from the starting position of the string. If the matching is not successful, match() returns none and a matchobject.
Function syntax:

re.match(pattern, string, flags=0)

Example code:

>>> print(re.match('abc', '123abc'))
None
>>> re.match('abc', 'abc')


>>> re.match('a.c', 'abc')


>>> re.match('a\.c', 'a.c')

>>> re.match('a\*c', 'a*c')

Re. Search method
Re.search scans the entire string and returns the first successful match.
Function syntax:

re.search(pattern, string, flags=0)

Example code:

>>> re.search('abc', 'abc')


>>> re.search('a.c', 'abc')


>>> re.search('a\.c', 'a.c')

>>> re.search('a\*c', 'a*c')

The difference between re.match and re.search
Re.match only matches the beginning of the string. If the beginning of the string does not conform to the regular expression, the matching fails and the function returns none; Re.search matches the entire string until a match is found.

Re.compile function
The compile function is used to compile regular expressions and generate a regular expression (pattern) object for use by the match () and search () functions.
The syntax format is:

re.compile(pattern[, flags])

Parameters:
Pattern: a regular expression in the form of a string
Flags: optional, indicating the matching mode, such as ignoring case, multi line mode, etc. the specific parameters are:

Modifier describe
re.I Make matching pairs case insensitive
re.L Local aware matching
re.M Multiline matching, affecting ^ and$
re.S Match. To all characters, including line breaks
re.U Parses characters according to the Unicode character set. This flag affects \ W, \ W, \ B, \ B
re.X This flag gives you a more flexible format so that you can write regular expressions easier to understand.

Regular expressions can contain optional flag modifiers to control matching patterns. The modifier is specified as an optional flag. Multiple flags can be specified by bitwise OR (|) them. For example, re. I | re. M is set to I and M flags.

>>>#Find all numbers
>>> regex=re.compile(r'\d+')
>>> regex.findall('12abc56py7hello')
['12', '56', '7']
>>> 
>>>##Find words with wh, case insensitive
>>> re.findall(r'\w*wh\w*','Who are you, what your name, when you go home', re.I)
['Who', 'what', 'when']
>>># equivalent to above
>>> regex=re.compile(r'\w*wh\w*', re.I)
>>> regex.findall('Who are you, what your name, when you go home')
['Who', 'what', 'when']
>>>

Regular expression object
re.RegexObject
Re. Compile() returns the regexobject object.

re.MatchObject
Group() returns the string matched by re.
Start() returns the location where the match started
End() returns the position where the match ends
Span() returns a tuple containing the position of the match (start, end)

Regular expression pattern
The pattern string uses a special syntax to represent a regular expression:
Letters and numbers represent themselves. Letters and numbers in a regular expression pattern match the same string.
Most letters and numbers have different meanings when preceded by a backslash.
Punctuation marks match themselves only when they are escaped, otherwise they represent a special meaning.
The backslash itself requires a backslash escape.
Since regular expressions usually contain backslashes, you’d better use the original string to represent them. Pattern elements (such as R ‘\ t’, equivalent to ‘\ t’) match the corresponding special characters.
The following table lists the special elements in the regular expression pattern syntax. If you use a pattern and provide optional flag parameters, the meaning of some pattern elements will change.

Recommended Today

Why didn’t the looper’s polling loop in the main thread block the main thread?

Main thread activitythread [source code] public static void main(String[] args) { Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, “ActivityThreadMain”); // CloseGuard defaults to true and can be quite spammy. We // disable it here, but selectively enable it later (via // StrictMode) on debug builds, but using DropBox, not logs. CloseGuard.setEnabled(false); Environment.initForCurrentUser(); // Set the reporter for event logging in libcore […]