# Glacier shares an important programming skill he has summed up!

Time: 2021-4-16

## Preface

Fluency with regular expressions helps programmers write concise, elegant code quickly. Glacier (binghe) has collected and organized the regular expressions he has used over many years of programming work. They can save you a lot of coding time: a single short regular expression can often replace a long chain of `if...else...` code. Here Glacier shares his collection with his friends, hoping it will be of real help.

The article has been included in:

https://github.com/sunshinelyz/technology-binghe

https://gitee.com/binghe001/technology-binghe

## Regular expressions Glacier commonly uses

Integer or decimal

``^[0-9]+\.{0,1}[0-9]{0,2}$``

Digits only

``^[0-9]*$``

Exactly n digits

``^\d{n}$``

At least n digits

``^\d{n,}$``

Between m and n digits

``^\d{m,n}$``

Zero, or a number that starts with a non-zero digit

``^(0|[1-9][0-9]*)$``

Positive real numbers with two decimal places (the fractional part is optional, and the decimal point is escaped as `\.` because a bare `.` matches any character)

``^[0-9]+(\.[0-9]{2})?$``

Positive real numbers with 1 to 3 decimal places

``^[0-9]+(\.[0-9]{1,3})?$``

Non-zero positive integers

``^\+?[1-9][0-9]*$``

Non-zero negative integers

``^-[1-9][0-9]*$``
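As a quick way to try the numeric patterns above, here is a minimal Python sketch. The dictionary keys and the `is_valid` helper are hypothetical names chosen for illustration, `n` is arbitrarily set to 4, and the patterns are written with a plain `$` end anchor and an escaped decimal point.

```python
import re

# Hypothetical labels for a few of the patterns listed above (n = 4 is arbitrary).
PATTERNS = {
    "digits_only": r"^[0-9]*$",
    "exactly_4_digits": r"^\d{4}$",
    "two_decimals": r"^[0-9]+(\.[0-9]{2})?$",
    "nonzero_positive_int": r"^\+?[1-9][0-9]*$",
    "nonzero_negative_int": r"^-[1-9][0-9]*$",
}


def is_valid(kind: str, text: str) -> bool:
    """Return True when the whole string matches the named pattern."""
    # fullmatch is used because "$" in Python also matches just before a
    # trailing newline, which is rarely wanted in input validation.
    return re.fullmatch(PATTERNS[kind], text) is not None
```

Note that `^[0-9]*$` also accepts the empty string, because `*` allows zero repetitions.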

Exactly 3 characters

``^.{3}$``

Strings made up of the 26 English letters

``^[A-Za-z]+$``

Strings made up of the 26 uppercase English letters

``^[A-Z]+$``

Strings made up of the 26 lowercase English letters

``^[a-z]+$``

Strings made up of digits and the 26 English letters

``^[A-Za-z0-9]+$``

Strings made up of digits, the 26 English letters, or underscores

``^\w+$``

``^[a-zA-Z]\w{5,17}$``

Note: the accepted format starts with a letter, is 6 to 18 characters long, and contains only letters, digits, and underscores.
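The sketch below tries this pattern in Python. The names `ACCOUNT_RE` and `is_valid_account` are hypothetical, and the `re.ASCII` flag is an addition on my part: in Python 3, `\w` on text strings also matches non-ASCII letters and digits by default, which would not agree with "letters, digits, and underscores" in the narrow sense.

```python
import re

# Hypothetical name. re.ASCII restricts \w to [A-Za-z0-9_].
ACCOUNT_RE = re.compile(r"^[a-zA-Z]\w{5,17}$", re.ASCII)


def is_valid_account(text: str) -> bool:
    """A letter first, then 5 to 17 word characters (6 to 18 in total)."""
    return ACCOUNT_RE.fullmatch(text) is not None
```

Without `re.ASCII`, a string such as a Latin letter followed by five Chinese characters would also pass.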

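Check whether the input contains any of the characters ``%&',;=?$"`` (the character class below is negated, so it matches a run of characters that contains none of them; `\x22` is the double quote)

``[^%&',;=?$\x22]+``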

Chinese characters only

``^[\u4e00-\u9fa5]{0,}$``

Verify email address

``^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$``

Verify Internet URL

``^(http|https)://([\w-]+\.)+[\w-]+(/[\w./?%&=-]*)?$``
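A note on the scheme part: square brackets define a character class, so `[http|https]` matches exactly one of the characters `h`, `t`, `p`, `s`, or `|`, whereas `(http|https)` matches either whole word. The Python sketch below illustrates the difference. `URL_RE` and `is_http_url` are hypothetical names, and the hyphen in the path class is placed last because Python's `re` rejects a range that starts with `\w`.

```python
import re

# Character class: one single character out of h, t, p, s or "|".
CLASS_SCHEME_RE = re.compile(r"^[http|https]://")

# Group with alternation: the literal word "http" or "https".
URL_RE = re.compile(r"^(http|https)://([\w-]+\.)+[\w-]+(/[\w./?%&=-]*)?$")


def is_http_url(text: str) -> bool:
    """Return True when the whole string looks like an http(s) URL."""
    return URL_RE.fullmatch(text) is not None
```

For example, `CLASS_SCHEME_RE` accepts the prefix `p://` but rejects `https://`, which is the opposite of what the label asks for.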

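Verify phone number

``^(\(\d{3,4}\)|\d{3,4}-)?\d{7,8}$``

The accepted formats are a 3- or 4-digit area code, either in parentheses or followed by a hyphen, and then a 7- or 8-digit number (for example XXX-XXXXXXX or XXXX-XXXXXXXX), or the 7- or 8-digit number on its own (XXXXXXX or XXXXXXXX).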

Verify ID number (15 or 18 digits)

``^(\d{15}|\d{18})$``
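Alternation has the lowest precedence in a regular expression, so in `^\d{15}|\d{18}$` the `^` belongs only to the 15-digit branch and the `$` only to the 18-digit branch. The following Python sketch compares that form with a grouped one. `LOOSE`, `STRICT`, and `accepts` are hypothetical names used only for illustration.

```python
import re

LOOSE = r"^\d{15}|\d{18}$"     # "^" and "$" each bind to one branch only
STRICT = r"^(\d{15}|\d{18})$"  # both anchors apply to both branches


def accepts(pattern: str, text: str) -> bool:
    """Return True when the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None
```

With the ungrouped form, a 16-digit string passes because its first 15 digits satisfy the first branch, and any text that merely ends in 18 digits passes through the second branch.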

Verify the 12 months of a year

``^(0?[1-9]|1[0-2])$``

The accepted formats are 01-09 and 1-12.

Verify the 31 days of a month

``^((0?[1-9])|((1|2)[0-9])|30|31)$``

The accepted formats are 01-09 and 1-31.
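The month and day patterns can be checked exhaustively, since there are only a few valid values. Below is a small Python sketch under that idea. `MONTH_RE`, `DAY_RE`, `is_month`, and `is_day` are hypothetical names, and the patterns are written with a plain `$` end anchor.

```python
import re

MONTH_RE = re.compile(r"^(0?[1-9]|1[0-2])$")
DAY_RE = re.compile(r"^((0?[1-9])|((1|2)[0-9])|30|31)$")


def is_month(text: str) -> bool:
    """Accepts 1-12, with or without a leading zero for 1-9."""
    return MONTH_RE.fullmatch(text) is not None


def is_day(text: str) -> bool:
    """Accepts 1-31, with or without a leading zero for 1-9."""
    return DAY_RE.fullmatch(text) is not None
```

These patterns check the month and the day independently, so they cannot reject a combination such as the 31st of a 30-day month.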

Regular expressions matching Chinese characters

``[\u4e00-\u9fa5]``

Matching double byte characters (including Chinese characters)

``[^\x00-\xff] ``

Regular expressions that match empty lines

``\n[\s| ]*\r``

Regular expressions matching HTML tags

``<(.*)>(.*)<\/(.*)>|<(.*)\/>``

Regular expressions that match leading and trailing spaces

``(^\s*)|(\s*$)``

Regular expression that matches an email address

``\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*``

Regular expressions that match HTML tags

``<(\S*?)[^>]*>.*?|<.*? />``

Comment: the versions circulating on the Internet are poor. The one above matches only some tags and still cannot handle complex nested tags.

Regular expressions that match leading and trailing whitespace characters

``^\s*|\s*$``

Comment: a very useful expression for deleting whitespace characters (spaces, tabs, form feeds, and so on) at the beginning and end of a line.
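As an illustration of the trimming use case, here is a minimal Python sketch that removes leading and trailing whitespace with this pattern. `TRIM_RE` and `trim` are hypothetical names. In everyday Python code `str.strip()` does the same job, so this is only meant to show the regular expression at work.

```python
import re

TRIM_RE = re.compile(r"^\s*|\s*$")


def trim(text: str) -> str:
    """Delete whitespace at the start and at the end of the string."""
    return TRIM_RE.sub("", text)
```

Whitespace inside the string is left alone, because neither branch can match there: the first needs the start of the string and the second needs the end.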

Regular expression that matches an email address

``\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*``

Comment: very useful for form validation.

A regular expression that matches a URL

``[a-zA-Z]+://[^\s]*``

Comment: the versions circulating on the Internet are very limited in function; the one above basically meets the need.

Check whether an account name is legal (starts with a letter, 5 to 16 characters long, letters, digits, and underscores allowed)

``^[a-zA-Z][a-zA-Z0-9_]{4,15}$``

Comment: very useful for form validation.

Match domestic phone number

``\d{3}-\d{8}|\d{4}-\d{7}``

Comment: matches forms such as 0511-4405222 or 021-87888822.

Match Tencent QQ number

``[1-9][0-9]{4,}``

Comment: Tencent QQ numbers start from 10000.

Postcode of China

``[1-9]\d{5}(?!\d)``

Comment: the postcode of China is 6 digits

Match ID card number

``\d{15}|\d{18}``

Comment: Chinese ID card numbers have 15 or 18 digits.

Match IP address

``\d+\.\d+\.\d+\.\d+``

Comment: useful for extracting IP addresses.
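For extraction, the pattern can be combined with a numeric range check, since `\d+` alone also accepts values such as 999. The Python sketch below shows one way to do it. `IP_LIKE_RE` and `extract_ips` are hypothetical names, and the check that every part is at most 255 is my addition, not part of the original expression.

```python
import re

IP_LIKE_RE = re.compile(r"\d+\.\d+\.\d+\.\d+")


def extract_ips(text: str) -> list:
    """Find dotted-quad candidates, then keep those with every part <= 255."""
    candidates = IP_LIKE_RE.findall(text)
    return [
        candidate
        for candidate in candidates
        if all(int(part) <= 255 for part in candidate.split("."))
    ]
```

Usage: `extract_ips("from 192.168.1.10 to 10.0.0.1, bad 999.1.1.1")` keeps the first two addresses and drops the third.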

Match specific numbers

```
^[1-9]\d*$                                      // matches positive integers
^-[1-9]\d*$                                     // matches negative integers
^-?[1-9]\d*$                                    // matches integers (excluding 0)
^([1-9]\d*|0)$                                  // matches non-negative integers (positive integers + 0)
^(-[1-9]\d*|0)$                                 // matches non-positive integers (negative integers + 0)
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$                // matches positive floating-point numbers
^-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$               // matches negative floating-point numbers
^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$     // matches floating-point numbers
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$       // matches non-negative floating-point numbers (positive floating-point numbers + 0)
^(-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)|0?\.0+|0)$    // matches non-positive floating-point numbers (negative floating-point numbers + 0)
```

Comment: useful when processing large amounts of data; adjust the expressions as needed for the specific application.

Match specific strings

```
^[A-Za-z]+$       // matches a string of the 26 English letters
^[A-Z]+$          // matches a string of the 26 uppercase English letters
^[a-z]+$          // matches a string of the 26 lowercase English letters
^[A-Za-z0-9]+$    // matches a string of digits and the 26 English letters
^\w+$             // matches a string of digits, the 26 English letters, or underscores
```

Comment: some of the most basic and most commonly used expressions.

## Date validation examples

Simple date check (yyyy/mm/dd)

``^\d{4}(\-|\/|\.)\d{1,2}\1\d{1,2}$``
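The sketch below tries a simple date check of this kind in Python, assuming the separator is repeated through the backreference `\1` so that both separators must be the same character. `SIMPLE_DATE_RE` and `looks_like_date` are hypothetical names.

```python
import re

# \1 repeats whatever separator the first group captured ("-", "/" or ".").
SIMPLE_DATE_RE = re.compile(r"^\d{4}(-|/|\.)\d{1,2}\1\d{1,2}$")


def looks_like_date(text: str) -> bool:
    """Check only the shape of the string, not whether the date exists."""
    return SIMPLE_DATE_RE.fullmatch(text) is not None
```

This is a shape check only: a string such as `2021.13.45` passes, because the month and day are not range-checked.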

A more advanced date check (yyyy/mm/dd or yy/mm/dd)

``^(^(\d{4}|\d{2})(\-|\/|\.)\d{1,2}\3\d{1,2}$)|(^\d{4}年\d{1,2}月\d{1,2}日$)$``

Example:

``^((((1[6-9]|[2-9]\d)\d{2})-(0?[13578]|1[02])-(0?[1-9]|[12]\d|3[01]))|(((1[6-9]|[2-9]\d)\d{2})-(0?[13456789]|1[012])-(0?[1-9]|[12]\d|30))|(((1[6-9]|[2-9]\d)\d{2})-0?2-(0?[1-9]|1\d|2[0-8]))|(((1[6-9]|[2-9]\d)(0[48]|[2468][048]|[13579][26])|((16|[2468][048]|[3579][26])00))-0?2-29))$``

Analysis:

What counts as a legal date range? Different application scenarios give different answers. Here we adopt the convention in MSDN:

The `DateTime` value type represents dates and times with values ranging from 12:00:00 midnight, January 1, 0001 to 11:59:59 P.M., December 31, 9999.

An explanation of leap years.

The leap year rule of the Gregorian calendar works as follows. One revolution of the earth around the sun is called a tropical year, which lasts 365 days, 5 hours, 48 minutes and 46 seconds (about 365.2422 days). The Gregorian calendar therefore distinguishes ordinary years and leap years. An ordinary year has 365 days, which is 0.2422 days shorter than the tropical year, so four ordinary years fall 0.9688 days short. To compensate, one day is added every four years; that year has 366 days and is a leap year. However, adding one day every four years overshoots four tropical years by 0.0312 days, which accumulates to 3.12 days over 400 years. So three leap years are dropped every 400 years, leaving only 97 leap years in 400 years, and the average length of a Gregorian calendar year stays close to the tropical year. Hence the rule: a century year is a leap year only if it is a multiple of 400; for example, 1900 and 2100 are not leap years.
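The arithmetic above can be checked with a few lines of Python. This is a minimal sketch: `is_leap`, `leap_count`, and `average_year` are names introduced here for illustration, and the rule is compared against the standard library's `calendar.isleap`.

```python
import calendar


def is_leap(year: int) -> bool:
    """Divisible by 4, except century years that are not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Number of leap years in one 400-year cycle (years 1 to 400).
leap_count = sum(is_leap(year) for year in range(1, 401))

# Average calendar year length over the cycle, in days.
average_year = (400 * 365 + leap_count) / 400

# The hand-written rule agrees with the standard library for years 1 to 9999.
agrees_with_stdlib = all(
    is_leap(year) == calendar.isleap(year) for year in range(1, 10000)
)
```

The cycle contains 97 leap years, which gives an average year of 365.2425 days, close to the tropical year of about 365.2422 days.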

First, verify the year. The year range is 0001 to 9999, and the regular expression matching yyyy is:

``[0-9]{3}[1-9]|[0-9]{2}[1-9][0-9]{1}|[0-9]{1}[1-9][0-9]{2}|[1-9][0-9]{3}``

Here `[0-9]` can also be written as `\d`, but that is less intuitive than `[0-9]`, so I will always use `[0-9]`.
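The year expression can be verified exhaustively over all four-digit strings. The Python sketch below does that. `YEAR_RE` and `is_year` are hypothetical names, and the shorter lookahead form `(?!0000)\d{4}` is shown only as an equivalent alternative of my own, not as the author's expression.

```python
import re

YEAR_RE = re.compile(
    r"[0-9]{3}[1-9]|[0-9]{2}[1-9][0-9]{1}|[0-9]{1}[1-9][0-9]{2}|[1-9][0-9]{3}"
)

# Alternative formulation: any four digits except "0000".
YEAR_LOOKAHEAD_RE = re.compile(r"(?!0000)[0-9]{4}")


def is_year(text: str) -> bool:
    """Accepts 0001 to 9999, always written with four digits."""
    return YEAR_RE.fullmatch(text) is not None
```

Each of the four branches requires a non-zero digit in a different position, which is how the expression excludes 0000.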

There are two difficulties in using regular expressions to validate dates: first, long and short months have different numbers of days; second, leap years must be taken into account.

For the first difficulty, let's set leap years aside for now and assume that February has 28 days. Month and day can then be divided into three cases.

(1) Months 1, 3, 5, 7, 8, 10 and 12, where the day ranges from 01 to 31. The regular expression matching MM-DD is:

``(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])``

(2) Months 4, 6, 9 and 11, where the day ranges from 01 to 30. The regular expression matching MM-DD is:

``(0[469]|11)-(0[1-9]|[12][0-9]|30)``

(3) Month 2. For an ordinary year, the regular expression matching MM-DD is:

``02-(0[1-9]|[1][0-9]|2[0-8])``

Combining the results above gives a regular expression that matches ordinary-year dates in the format yyyy-mm-dd:

``([0-9]{3}[1-9]|[0-9]{2}[1-9][0-9]{1}|[0-9]{1}[1-9][0-9]{2}|[1-9][0-9]{3})-(((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01]))|((0[469]|11)-(0[1-9]|[12][0-9]|30))|(02-(0[1-9]|[1][0-9]|2[0-8])))``

Next, let's solve the second difficulty: leap years. According to the definition, leap years fall into two categories.

(1) Years divisible by 4 but not by 100. Looking at the pattern in the last two digits, we quickly get the following regular expression:

``([0-9]{2})(0[48]|[2468][048]|[13579][26])``

(2) Years divisible by 400. A number divisible by 400 is also divisible by 100, so the last two digits must be 00. We only need to ensure that the first two digits are divisible by 4:

``(0[48]|[2468][048]|[13579][26])00``
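The two leap-year categories can be checked against Python's `calendar.isleap` for every year from 0001 to 9999. In this sketch, `LEAP_YEAR_RE`, `regex_says_leap`, and `mismatches` are hypothetical names. The century branch is written with `[13579][26]` so that the two-digit prefixes 12 and 16 are covered; with `[3579][26]` in that position, the years 1200 and 1600 would not be recognized.

```python
import calendar
import re

LEAP_YEAR_RE = re.compile(
    r"[0-9]{2}(0[48]|[2468][048]|[13579][26])"  # divisible by 4 but not by 100
    r"|(0[48]|[2468][048]|[13579][26])00"       # divisible by 400
)


def regex_says_leap(year: int) -> bool:
    """Format the year with four digits and test it against the pattern."""
    return LEAP_YEAR_RE.fullmatch(f"{year:04d}") is not None


def mismatches() -> list:
    """Years from 1 to 9999 where the pattern and calendar.isleap disagree."""
    return [
        year
        for year in range(1, 10000)
        if regex_says_leap(year) != calendar.isleap(year)
    ]
```

An empty result from `mismatches()` means the pattern and the standard library agree on every year in the range.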

The strongest date validation regular expression, with leap year validation added

This regular expression supports the following date formats.

```
YYYY-MM-DD
YYYY/MM/DD
YYYY_MM_DD
YYYY.MM.DD
```

The complete regular expression is as follows

``((^((1[8-9]\d{2})|([2-9]\d{3}))([-\/\._])(10|12|0?[13578])([-\/\._])(3[01]|[12][0-9]|0?[1-9])$)|(^((1[8-9]\d{2})|([2-9]\d{3}))([-\/\._])(11|0?[469])([-\/\._])(30|[12][0-9]|0?[1-9])$)|(^((1[8-9]\d{2})|([2-9]\d{3}))([-\/\._])(0?2)([-\/\._])(2[0-8]|1[0-9]|0?[1-9])$)|(^([2468][048]00)([-\/\._])(0?2)([-\/\._])(29)$)|(^([3579][26]00)([-\/\._])(0?2)([-\/\._])(29)$)|(^([1][89][0][48])([-\/\._])(0?2)([-\/\._])(29)$)|(^([2-9][0-9][0][48])([-\/\._])(0?2)([-\/\._])(29)$)|(^([1][89][2468][048])([-\/\._])(0?2)([-\/\._])(29)$)|(^([2-9][0-9][2468][048])([-\/\._])(0?2)([-\/\._])(29)$)|(^([1][89][13579][26])([-\/\._])(0?2)([-\/\._])(29)$)|(^([2-9][0-9][13579][26])([-\/\._])(0?2)([-\/\._])(29)$))``

February of a leap year has 29 days, so the regular expression matching the leap-year date in the format yyyy-mm-dd is:

``(([0-9]{2})(0[48]|[2468][048]|[13579][26])|((0[48]|[2468][048]|[13579][26])00))-02-29``

Finally, combining the date validation expressions for ordinary years and leap years gives the final regular expression for validating dates in the format yyyy-mm-dd:

``(([0-9]{3}[1-9]|[0-9]{2}[1-9][0-9]{1}|[0-9]{1}[1-9][0-9]{2}|[1-9][0-9]{3})-(((0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01]))|((0[469]|11)-(0[1-9]|[12][0-9]|30))|(02-(0[1-9]|[1][0-9]|2[0-8]))))|((([0-9]{2})(0[48]|[2468][048]|[13579][26])|((0[48]|[2468][048]|[13579][26])00))-02-29)``
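To see how the pieces fit together, the sketch below rebuilds a yyyy-mm-dd expression of this shape from its parts in Python and compares it with `datetime.date`, which accepts exactly the years 1 to 9999. The constant and function names are hypothetical, non-capturing groups `(?:...)` are used for brevity, and the century branch is assumed to use `[13579][26]` so that 1200 and 1600 count as leap years.

```python
import re
from datetime import date

YEAR = r"(?:[0-9]{3}[1-9]|[0-9]{2}[1-9][0-9]|[0-9][1-9][0-9]{2}|[1-9][0-9]{3})"
MONTH_DAY = (
    r"(?:(?:0[13578]|1[02])-(?:0[1-9]|[12][0-9]|3[01])"  # 31-day months
    r"|(?:0[469]|11)-(?:0[1-9]|[12][0-9]|30)"            # 30-day months
    r"|02-(?:0[1-9]|1[0-9]|2[0-8]))"                     # February, 28 days
)
LEAP_YEAR = (
    r"(?:[0-9]{2}(?:0[48]|[2468][048]|[13579][26])"      # divisible by 4, not 100
    r"|(?:0[48]|[2468][048]|[13579][26])00)"             # divisible by 400
)
DATE_RE = re.compile(YEAR + "-" + MONTH_DAY + "|" + LEAP_YEAR + "-02-29")


def is_valid_date(text: str) -> bool:
    """Validate a zero-padded yyyy-mm-dd string with the regular expression."""
    return DATE_RE.fullmatch(text) is not None


def calendar_accepts(year: int, month: int, day: int) -> bool:
    """Reference check: does datetime.date accept this combination?"""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False
```

The expression expects zero-padded months and days, so `2021-2-3` is rejected even though it names a real date.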

The validation regular expression for the dd/mm/yyyy format is:

``(((0[1-9]|[12][0-9]|3[01])/((0[13578]|1[02]))|((0[1-9]|[12][0-9]|30)/(0[469]|11))|(0[1-9]|[1][0-9]|2[0-8])/(02))/([0-9]{3}[1-9]|[0-9]{2}[1-9][0-9]{1}|[0-9]{1}[1-9][0-9]{2}|[1-9][0-9]{3}))|(29/02/(([0-9]{2})(0[48]|[2468][048]|[13579][26])|((0[48]|[2468][048]|[13579][26])00)))``

You may want to bookmark these regular expressions first and look them up when you need them.

OK, that's all for today. I'm Glacier. If you have any questions, leave a message below or add me on WeChat: sun_shine_lyz. I'll bring you into the group so we can exchange ideas about technology and improve together~~
