Learn to Write Regular Expressions in 29.22 Minutes

Date: 2019-12-2

Foreword

When you see the title, you may wonder: why not 30 minutes?
Because this article is full of pictures as well as text. It looks scary, but you can actually read it in less than 30 minutes.
You may think I'm bragging, but when you finish, check your watch and you'll find
I really am bragging.
So why .22?
For a science student, keeping two decimal places is an article of faith.
And I just like the number 2. That's all.

Regular expression

A regular expression (often abbreviated as regex, regexp, or RE in code) is a concept from computer science. Regular expressions are commonly used to search for, replace, and validate text that conforms to a pattern (rule).

RegExp object

The RegExp object represents a regular expression in JavaScript. It is a powerful tool for pattern matching on strings.
So how do you use it?
There are two ways: a literal or the constructor.

var reg = /\bhello\b/g; // literal
// "\b" stands for a word boundary, so this matches the hello in "hello world" but not the hello in "helloworld",
// because in "helloworld" the letters run together and there is no word boundary after hello
var reg = new RegExp('\\bhello\\b', 'g'); // constructor
// Pay attention to the differences between the two:
// the constructor form needs the backslash escaped (because of JavaScript string rules),
// the g (modifier, global match) is passed separately,
// and the pattern is not surrounded by / on both sides, unlike the literal form: /regular expression/
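
To make the backslash point concrete, here is a minimal sketch comparing the two forms. The variable names (literal, constructed, broken) are made up for illustration; only standard RegExp behavior is assumed.

```javascript
// Two ways to build the same pattern.
const literal = /\bhello\b/g;
const constructed = new RegExp('\\bhello\\b', 'g');

// Both objects carry the same pattern text and the same flags.
console.log(literal.source === constructed.source); // true
console.log(literal.flags === constructed.flags);   // true

// With a single backslash, the string escape "\b" becomes a backspace
// character, so the pattern no longer means "word boundary".
const broken = new RegExp('\bhello\b', 'g');

console.log('hello world'.replace(constructed, 'X')); // "X world"
console.log('helloworld'.replace(constructed, 'X'));  // "helloworld"
console.log('hello world'.replace(broken, 'X'));      // "hello world"
```

The last line is the trap: the broken pattern silently matches nothing instead of raising an error.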

Regex visualizer

Regulex
Visual diagrams are a big help in understanding regular expressions.
Enough said. This article uses this website to verify its examples.

Metacharacters

Regular expressions are composed of two basic kinds of characters:

  • literal characters
  • metacharacters

A literal character stands for itself. For example, hello in the regular expression above matches the string hello.
A metacharacter, by contrast, does not stand for itself but carries a special meaning, such as the \b in the regular expression above.

Since metacharacters do not stand for themselves, what if I want to match the literal characters? For example, to match a + sign or a * sign, put a \ in front of the character to escape it.
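
As a small illustration of escaping, the sketch below matches a literal + and a literal *. The sample strings and variable names are made up for this example.

```javascript
// "+" and "*" are metacharacters, so a backslash is needed to match them literally.
const plusLiteral = /1\+1=2/;
const starLiteral = /2\*3/;

console.log(plusLiteral.test('1+1=2')); // true
console.log(starLiteral.test('2*3'));   // true

// Without the escape, "+" is a quantifier:
// /1+1=2/ means "one or more 1s, then 1=2".
const plusQuantifier = /1+1=2/;
console.log(plusQuantifier.test('1+1=2')); // false
console.log(plusQuantifier.test('111=2')); // true
```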

Let's go through the metacharacters below. For now, just read through them; there is no need to memorize them~

  • $ matches the end of the input string. If the multiline property of the RegExp object is set, $ also matches before '\n' or '\r'. To match the $ character itself, use \$.
  • ( ) marks the start and end positions of a subexpression. Subexpressions can be captured for later use. To match these characters, use \( and \).
  • * matches the preceding subexpression zero or more times. To match the * character, use \*.
  • + matches the preceding subexpression one or more times. To match the + character, use \+.
  • . matches any single character except the newline \n. To match ., use \.
  • [ marks the beginning of a bracket expression. To match [, use \[.
  • { marks the beginning of a quantifier expression. To match {, use \{.
  • | indicates a choice between two alternatives. To match |, use \|.
  • ? matches the preceding subexpression zero or one time, or marks a quantifier as non-greedy. To match the ? character, use \?.
  • \ marks the next character as a special character, a literal character, a back reference, or an octal escape. For example, 'n' matches the character 'n', while '\n' matches a line break. The sequence '\\' matches '\', and '\(' matches '('.
  • ^ matches the start of the input string, unless it is used inside a bracket expression, where it negates the character set. To match the ^ character itself, use \^.
  • \cx matches the control character indicated by x. For example, \cM matches Control-M, that is, a carriage return. The value of x must be one of A-Z or a-z. Otherwise, c is treated as a literal 'c' character.
  • \f matches a form feed. Equivalent to \x0c and \cL.
  • \n matches a line break. Equivalent to \x0a and \cJ.
  • \r matches a carriage return. Equivalent to \x0d and \cM.
  • \s matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v]. Note that Unicode regular expressions also match the full-width space character.
  • \S matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
  • \t matches a tab. Equivalent to \x09 and \cI.
  • \v matches a vertical tab. Equivalent to \x0b and \cK.

Boundaries

We met this b at the very beginning. No, not that one, this one: \b
It stands for a word boundary.
As we know, f*ck has many uses. It can stand alone or combine with all sorts of other words.
Suppose we only want to find f*ck as a separate word. Look at the code:

//How could a glorious successor of socialism use f*ck as an example?
var reg = /\bis\b/g;
var str = "this is me";
str.replace(reg,'X')
//"this X me"
var reg = /is/g;
var str = "this is me";
str.replace(reg,'X')
//"thX X me"

The difference between the two is clear, so I won't say more.

Let's look at one more question: what if I only want the A at the beginning of the text, and not the A later in the text?
This is how to take on the enemy:

var reg = /^A/g;
var str = "ABA";
str.replace(reg,'X');
//"XBA"

If you need a regular expression that matches an A at the end, it looks like this:

var reg = /A$/g;
var str = "ABA";
str.replace(reg,'X');
//"ABX"

Note that, just like beginning and end, ^ goes at the front of the regular expression and $ goes at the end.
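
The metacharacter list mentions that the multiline property changes how the anchors behave. A minimal sketch of that, assuming a made-up two-line sample string:

```javascript
// A two-line sample string (illustrative only).
const text = 'ABA\nAB';

// Without the m flag, ^ only matches at the very start of the input.
const startOnly = text.replace(/^A/g, 'X');

// With the m flag, ^ also matches right after each line break.
const everyLineStart = text.replace(/^A/gm, 'X');

// Likewise, with m set, $ matches before each line break as well as at the end.
const everyLineEnd = text.replace(/A$/gm, 'X');

console.log(JSON.stringify(startOnly));      // "XBA\nAB"
console.log(JSON.stringify(everyLineStart)); // "XBA\nXB"
console.log(JSON.stringify(everyLineEnd));   // "ABX\nAB"
```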

Character classes

In general, one character in a regular expression corresponds to one character in the string.
For example, the expression \bhello matches a word boundary \b followed by the characters h, e, l, l, o.

What if we want to match a category of characters?
For example, to match a or b or c, we can use the metacharacters [] to build a simple class.
[abc] puts a, b, and c into one class, meaning it can match a or b or c.
Even if you have handed your English back to your teacher, you should be able to read the visualizer's label, "one of abc": match any one of a, b, c~
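
Here is a quick sketch of the [abc] class in action, in the same replace style as the earlier examples. The sample string is made up.

```javascript
// [abc] matches any single character that is a, b, or c.
const abcClass = /[abc]/g;
const replaced = 'a1b2c3d4'.replace(abcClass, 'X');

console.log(replaced); // "X1X2X3d4"
```

Note that d is left alone, because it is not in the class.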

Range classes

With what we've learned so far, a pattern that matches a digit from 0 to 9 would be written like this:

[0123456789]

But what if I want to match even more? Won't that wear out the keyboard? Is regex really that clumsy???
Obviously not. Whatever you can think of, the people who created regular expressions thought of too.
We can write this instead:

[0-9]

Much more convenient. But then you may wonder: what about my hyphen? What if I want to match 0-9 and the hyphen itself?
Don't panic, just add it at the end:

[0-9-]

The visualizer clearly shows two branches: a match can take the 0-9 road or the hyphen road.
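
A minimal sketch of the difference between a digit range and a digit range that also includes a literal hyphen. The sample string just reuses this article's date; the variable names are made up.

```javascript
const date = '2019-12-2';

// [0-9] is a range: any one digit from 0 to 9.
const digitsOnly = date.replace(/[0-9]/g, 'X');

// A hyphen placed at the end of the class is a literal hyphen, not a range.
const digitsAndHyphen = date.replace(/[0-9-]/g, 'X');

console.log(digitsOnly);      // "XXXX-XX-X"
console.log(digitsAndHyphen); // "XXXXXXXXX"
```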

Predefined classes

After learning the above, we can write a pattern that matches digits: [0-9]

Is there a simpler and shorter way?

As luck would have it, regex is just that considerate.

You may already have spotted the clue in the metacharacter section above.

Bring out the table! (No, not the one above.)

[Table: predefined classes and their equivalent character classes]

We can remember these predefined classes by the English words behind them (d for digit, s for space, w for word).
Notice that the uppercase form is the negation of the lowercase form! For example, \d and \D.
The equivalent classes in the table also show that to negate a class, you add a ^ right after the opening bracket.
For example, [^abc] means "none of abc".
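
To see the lowercase/uppercase pairing and the ^ negation side by side, here is a small sketch. The sample strings and variable names are made up.

```javascript
const sample = 'a1 b2_c3';

// \d matches digits; \D matches everything that is not a digit.
const digitsMasked = sample.replace(/\d/g, 'X');
const nonDigitsMasked = sample.replace(/\D/g, 'X');

// \w matches letters, digits, and underscore, so only the space survives.
const wordCharsMasked = sample.replace(/\w/g, 'X');

// [^abc] matches any character that is none of a, b, c.
const noneOfAbcMasked = 'abcdef'.replace(/[^abc]/g, 'X');

console.log(digitsMasked);    // "aX bX_cX"
console.log(nonDigitsMasked); // "X1XX2XX3"
console.log(wordCharsMasked); // "XX XXXXX"
console.log(noneOfAbcMasked); // "abcXXX"
```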

Quantifiers

What if you need a pattern that matches 10 digits? How would you write it?
Well, you may have written this:

\d\d\d\d\d\d\d\d\d\d

Surprise: even your right hand, single for more than 20 years, starts to feel a little weak!
It gets tired, sometimes, after overwork.
To save certain people's right arms, regex has quantifiers.
To meet the requirement above, we only need \d{10}

The visualizer labels it "Digit 10 times".
For the convenience of people whose English is shaky, such as me, I even ran it through the little-known Baidu Translate.

But what if I don't know exactly how many digits to match? Say, somewhere between 100 and 1000 digits.
Ta-da~

\d{100,1000}

The visualizer result makes this easy to understand.

Note that {n,m} includes both n times and m times. It is a closed interval.
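
The closed interval is easy to check with smaller numbers. In this sketch the ^ and $ anchors are my addition, so that the whole string has to fit the quantifier; {3,5} stands in for {100,1000}.

```javascript
// Exactly ten digits, anchored so nothing else may surround them.
const tenDigits = /^\d{10}$/;
console.log(tenDigits.test('0123456789')); // true
console.log(tenDigits.test('012345678'));  // false (only nine digits)

// {3,5} is a closed interval: 3, 4, or 5 repetitions are all accepted.
const threeToFive = /^\d{3,5}$/;
console.log(threeToFive.test('12'));     // false
console.log(threeToFive.test('123'));    // true
console.log(threeToFive.test('12345'));  // true
console.log(threeToFive.test('123456')); // false
```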

Greedy mode and non-greedy mode

From the above we know that matching 100 to 1000 digits looks like this:
\d{100,1000}
What if the string contains 1000 digits, but I only want to match the first 100?

Let's try the pattern as written above, scaled down to 3 to 6 digits:

var reg = /\d{3,6}/;
var str = "123456789";
str.replace(reg, 'X');
//"X789"

As we can see, the example above matches and replaces 6 digits, even though the pattern accepts anywhere from 3 to 6 digits.

Yes, it's greedy! It matches as much as it can!
This is greedy matching, which is the default. If we don't want it to be greedy, how do we make it easier to satisfy?
Just add a ? after the quantifier:

var reg = /\d{3,6}?/;
var str = "123456789";
str.replace(reg, 'X');
//"X456789"

Clearly, the pattern now matches only the first three digits~ This is non-greedy mode.
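
The same contrast shows up if we look at the matched text directly instead of replacing it. This is a small sketch using String.prototype.match; the + examples are my addition to show that the trailing ? works on other quantifiers too.

```javascript
const digits = '123456789';

// Greedy: take as many digits as the quantifier allows (up to 6).
const greedy = digits.match(/\d{3,6}/)[0];

// Non-greedy: stop as soon as the minimum (3) is satisfied.
const lazy = digits.match(/\d{3,6}?/)[0];

// The same ? suffix works after other quantifiers such as +.
const greedyPlus = 'aaa'.match(/a+/)[0];
const lazyPlus = 'aaa'.match(/a+?/)[0];

console.log(greedy, lazy);         // "123456" "123"
console.log(greedyPlus, lazyPlus); // "aaa" "a"
```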

Branch conditions

What if I need to match exactly 100 or exactly 1000 digits?
Only those two possibilities, 100 and 1000, not any count in between. How do we take on this enemy?
This is where branch conditions come in:

\d{100}|\d{1000}

Note that | splits the entire pattern into a left side and a right side, rather than joining only the pieces directly next to the symbol.

Sometimes only part of the pattern should branch while the rest is a shared trunk. In that case, just wrap the branch in ().
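
To see why the parentheses matter, here is a minimal sketch with a made-up example (the words gray and grey are not from the original figures).

```javascript
// Without parentheses, | splits the WHOLE pattern: "gra" OR "ey".
const wholeSplit = /gra|ey/;
console.log(wholeSplit.test('gra')); // true
console.log(wholeSplit.test('ey'));  // true

// With parentheses, only the branch is split and "gr" + "y" is the shared trunk.
const grouped = /gr(a|e)y/;
console.log(grouped.test('gray')); // true
console.log(grouped.test('grey')); // true
console.log(grouped.test('gra'));  // false

const replacedBoth = 'gray grey'.replace(/gr(a|e)y/g, 'X');
console.log(replacedBoth); // "X X"
```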

Note: matching tries the branches from left to right. If a branch on the left succeeds, the branches to its right are not tried!

var reg = /\d{4}|\d{2}/
var str = "12345"
str.replace(reg,'X');
// "X5"
var reg = /\d{2}|\d{4}/
var str = "12345"
str.replace(reg,'X');
//"X345"

Lookahead / lookbehind

Sometimes the characters we are looking for depend on the characters before or after them.
For example, I want to replace two consecutive digits that are followed by two English letters.
You may think: isn't it simply this?

\d{2}\w{2}

The pattern above matches two digits plus two letters. Although they are adjacent, that is four characters matched, so a replacement would replace all four. We only want to replace the two digits!
This is where assertions come in.
First, we need to get a few points straight:

  • A regular expression parses the text from head to tail. The direction of the tail is called "ahead", so moving ahead means moving toward the tail.
  • Lookahead: when the regular expression has matched a rule (here, "two digits"), it looks ahead to check whether the assertion holds (here, "followed by two letters"). Lookbehind works in the opposite direction. (JavaScript did not support lookbehind until ES2018.)

Bring out the table!

[Table: lookahead and lookbehind assertion syntax]

With the table, we can solve the problem. Note that \w also matches digits, while the task requires two letters:

var reg = /\d{2}(?=[a-zA-Z]{2})/;
var str = "1a23bc456def";
str.replace(reg,'X');
//"1aXbc456def"

Only the digits were replaced, not the text matched by the assertion!

While we're at it, let's look at the negative form.
Once you see the "not", I think you'll know how to use it.
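
For comparison, here is a sketch of the negative form on the same string as the example above. The (?!...) syntax is standard JavaScript; the variable names are made up.

```javascript
const input = '1a23bc456def';

// Positive lookahead: two digits that ARE followed by two letters.
const positive = input.replace(/\d{2}(?=[a-zA-Z]{2})/, 'X');

// Negative lookahead: two digits that are NOT followed by two letters.
// "23" is skipped (it is followed by "bc"); "45" matches (it is followed by "6d").
const negative = input.replace(/\d{2}(?![a-zA-Z]{2})/, 'X');

console.log(positive); // "1aXbc456def"
console.log(negative); // "1a23bcX6def"
```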

Grouping

How do we match a word, rather than a digit, that appears three times in a row?
You might write it like this:

hello{3}

Then you open the visualizer.

Oh no, only my o got repeated! That's going too far.

In fact, all we need is () to form a group, so that the quantifier applies to the whole group: (hello){3}. The parentheses in the branch conditions above work the same way.
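
A small sketch makes the difference between the two patterns visible. The sample strings are made up.

```javascript
// Without a group, {3} applies only to the final "o": "hell" + "ooo".
const lastLetterOnly = /hello{3}/;
console.log('hellooo'.replace(lastLetterOnly, 'X'));         // "X"
console.log('hellohellohello'.replace(lastLetterOnly, 'X')); // "hellohellohello"

// With a group, {3} applies to the whole word.
const wholeWord = /(hello){3}/;
console.log('hellohellohello'.replace(wholeWord, 'X')); // "X"
console.log('hellooo'.replace(wholeWord, 'X'));         // "hellooo"
```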

How do we use a group's content after grouping?
First, let's look at a question: how do you match 8 digits with no repeated digit?
Without grouping you will find there is nowhere to start, because you cannot tell whether a digit has been repeated!
Here is the answer first, and then we'll analyze it:

(?!.*(\d).*\1)\d{8}

  • First, the overall shape is (?!assertion A)expression B, which uses a negative lookahead: the text that expression B matches after the assertion must not satisfy assertion A. That sentence is a tongue twister, so put it this way: my assertion is "there is a repeated digit" and the expression is "8 digits", so the "8 digits" must not contain a "repeated digit".
  • Next, .*(\d).* first finds a digit at any position. Why any position? Because the repetition we are testing for may occur anywhere. As the visualizer shows, \d has 0 to n characters before and after it, so it can be anywhere.
  • Now for the key part. What does \1 stand for? Look closely and you'll see that \d is wrapped in parentheses, which means it is a group. What is its group number? The first pair of parentheses is group 1 (by default), a second pair would be group 2, and lookahead parentheses don't count. \1 is a reference to group 1; use \2 to reference group 2. You may wonder: is the reference equivalent to writing another \d at that position? Nope, it's not that simple. It refers to the content that \d matched, which means it must be the same value that \d matched. Isn't that exactly a repetition?!!! So .*(\d).*\1 stands for "any digit repeated at any position".
  • Finally, we put all of this together: match 8 digits with no repetition. (?!any repetition)8 digits. Because (?!) is a negative lookahead, so... emmm. You get the idea.
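
The analysis above can be tried out with a short sketch. The ^ and $ anchors are my addition (an assumption, not part of the article's pattern), so that the whole string must be exactly 8 digits; the sample inputs are made up.

```javascript
// A back reference on its own: \1 must repeat whatever group 1 captured.
const doubledDigit = /(\d)\1/;
console.log(doubledDigit.test('1223')); // true  ("22")
console.log(doubledDigit.test('1234')); // false

// The full pattern: 8 digits, and no digit may appear twice anywhere.
const noRepeatEightDigits = /^(?!.*(\d).*\1)\d{8}$/;
console.log(noRepeatEightDigits.test('12345678')); // true
console.log(noRepeatEightDigits.test('12345671')); // false (1 is repeated)
console.log(noRepeatEightDigits.test('1234567'));  // false (only 7 digits)
```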

Grouping has more detailed points, but space is limited and it's almost 30 minutes. I have to pick the valuable, commonly used ones~

Hey hey hey

That's it for this introduction to regular expressions~
The next article will cover the properties of RegExp objects in JavaScript, as well as some of their methods.
Learn to Write Regular Expressions in 29.22 Minutes (2)
If you have any comments or suggestions, please leave them in the comment area. Thank you.

