Deep understanding of JavaScript – regular expressions


Regular expressions

Regular expressions are patterns used to match character combinations in strings. In JavaScript, regular expressions are also objects. These patterns are used with the exec and test methods of RegExp, and with the match, matchAll, replace, search, and split methods of String.

Create expression


Use two slashes (/) to create a regular expression directly; the slashes mark the beginning and end of the pattern

var reg = /ab/g

A regular expression literal is compiled when the script is loaded. If the regular expression stays constant, this form gives better performance.


Alternatively, call the RegExp constructor:

var reg = new RegExp('ab','g')
//Equivalent to var reg = /ab/g

With a literal, the modifiers go after the closing slash. With the constructor, they are passed as the second argument.

Both forms create a new regular expression object. The difference is that a literal is compiled when the engine compiles the code, while the constructor builds the expression at run time, so the literal is generally more efficient. The literal is also more convenient and readable, so it is the usual way to define a regular expression.
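As a rough illustration of the two forms (the variable names are made up for this sketch), note that a backslash inside the constructor's string argument has to be doubled, and that the constructor is handy when part of the pattern is only known at run time:

```javascript
// Literal form.
const literal = /\d+/g;

// Constructor form: the pattern is a string, so the backslash is doubled.
const constructed = new RegExp('\\d+', 'g');

console.log(literal.source === constructed.source); // true
console.log(literal.flags === constructed.flags);   // true

// Building a pattern from a run-time value (hypothetical input).
const word = 'ab';
const dynamic = new RegExp(word + '+', 'g');
console.log('abbb ab'.match(dynamic)); // [ 'abbb', 'ab' ]
```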

Instance properties

Instance properties related to modifiers (read-only):

  • ignoreCase: returns a Boolean value indicating whether the i modifier is set
  • global: returns a Boolean value indicating whether the g modifier is set
  • multiline: returns a Boolean value indicating whether the m modifier is set
  • flags: returns a string containing all the modifiers that are set

Instance properties not related to modifiers:

  • lastIndex: returns an integer indicating the position at which the next search starts; it is writable
  • source: returns the string form of the regular expression, read-only

var reg = /abc/gim
//Modifier related properties
reg.ignoreCase  //true
reg.global      //true
reg.multiline   //true
reg.flags       //gim
//Properties not related to modifiers
reg.lastIndex   //0
reg.source      //abc

Instance methods

RegExp instance methods

test()

Tests whether the string contains a match and returns true or false

var reg = /av/g
var s = 'avbabc'
reg.test(s)  //true

reg.lastIndex = 2
reg.test(s) //false

When a regular expression has the g modifier, each call to test starts matching from the position where the previous match ended. You can read lastIndex to see the current position

var reg = /av/g
var s = 'avbavabc'

reg.lastIndex //0
reg.test(s)   //true

reg.lastIndex //2
reg.test(s)   //true

reg.lastIndex //5
reg.test(s)   //false

If the regular expression is built from an empty string, for example new RegExp(''), it matches every string and test returns true
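The lastIndex behavior described above is a common source of surprises when one g-flagged expression is reused. A minimal sketch, with variable names chosen only for illustration:

```javascript
const re = /av/g;

const first = re.test('av');  // true, and lastIndex moves to 2
const second = re.test('av'); // false: the search started at index 2; lastIndex resets to 0
const third = re.test('av');  // true again

console.log(first, second, third); // true false true

// A pattern built from an empty string matches any string, including ''.
const empty = new RegExp('');
console.log(empty.test('anything'), empty.test('')); // true true
```

If only a yes/no answer is needed, dropping the g modifier avoids this state altogether.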


exec()

Searches the string for a match and returns an array, or null if nothing matches.
The array returned by exec carries two extra properties:

  • input: the entire original string
  • index: start position index of successful pattern matching
var reg = /av/g
var s = 'avbavabc'

reg.exec(s)   //["av", index: 0, input: "avbavabc", groups: undefined]
reg.exec(s)   //["av", index: 3, input: "avbavabc", groups: undefined]
reg.exec(s)   //null

As with test, when a regular expression has the g modifier, each call to exec starts matching from the position where the previous match ended. You can read lastIndex to see the current position

When a regular expression contains () capture groups, the returned array has several members. The first is the text matched by the whole expression. The second is the text matched by the first pair of parentheses. If there are more parentheses, the third is the text matched by the second pair, and so on.

var reg = /a(v)/g
var s = 'avbavabc'

reg.exec(s)   //[ 'av', 'v', index: 0, input: 'avbavabc', groups: undefined ]
reg.exec(s)   //[ 'av', 'v', index: 3, input: 'avbavabc', groups: undefined ]
reg.exec(s)   //null


var reg = /a(v)(b)/g
var s = 'avbavabc'

reg.exec(s) // [ 'avb', 'v', 'b', index: 0, input: 'avbavabc', groups: undefined ]
reg.exec(s) //null
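Because exec with the g modifier advances lastIndex on every call, it is often driven by a loop. The following is a small sketch of that idiom (the found array and its field names are assumptions made for this example):

```javascript
const re = /a(v)/g;
const text = 'avbavabc';

const found = [];
let m;
// exec returns null once no further match exists, which ends the loop
// and resets lastIndex to 0.
while ((m = re.exec(text)) !== null) {
  found.push({ match: m[0], group: m[1], index: m.index });
}

console.log(found);
// [ { match: 'av', group: 'v', index: 0 },
//   { match: 'av', group: 'v', index: 3 } ]
```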

String instance methods

match()

Searches the string for a match and returns an array, or null if nothing matches.
When the regular expression does not have the g modifier, the returned array has the index and input properties

var reg = /ac/
var s = 'acbacvabc'
var s1 = 'aabaavabc'

s.match(reg)  //[ 'ac', index: 0, input: 'acbacvabc', groups: undefined ]
s1.match(reg) //null

When the regular expression has the g modifier, the method returns all successful matches at once in a single array, which no longer has the index and input properties

var reg = /ac/g
var s = 'acbacvabc'

s.match(reg) //[ 'ac', 'ac' ]

Note: setting the lastIndex property has no effect on the match method (for a regular expression without the y modifier); match always searches from the beginning of the string.


matchAll()

Searches the string for all matches and returns an iterator. Note that the regular expression passed to matchAll must have the g modifier; otherwise a TypeError is thrown.

var reg = /a/g
var s = 'acbacvabc'

var arr = [...s.matchAll(reg)]
// [
//   [ 'a', index: 0, input: 'acbacvabc', groups: undefined ],
//   [ 'a', index: 3, input: 'acbacvabc', groups: undefined ],
//   [ 'a', index: 6, input: 'acbacvabc', groups: undefined ]
// ]
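The iterator returned by matchAll can also be consumed with for...of, which is convenient when the pattern has capture groups. A minimal sketch, assuming a made-up key=value input string:

```javascript
const input = 'a=1;b=22;c=333'; // hypothetical sample data
const pairPattern = /(\w+)=(\d+)/g;

const parsed = {};
for (const m of input.matchAll(pairPattern)) {
  // m[1] is the key group, m[2] is the value group.
  parsed[m[1]] = Number(m[2]);
}
console.log(parsed); // { a: 1, b: 22, c: 333 }

// Without the g modifier, matchAll throws a TypeError.
let threw = false;
try {
  input.matchAll(/(\w+)=(\d+)/);
} catch (err) {
  threw = err instanceof TypeError;
}
console.log(threw); // true
```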


search()

Searches the string for a match and returns the position of the first match, or -1 if nothing matches

var reg = /en/g
var reg1 = /yo/g
var s = 'yuwenbo'
s.search(reg)  //3
s.search(reg1) //-1


replace()

Searches the string for a match and replaces the matched substring with the replacement string. It takes two parameters: the regular expression and the replacement content.

Without the g modifier, only the first successful match is replaced. With the g modifier, every successful match is replaced.

var s = 'i love you'
console.log(s.replace(/\s/, '❤'))  //i❤love you
console.log(s.replace(/\s/g, '❤')) //i❤love❤you

The second parameter of replace can use the $ symbol to refer to parts of the match, which makes replacement more convenient

  • $&: the matched substring
  • $`: the text before the match
  • $': the text after the match
  • $n: the content of the nth capture group; n is a natural number starting from 1
  • $$: a literal dollar sign $
console.log('he llo'.replace(/(\w+)\s(\w+)/, '$2 $1')) //llo he
console.log('hello'.replace(/e/, '-$`-$&-$\'-')) //h-h-e-llo-llo

The second parameter of replace can also be a function; each match is replaced with the return value of that function

The function can accept several parameters. The first is the matched content, followed by the content of each capture group (there can be several). The second-to-last parameter is the position of the match in the string, and the last parameter is the original string. (If the pattern uses named groups, a groups object follows as one further argument.)

console.log('hello'.replace(/e/, function (match, index, str) {
  console.log(match, index, str)
  return '❤'
}))
//e 1 hello
//h❤llo
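When the pattern contains capture groups, the group contents arrive between the match and the index. A short sketch of that argument order (toCamel is a hypothetical helper name used only here):

```javascript
// toCamel is a hypothetical helper name used only for this sketch.
function toCamel(name) {
  // match is the whole "-x" piece; letter is the first capture group.
  return name.replace(/-(\w)/g, function (match, letter) {
    return letter.toUpperCase();
  });
}

console.log(toCamel('background-color'));       // backgroundColor
console.log(toCamel('border-top-left-radius')); // borderTopLeftRadius

// With one group the callback receives (match, group1, index, original).
const seen = [];
'a1b2'.replace(/(\d)/g, function (match, digit, index, original) {
  seen.push(index);
  return digit;
});
console.log(seen); // [ 1, 3 ]
```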


split()

Splits a string by a regular expression or a fixed string and stores the resulting substrings in an array.
The method accepts two parameters. The first is the regular expression that defines the separator. The second is the maximum number of members in the returned array

var str = 'ni hao ya.hei hei hei'
str.split(/ |\./, 5) //[ 'ni', 'hao', 'ya', 'hei', 'hei' ]


To find out only whether a string matches, use the test or search method.
To get more information about the match, use the exec or match method, which are slower.

Modifier (flag)

A modifier specifies an additional matching rule and is placed after the end of the pattern. Modifiers can be used individually or together.

//Single modifier
'abAbab'.match(/a/g)  //["a","a"] 

//Use multiple modifiers together 
'abAbab'.match(/a/gi)  //["a", "A", "a"]


g

Global search. By default, matching stops after the first match. With the g modifier, the search continues through the rest of the string


i

Ignore case. By default, matching is case sensitive; with the i modifier, case is ignored


m

Multi-line mode, which changes the behavior of ^ and $.
By default, ^ and $ match the beginning and end of the whole string.
With the m modifier, ^ and $ also match the beginning and end of each line, that is, they recognize the line break \n

For example:

  • /yuwen$/m.test('hi yuwen\n') is true
  • /yuwen$/.test('hi yuwen\n') is false


s

Allows . to match line breaks


u

Matches using Unicode code points


y

Sticky search: the match must start at the current position (lastIndex) in the target string
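The last three modifiers are easier to see side by side. The following sketch (variable names are illustrative only) compares each of s, u, and y with the default behavior:

```javascript
// s: the dot also matches a line break.
const withS = /a.b/s.test('a\nb');    // true
const withoutS = /a.b/.test('a\nb');  // false

// u: the pattern works on Unicode code points, so "." covers a whole
// astral character (here U+1F600) instead of half a surrogate pair.
const face = '\u{1F600}';
const withU = /^.$/u.test(face);      // true
const withoutU = /^.$/.test(face);    // false

// y: the match must begin exactly at lastIndex.
const sticky = /foo/y;
sticky.lastIndex = 3;
const atThree = sticky.test('barfoo'); // true, lastIndex becomes 6
sticky.lastIndex = 1;
const atOne = sticky.test('barfoo');   // false, lastIndex resets to 0

console.log(withS, withoutS, withU, withoutU, atThree, atOne);
// true false true false true false
```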

Special characters


\

Escape character.
To match a special character itself, put a backslash \ in front of it.
The characters that need a backslash escape in a regular expression are ^ . [ $ ( ) | * + ? { \
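When a pattern is built from arbitrary text with the RegExp constructor, these characters have to be escaped first. One common way to do that is a small helper like the one below; escapeRegExp is not a built-in, just a name assumed for this sketch:

```javascript
// escapeRegExp is a hypothetical helper, not part of the language.
function escapeRegExp(text) {
  // Prefix every special character with a backslash; $& is the matched character.
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

console.log(escapeRegExp('1+1=2')); // 1\+1=2

const exact = new RegExp(escapeRegExp('a.b'));
console.log(exact.test('a.b')); // true
console.log(exact.test('axb')); // false, the dot is now literal
```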


^

Matches the start position.
If the multiline flag is set, the position right after a line break is also matched

For example: /^A/ matches the A in "Ant", but not the A in "ntA"


$

Matches the end position.
If the multiline flag is set, the position right before a line break is also matched

For example: /A$/ matches the A in "ntA", but not the A in "Ant"


*

Matches the preceding expression 0 or more times.
Equivalent to {0,}

For example: /yueno*/ matches yuenooo and yuen in "yuenoooyuen"


+

Matches the preceding expression 1 or more times.
Equivalent to {1,}

For example: /yueno+/ matches only yuenooo in "yuenoooyuen"


?

Matches the preceding expression 0 or 1 times.
Equivalent to {0,1}

  • For example: the first match of /yueno?/ in "yuenoooyuen" is yueno
  • Note: if ? immediately follows any quantifier (*, +, ?, or {}), it makes that quantifier non-greedy (it matches as few characters as possible)
  • For example: the first match of /yueno??/ in "yuenoooyuen" is yuen
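The three quantifiers above, and the non-greedy variant, can be compared on the same input. This is only an illustrative sketch; the tag string is made up for the example:

```javascript
const s = 'yuenoooyuen';

console.log(s.match(/yueno*/g)); // [ 'yuenooo', 'yuen' ]
console.log(s.match(/yueno+/g)); // [ 'yuenooo' ]
console.log(s.match(/yueno?/g)); // [ 'yueno', 'yuen' ]

// Greedy versus non-greedy on a made-up piece of markup.
const tag = '<b>bold</b>';
const greedy = tag.match(/<.+>/)[0];  // '<b>bold</b>', as much as possible
const lazy = tag.match(/<.+?>/)[0];   // '<b>', as little as possible
console.log(greedy, lazy);
```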


.

By default, matches any single character except a line break

  • For example: /.y/ matches only oy in "yuenoooyuen"
  • For example: /..y/ matches only ooy in "yuenoooyuen"


(x)

Capturing parentheses.
Parentheses in a regular expression create a group; the text matched by the pattern inside the parentheses is remembered as the content of that group.
Inside the pattern, \n refers back to the nth group.
In a replacement string, the $1, $2 syntax can be used

  • For example: /(wenbo)+/.test('wenbowenbo') is true, which matches wenbo as a whole one or more times
  • For example: "wenbo,zhijian".replace(/(wenbo),(zhijian)/, '$2,$1')
  • Output: zhijian,wenbo


(?:x)

Matches x but does not remember the match.
Non-capturing parentheses let you define subexpressions for regular expression operators to work on.
A non-capturing group matches like any other group, but it cannot be referenced with \n or $n
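To make the difference concrete, here is a small sketch with a back reference and a non-capturing group (the date string and variable names are assumptions for illustration):

```javascript
// \1 refers back to whatever the first group matched: a doubled character.
const doubled = /(\w)\1/;
console.log(doubled.test('hello')); // true, "ll"
console.log(doubled.test('world')); // false

// The year is grouped with (?:...) and therefore not captured,
// so the month becomes group 1 and the day group 2.
const m = '2024-01-15'.match(/(?:\d{4})-(\d{2})-(\d{2})/);
console.log(m.length, m[1], m[2]); // 3 01 15
```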


x(?=y)

Matches x only if x is followed by y (lookahead assertion)

  • For example:'wenbo'.match(/wen(?=bo)/)
  • Output:[ 'wen', index: 0, input: 'wenbo', groups: undefined ]
  • For example:'wenyu'.match(/wen(?=bo)/)
  • Output: null


(?<=y)x

Matches x only if x is preceded by y (lookbehind assertion)

  • For example:'wenbo'.match(/(?<=wen)bo/)
  • Output:[ 'bo', index: 3, input: 'wenbo', groups: undefined ]
  • For example:'yubo'.match(/(?<=wen)bo/)
  • Output: null


x(?!y)

Matches x only if x is not followed by y (negative lookahead)


(?<!y)x

Matches x only if x is not preceded by y (negative lookbehind)
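The two negative assertions can be tried on the same kind of input used for the positive ones. A minimal sketch, assuming an engine that supports lookbehind:

```javascript
// Negative lookahead: "wen" not followed by "bo".
const a = 'wenyu'.match(/wen(?!bo)/); // matches 'wen' at index 0
const b = 'wenbo'.match(/wen(?!bo)/); // null

// Negative lookbehind: "bo" not preceded by "wen".
const c = 'yubo'.match(/(?<!wen)bo/);  // matches 'bo' at index 2
const d = 'wenbo'.match(/(?<!wen)bo/); // null

console.log(a[0], a.index, b); // wen 0 null
console.log(c[0], c.index, d); // bo 2 null
```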


x|y

Matches x or y; several alternatives can be chained together

  • For example:'wenyu'.match(/w|e|n/g)
  • Output:[ 'w', 'e', 'n' ]


{n}

Matches exactly n occurrences of the preceding character; n is a non-negative integer

  • For example:'hello'.match(/l{2}/g)
  • Output:[ 'll' ]


{n,}

Matches at least n occurrences of the preceding character; n is a non-negative integer


{n,m}

Matches at least n and at most m occurrences of the preceding character; n and m are non-negative integers
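A quick way to compare the three counted quantifiers is to run them against runs of different lengths. In this sketch the \b word-boundary anchors (described further down) are added only so that each run of letters is tested as a whole:

```javascript
const runs = 'a aa aaa aaaa'; // made-up sample input

console.log(runs.match(/\ba{2}\b/g));   // [ 'aa' ]
console.log(runs.match(/\ba{2,}\b/g));  // [ 'aa', 'aaa', 'aaaa' ]
console.log(runs.match(/\ba{2,3}\b/g)); // [ 'aa', 'aaa' ]
```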


[xyz]

Character set. Matches any one of the characters in the brackets, including escape sequences. You can use a hyphen (-) to specify a range of characters, for example [a-zA-Z1-9]

  • For example:'hello 123'.match(/[a-h1-2]/g)
  • Output:[ 'h', 'e', '1', '2' ]


[^xyz]

Negated character set. Matches any character that is not contained in the brackets

  • For example:'hello 123'.match(/[^a-h1-2]/g)
  • Output:[ 'l', 'l', 'o', ' ', '3' ]


[\b]

Matches a backspace (U+0008). Do not confuse it with \b


\b

Matches a word boundary

For example:

  • /\bworld/.test('hello world') // true
  • /\bworld/.test('hello-world') // true
  • /\bworld/.test('helloworld')  // false


\B

Matches a non-word boundary

For example:

  • /\Bworld/.test('hello world') // false
  • /\Bworld/.test('hello-world') // false
  • /\Bworld/.test('helloworld')  // true


\cX

Where X is a letter from A to Z, matches the corresponding control character in the string


\d

Matches a digit; equivalent to [0-9]


\D

Matches a non-digit character; equivalent to [^0-9]


\f

Matches a form feed (U+000C)


\n

Matches a line feed (U+000A)


\r

Matches a carriage return (U+000D)


\s

Matches a whitespace character, including spaces, tabs, form feeds, and line breaks


\S

Matches a non-whitespace character


\t

Matches a horizontal tab (U+0009)


\v

Matches a vertical tab (U+000B)


\w

Matches a word character (letter, digit, or underscore); equivalent to [A-Za-z0-9_]


\W

Matches a non-word character; equivalent to [^A-Za-z0-9_]
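The digit, word, and whitespace classes are usually combined with quantifiers. A short sketch on a made-up sample string:

```javascript
const sample = 'Room 42_b, floor 7!'; // hypothetical sample text

console.log(sample.match(/\d+/g)); // [ '42', '7' ]
console.log(sample.match(/\w+/g)); // [ 'Room', '42_b', 'floor', '7' ]
console.log(sample.match(/\W/g));  // [ ' ', ',', ' ', ' ', '!' ]
console.log(sample.split(/\s+/));  // [ 'Room', '42_b,', 'floor', '7!' ]

// \D removes everything that is not a digit.
console.log('2024-01-15'.replace(/\D/g, '')); // 20240115
```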


\n (where n is a positive integer)

A back reference to the substring matched by the nth capture group. Capture groups are numbered by counting their left parentheses


\0

Matches a null character (U+0000)


\xhh

Matches the character represented by a two-digit hexadecimal number (\x00 to \xFF)


\uhhhh

Matches a UTF-16 code unit represented by a four-digit hexadecimal number

\u{hhhh} or \u{hhhhh}

Matches the Unicode character with the given hexadecimal code point (only if the u flag is set)
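The numeric escapes can be checked against the characters they stand for. A minimal sketch (the checks object is just a container made up for this example):

```javascript
const checks = {
  nul: /\0/.test('\0'),                        // null character, U+0000
  hex: /\x41/.test('A'),                       // 0x41 is "A"
  utf16: /\u0041/.test('A'),                   // four-digit code unit
  codePoint: /\u{1F600}/u.test('\u{1F600}'),   // needs the u flag
  control: /\cJ/.test('\n'),                   // Control-J is the line feed
};

console.log(checks);
// { nul: true, hex: true, utf16: true, codePoint: true, control: true }
```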