A JavaScript regular expression library: XRegExp

Time:2021-1-14

There seem to be few articles online about JavaScript regular expression libraries, perhaps because complex matching rarely needs to run on the client, so such libraries see little use in JS. Searching for JS regex libraries, I did not find a second one either. Still, some of XRegExp's features are very practical, and interested readers may want to take a look. I have not spent much time with it, so this article gives only a brief introduction. All the examples here come from its API documentation.

XRegExp is a library that extends JavaScript regular expressions. It makes up for some shortcomings of native JS regexes and adds a good deal of functionality on top. It smooths over regex incompatibilities between browsers and supports native ES6 regex syntax (see "New regular expression features in ECMAScript 6").

The minified version of XRegExp is about 4.25k. As for performance, the regexes XRegExp generates are native RegExp objects, so they run as fast as native regexes. The extra overhead is incurred only the first time a given XRegExp regex is created. Its main features are as follows:

Main features of XRegExp

  • Extended regex syntax, including named capture groups and more powerful text replacement

  • New flags: s for single-line mode (the dot matches every character, including newlines); x to ignore whitespace and allow line comments; n for explicit capture mode (only named groups capture); A for 21-bit Unicode matching

  • A set of utility functions that simplify regex processing

  • Fixes for cross-browser regex inconsistencies

  • Addons, built on top of this base, that support further regex syntax and features

Basic Usage

The main API of XRegExp is the constructor XRegExp(pattern, [flags]). Its parameters are as follows:

  • pattern is the regular expression, given as a string

  • [flags] is an optional string of regex flags. It accepts both native flags and XRegExp's extended flags (listed under the features above)

The return value is an extended regular expression object.

For example:

// The 'x' flag is used, so whitespace in the pattern is ignored and line comments are allowed
// (?<name>…) is a named capture group
// # starts a line comment
var date = XRegExp('(?<year>  [0-9]{4} ) -?  # year  \n\
                    (?<month> [0-9]{2} ) -?  # month \n\
                    (?<day>   [0-9]{2} )     # day   ', 'x');

// XRegExp's extended exec() method is used here
var match = XRegExp.exec('2015-02-22', date);
match.year; // -> '2015'

The example above shows some of XRegExp's more powerful extensions, such as named capture groups and comments inside the pattern, which make up for some of the shortcomings of native JS regexes.
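For comparison, here is a minimal sketch of the same date match using only native JavaScript, assuming an engine that supports ES2018 named capture groups. Native regexes have no x flag, so the pattern is assembled from commented string pieces instead. The names datePattern and parseDate are made up for this illustration and are not part of XRegExp.

```javascript
// Native RegExp has no free-spacing flag, so build the pattern from commented parts.
const datePattern = new RegExp(
  [
    '(?<year>[0-9]{4})-?',  // year
    '(?<month>[0-9]{2})-?', // month
    '(?<day>[0-9]{2})',     // day
  ].join('')
);

// Hypothetical helper: returns the named groups as a plain object,
// or null when the string does not contain a date.
function parseDate(str) {
  const match = datePattern.exec(str);
  return match ? { ...match.groups } : null;
}

const parts = parseDate('2015-02-22');
console.log(parts.year); // '2015'
```

Note that native named groups live on match.groups, whereas the XRegExp example above reads the name directly from the match object.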

Interesting features

I have only just started using the library and have not finished reading the API documentation, so I picked out just a few features that seem the most practical.

The forEach iterator

Its syntax is XRegExp.forEach(str, regex, callback)

  • str is the string to search

  • regex is the regex to search with

  • callback is the function run for each match; it receives four arguments on each iteration:

    • The match array, with named backreferences as properties

    • The zero-based index of the match (its position in the sequence of matches, not in the string)

    • The string being traversed

    • The regex object in use

This method runs the callback for every match in the string, regardless of whether the regex has the global flag g and regardless of the initial value of its lastIndex.
The method has no return value.
For example:

// Match the digits one at a time; push every second match (odd match index) into evens
var evens = [];
XRegExp.forEach('1a2345', /\d/, function (match, i) {
  if (i % 2) evens.push(+match[0]);
});
// evens -> [2, 4]
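To make the callback's second argument concrete, here is a rough sketch of the same iteration written with native APIs only. It assumes String.prototype.matchAll is available (ES2020). The function forEachMatch is a hypothetical stand-in, not XRegExp's actual implementation.

```javascript
// Hypothetical stand-in for XRegExp.forEach, written with native APIs only.
function forEachMatch(str, regex, callback) {
  // Work on a copy that always has the g flag, so matchAll accepts it and
  // the caller's lastIndex is neither read nor modified.
  const flags = regex.flags.includes('g') ? regex.flags : regex.flags + 'g';
  const globalCopy = new RegExp(regex.source, flags);
  let i = 0;
  for (const match of str.matchAll(globalCopy)) {
    // i counts matches (0, 1, 2, ...); match.index is the position in str.
    callback(match, i++, str, regex);
  }
}

const evens = [];
forEachMatch('1a2345', /\d/, (match, i) => {
  if (i % 2) evens.push(+match[0]);
});
console.log(evens); // [2, 4]
```

The digits 2 and 4 are the second and fourth matches (match indexes 1 and 3), which is why they are the ones collected.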

The matchChain method

matchChain feeds the matches of one regex into the next, successively narrowing a broad set of results down to the data you want.
Its syntax is XRegExp.matchChain(str, chain)

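  • str is the string to search

  • chain is an array of regexes, such as [reg1, reg2, …]; an item may also be an object with regex and backref properties, as in the second example below

This method returns an array of the matches from the last regex in the chain, or an empty array if there are none.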
For example:

// Basic usage: extract the numbers wrapped in <b> tags
// (?is) is XRegExp's leading mode modifier syntax; it is equivalent to passing the flags i and s
XRegExp.matchChain('1 <b>2</b> 3 <b>4 a 56</b>', [
  XRegExp('(?is)<b>.*?</b>'),
  /\d+/
]);
// -> ['2', '4', '56']


// Pass forward and return specific backreferences (by number or by name)
var html = '<a href="http://xregexp.com/api/">XRegExp</a>\
        <a href="http://www.google.com/">Google</a>';
XRegExp.matchChain(html, [
  {regex: /<a href="([^"]+)">/i, backref: 1},
  {regex: XRegExp('(?i)^https?://(?<domain>[^/?#]+)'), backref: 'domain'}
]);
// -> ['xregexp.com', 'www.google.com']
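The chaining idea can be sketched with native regexes as follows. This is only an illustration of the technique, assuming matchAll (ES2020) and the native s flag (ES2018). The function matchChainSketch is hypothetical, and details such as how it treats non-participating groups are choices made for this sketch.

```javascript
// Hypothetical stand-in for XRegExp.matchChain using native regexes only.
// Each chain item is either a RegExp or {regex, backref}, where backref is
// a group number or a group name.
function matchChainSketch(str, chain) {
  let inputs = [str];
  for (const item of chain) {
    const regex = item instanceof RegExp ? item : item.regex;
    const backref = item instanceof RegExp ? 0 : item.backref;
    const flags = regex.flags.includes('g') ? regex.flags : regex.flags + 'g';
    const globalCopy = new RegExp(regex.source, flags);
    const outputs = [];
    // Run this level's regex over every result of the previous level.
    for (const input of inputs) {
      for (const match of input.matchAll(globalCopy)) {
        const value = typeof backref === 'number'
          ? match[backref]
          : (match.groups || {})[backref];
        // In this sketch, groups that did not participate become ''.
        outputs.push(value === undefined ? '' : value);
      }
    }
    inputs = outputs;
  }
  return inputs;
}

const numbers = matchChainSketch('1 <b>2</b> 3 <b>4 a 56</b>', [
  /<b>.*?<\/b>/gis,
  /\d+/,
]);
console.log(numbers); // ['2', '4', '56']
```

Each level only ever sees the text that survived the previous level, which is why the digits 1 and 3 outside the <b> tags never reach the second regex.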

union: merging regular expressions

This method merges the given strings and regexes into a single regular expression that matches any one of them. Backreferences in the supplied regexes are renumbered during the merge. Its syntax is XRegExp.union(patterns, [flags])

  • patterns is an array whose elements may be strings (matched literally) or regex objects

  • flags is an optional string of flags

The return value is the merged regular expression.
For example:

XRegExp.union(['a+b*c', /(dogs)/, /(cats)/], 'i');
// -> /a\+b\*c|(dogs)|(cats)/i
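A simplified sketch of the merge, using only native JavaScript, may help show what happens to string patterns. The names escapeRegExp and unionSketch are hypothetical. Unlike the real method, this version does not renumber backreferences, so it is only suitable for patterns that contain none.

```javascript
// Escape regex metacharacters so that a string is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical stand-in for XRegExp.union. It does NOT renumber
// backreferences, so avoid patterns that use \1-style references.
function unionSketch(patterns, flags) {
  const source = patterns
    .map((p) => (p instanceof RegExp ? p.source : escapeRegExp(String(p))))
    .join('|');
  // Flags on the individual regexes are dropped; only `flags` applies.
  return new RegExp(source, flags);
}

const combined = unionSketch(['a+b*c', /(dogs)/, /(cats)/], 'i');
console.log(combined.source); // a\+b\*c|(dogs)|(cats)
console.log(combined.test('I like CATS')); // true
```

The string 'a+b*c' is escaped before joining, so it matches those five characters literally instead of being read as quantifiers.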

Using 21-bit Unicode matching on emoticons also looks fun. If you come across other interesting features, feel free to add them.
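As a rough illustration of why matching beyond 16 bits matters for emoticons, the sketch below uses the native u flag from ES6 rather than XRegExp's A flag, so it shows the underlying idea and not the library's own syntax. All variable names are illustrative.

```javascript
const face = '\u{1F600}'; // one emoji, stored as two UTF-16 code units

// Without the u flag, "." matches a single code unit, so the emoji does not fit.
const withoutUnicode = /^.$/.test(face);

// With the u flag, "." matches a whole code point.
const withUnicode = /^.$/u.test(face);

// Code point ranges above U+FFFF also need the u flag (Emoticons block here).
const emoticons = /[\u{1F600}-\u{1F64F}]/gu;
const found = 'ok \u{1F600} then \u{1F642}'.match(emoticons);

console.log(withoutUnicode, withUnicode, found.length); // false true 2
```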
Most of the above is a rough translation of the original API documentation plus some of my own understanding. This introduction is brief and may be unclear in places; if you find mistakes or omissions, please point them out. For more details, or to download the library, visit its home page.
XRegExp home page: XRegExp. GitHub: XRegExp 3.0.0