Puppeteer: a powerful tool for the front end


Puppeteer is a Node.js package released by the Chrome development team in 2017, alongside headless Chrome. It is used to simulate operating the Chrome browser. It provides a high-level API to control headless Chrome or Chromium over the DevTools Protocol. It can also be configured to use full (non-headless) Chrome or Chromium.

Before learning about Puppeteer, let's take a look at the Chrome DevTools Protocol and headless Chrome.

What is the Chrome DevTools Protocol

  • The Chrome DevTools Protocol (CDP) is based on WebSocket, which it uses to provide a fast data channel to the browser kernel.
  • CDP is divided into several domains (DOM, Debugger, Network, Profiler, Console…), each of which defines its related commands and events.
  • We can build tools on top of CDP to debug and analyze the Chrome browser. For example, the familiar Chrome developer tools are implemented on CDP.
  • Many useful tools are implemented on CDP, such as Chrome developer tools, chrome-remote-interface, and Puppeteer.
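
To make the split between commands and events concrete, here is a minimal sketch of the JSON message shapes that travel over the CDP WebSocket. The helper names buildCommand and classifyMessage are hypothetical and exist only for illustration; Page.navigate and Network.requestWillBeSent are real CDP names.

```javascript
// Build a CDP command: each command carries a client-chosen id,
// a "Domain.method" name, and optional params.
function buildCommand(id, method, params = {}) {
  return JSON.stringify({ id, method, params });
}

// Classify an incoming message: responses echo the id of the command
// they answer, while events carry a method name but no id.
function classifyMessage(raw) {
  const msg = JSON.parse(raw);
  if ('id' in msg) {
    return 'error' in msg ? 'error' : 'response';
  }
  return 'event';
}

console.log(buildCommand(1, 'Page.navigate', { url: 'https://example.com' }));
console.log(classifyMessage('{"id":1,"result":{"frameId":"abc"}}'));
console.log(classifyMessage('{"method":"Network.requestWillBeSent","params":{}}'));
```

Libraries such as Puppeteer hide this bookkeeping (id matching, event dispatch) behind their own API.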

What is headless Chrome

  • You can run Chrome without a user interface.
  • You operate Chrome through the command line or a programming language.
  • Because no human intervention is needed, operation is more stable.
  • You can start Chrome in headless mode by adding the --headless parameter when launching it.
  • Chrome accepts many other startup parameters; see the list of Chrome command-line switches.

In short, headless Chrome is Chrome without a user interface. You can use all the features Chrome supports to run your program without opening a browser window.

What is Puppeteer

  • Puppeteer is a Node.js library.
  • Puppeteer provides a set of APIs that control the behavior of the Chromium / Chrome browser through the Chrome DevTools Protocol.
  • By default, Puppeteer starts Chrome in headless mode. It can also start Chrome with a user interface through a launch option.
  • By default, Puppeteer is bound to a recent Chromium version; you can also bind it to a different version.
  • Puppeteer lets us communicate with the browser without knowing much about the underlying CDP protocol.

What can Puppeteer do

From the official introduction: most things that you can do manually in the browser can be done using Puppeteer. Examples:

  • Generate screenshots and PDFs of pages.
  • Crawl SPA or SSR websites.
  • Automate form submission, UI testing, keyboard input, etc.
  • Create an up-to-date automated testing environment: run tests directly in the latest version of Chrome, using the latest JavaScript and browser features.
  • Capture a timeline trace of the site to help diagnose performance issues.
  • Test Chrome extensions.

Puppeteer API hierarchy

The API hierarchy in Puppeteer closely mirrors the structure of the browser itself. Here are some commonly used classes:

  • Browser: corresponds to a browser instance. A Browser can contain multiple BrowserContexts.
  • BrowserContext: corresponds to one browser session, just as opening an incognito window next to a normal Chrome window creates a separate session. Each BrowserContext has an independent session (cookies and cache are not shared with other contexts). A BrowserContext can contain multiple Pages.
  • Page: represents a tab, created through browserContext.newPage() or browser.newPage(); browser.newPage() uses the default BrowserContext. A Page can contain multiple Frames.
  • Frame: a frame. Each page has a main frame (page.mainFrame()) and may have multiple child frames, which are mainly created by iframe tags.
  • ExecutionContext: the JavaScript execution environment. Each Frame has a default JavaScript execution environment.
  • ElementHandle: corresponds to an element node in the DOM. Through this instance we can click the element, fill in forms, and so on. We can obtain the element through a selector, XPath, etc.
  • JSHandle: corresponds to a JavaScript object in the page; ElementHandle inherits from JSHandle. Because we can't operate directly on objects in the page, they are wrapped as JSHandles to provide the related functions.
  • CDPSession: communicates directly with the native CDP. Messages are sent through session.send and received through session.on, which lets you implement functions not covered by the Puppeteer API.
  • Coverage: gathers JavaScript and CSS code coverage.
  • Tracing: captures performance data for analysis.
  • Response: a response received by the page.
  • Request: a request issued by the page.
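
The containment chain Browser → BrowserContext → Page → Frame can be walked directly. The following is a minimal sketch; describeBrowser is a hypothetical helper name, and it assumes the caller passes in a launched Puppeteer Browser. The methods it calls (browserContexts(), pages(), frames(), url()) are part of the Puppeteer API.

```javascript
// Walk Browser -> BrowserContext -> Page -> Frame and collect the URLs found.
// `browser` is expected to be a Puppeteer Browser instance.
async function describeBrowser(browser) {
  const summary = [];
  const contexts = browser.browserContexts();
  for (let i = 0; i < contexts.length; i++) {
    const pages = await contexts[i].pages(); // pages() returns a promise
    for (const page of pages) {
      summary.push({
        contextIndex: i,
        pageUrl: page.url(),
        // frames() lists the main frame and every child frame of the page
        frameUrls: page.frames().map((frame) => frame.url()),
      });
    }
  }
  return summary;
}
```

Calling console.log(await describeBrowser(browser)) after opening a few tabs prints one entry per tab.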

Installation and environment of Puppeteer

Note: before v1.18.1, Puppeteer requires at least Node v6.4.0. Versions from v1.18.1 to v2.1.0 depend on Node 8.9.0+. Since v3.0.0, Puppeteer depends on Node 10.18.1+. The examples below use async / await, which is only supported in Node v7.6.0 or later.

Puppeteer is a Node.js package, so installation is simple:

npm install puppeteer
yarn add puppeteer

npm may report an error when installing Puppeteer. This is caused by restricted access to the external network; installing through the Taobao mirror with cnpm can solve the problem.

When you install Puppeteer, it downloads a recent version of Chromium. Since version 1.7.0, the puppeteer-core package has also been published officially. By default it does not download any browser; it is meant for launching an existing browser or connecting to a remote one. Note that the installed version of puppeteer-core must be compatible with the browser you intend to connect to.

Using Puppeteer

Case 1: screenshot

With Puppeteer we can take a screenshot of a whole page or of a single element on the page:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Set the viewport size. The default page size is 800x600
  await page.setViewport({width: 1920, height: 800});
  await page.goto('https://www.baidu.com/');
  // Take a screenshot of the whole page
  await page.screenshot({
      path: './files/baidu_home.png', // image save path
      type: 'png',
      fullPage: true // scroll and capture the full page
      // clip: {x: 0, y: 0, width: 1920, height: 800}
  });
  // Take a screenshot of a single element on the page
  let element = await page.$('#s_lg_img');
  await element.screenshot({
      path: './files/baidu_logo.png'
  });
  await page.close();
  await browser.close();
})();

How do we get an element in a page?

  • page.$('#uniqueId'): gets the first element matching a selector
  • page.$$('div'): gets all elements matching a selector
  • page.$x('//img'): gets all elements matching an XPath expression
  • page.waitForXPath('//img'): waits for an element matching an XPath expression to appear
  • page.waitForSelector('#uniqueId'): waits for an element matching a selector to appear

Case 2: simulating user actions

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({
        slowMo: 100, // slow down each operation by 100 ms
        headless: false, // show the browser window
        defaultViewport: {width: 1440, height: 780},
        ignoreHTTPSErrors: false, // whether to ignore HTTPS errors
        args: ['--start-fullscreen'] // open the page in full screen
    });
    const page = await browser.newPage();
    await page.goto('https://www.baidu.com/');
    // Input text
    const inputElement = await page.$('#kw');
    await inputElement.type('hello world', {delay: 20});
    // Click the search button
    let okButtonElement = await page.$('#su');
    // When a click triggers a page jump, start page.waitForNavigation() together with the click so the navigation is not missed
    await Promise.all([
        okButtonElement.click(),
        page.waitForNavigation()
    ]);
    await page.close();
    await browser.close();
})();

What functions does ElementHandle provide for operating on elements?

  • elementHandle.click(): click on an element
  • elementHandle.tap(): simulate finger touch and click
  • elementHandle.focus(): focus on an element
  • elementHandle.hover(): hover over an element
  • elementHandle.type('hello'): enter text in the input box

Case 3: embedding JavaScript code

The most powerful feature of Puppeteer is that you can execute any JavaScript code you want inside the browser. The following code crawls the news recommendations on the Baidu home page.

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://www.baidu.com/');
    // Execute code in the browser through page.evaluate
    const resultData = await page.evaluate(async () =>  {
      let data = {};
      const ListEle = [...document.querySelectorAll('#hotsearch-content-wrapper .hotsearch-item')];
      data = ListEle.map((ele) => {
        const urlEle = ele.querySelector('a.c-link');
        const titleEle = ele.querySelector('.title-content-title');
        return {
          href: urlEle.href,
          title: titleEle.innerText,
        };
      });
      return data;
    });
    // Print the crawled data in the Node process
    console.log(resultData);
    await page.close();
    await browser.close();
})();

What functions can execute code in a browser environment?

  • page.evaluate(pageFunction[, ...args]): executes the function in the browser environment
  • page.evaluateHandle(pageFunction[, ...args]): executes the function in the browser environment and returns a JSHandle object
  • page.$$eval(selector, pageFunction[, ...args]): passes all elements matching the selector into the function and executes it in the browser environment
  • page.$eval(selector, pageFunction[, ...args]): passes the first element matching the selector into the function and executes it in the browser environment
  • page.evaluateOnNewDocument(pageFunction[, ...args]): executes the function in the browser environment whenever a new document is created, before any of the page's own scripts run
  • page.exposeFunction(name, puppeteerFunction): registers a function on the window object. The function itself runs in the Node environment, which gives code in the browser environment a way to call Node.js libraries
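
One detail worth illustrating: the page function is serialized and run inside the browser, so it cannot see variables from the surrounding Node scope; values have to be passed through the extra args. The sketch below shows this with page.$$eval. The names extractLinks and collectLinks are hypothetical, and page is assumed to be a Puppeteer Page supplied by the caller.

```javascript
// Runs inside the browser. It receives the array of matched elements
// plus any extra (serializable) arguments passed to $$eval.
function extractLinks(elements, maxItems) {
  return elements.slice(0, maxItems).map((el) => ({
    href: el.href,
    title: el.innerText.trim(),
  }));
}

// Runs in Node. maxItems is handed over as an argument because
// extractLinks cannot close over Node-side variables.
async function collectLinks(page, selector, maxItems) {
  return page.$$eval(selector, extractLinks, maxItems);
}
```

Keeping the page function as a separate, pure function also makes it easy to test against plain objects without launching a browser.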

Case 4: request interception

Request interception is very useful in some scenarios; for example, we can block unnecessary requests to improve performance. Once request interception has been enabled with page.setRequestInterception(true), we can listen to the page's request event and intercept each request.

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    const blockTypes = new Set(['image', 'media', 'font']);
    await page.setRequestInterception(true); // enable request interception
    page.on('request', request => {
        const type = request.resourceType();
        const shouldBlock = blockTypes.has(type);
        if (shouldBlock) {
            // Block the request directly
            return request.abort();
        } else {
            // Rewrite the request
            return request.continue({
                // You can override url, method, postData and headers
                headers: Object.assign({}, request.headers(), {
                    'puppeteer-test': 'true'
                })
            });
        }
    });
    await page.goto('https://www.baidu.com/');
    await page.close();
    await browser.close();
})();

What events does the page provide?

  • page.on('close'): the page is closed
  • page.on('console'): a console API is called in the page
  • page.on('error'): the page crashes
  • page.on('load'): the page has loaded
  • page.on('request'): the page issues a request
  • page.on('requestfailed'): a request fails (for example, it is aborted)
  • page.on('requestfinished'): a request finishes successfully
  • page.on('response'): a response is received
  • page.on('workercreated'): a web worker is created
  • page.on('workerdestroyed'): a web worker is destroyed
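
As an illustration of how these events can be combined, here is a small logging sketch. attachPageLogging is a hypothetical helper; it assumes page is a Puppeteer Page (or any object with the same on method). The accessors used (msg.text(), request.url(), request.failure(), response.status(), response.url()) belong to the Puppeteer API.

```javascript
// Attach lightweight logging to a page. `log` defaults to console.log
// but can be swapped for any function that accepts a string.
function attachPageLogging(page, log = console.log) {
  // Forward console.* calls made inside the page
  page.on('console', (msg) => log(`[console] ${msg.text()}`));

  // Report requests that failed; failure() is null for successful requests
  page.on('requestfailed', (request) => {
    const failure = request.failure();
    const reason = failure ? failure.errorText : 'unknown';
    log(`[requestfailed] ${request.url()} ${reason}`);
  });

  // Report only error responses to keep the output short
  page.on('response', (response) => {
    if (response.status() >= 400) {
      log(`[response] ${response.status()} ${response.url()}`);
    }
  });
}
```

Call attachPageLogging(page) right after browser.newPage() so that events fired during the first navigation are captured.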

Case 5: get WebSocket responses

At present, Puppeteer does not provide a native API for handling WebSocket traffic, but we can get at it through the lower-level Chrome DevTools Protocol (CDP).

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    // Create a CDP session
    let cdpSession = await page.target().createCDPSession();
    // Enable network debugging and listen to network-related events in the Chrome DevTools Protocol
    await cdpSession.send('Network.enable');
    // Listen to the webSocketFrameReceived event to obtain the pushed data
    cdpSession.on('Network.webSocketFrameReceived', frame => {
        let payloadData = frame.response.payloadData;
        // Only frames that carry a JSON object are of interest
        const jsonMatch = payloadData.match(/\{.*\}/);
        if (jsonMatch) {
            // Parse payloadData and get the data pushed by the server
            let res = JSON.parse(jsonMatch[0]);
            if (res.code !== 200) {
                console.log(`error calling websocket interface: code=${res.code}, message=${res.message}`);
            } else {
                console.log('Get websocket interface data: ', res.result);
            }
        }
    });
    await page.goto('https://netease.youdata.163.com/dash/142161/reportExport?pid=700209493');
    await page.waitForFunction('window.renderdone', {polling: 20});
    await page.close();
    await browser.close();
})();
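
The payload parsing in the listener is the fragile part, so it can help to isolate it in a function that never throws. A minimal sketch follows; parseFramePayload is a hypothetical name, and the sample payloads in the usage lines (a numeric prefix followed by a JSON array) are only an assumption about what the server sends.

```javascript
// Extract the first-to-last-brace JSON object from a WebSocket text frame.
// Returns null when the frame has no JSON object or the JSON is invalid.
function parseFramePayload(payloadData) {
  const match = payloadData.match(/\{.*\}/);
  if (!match) {
    return null; // e.g. heartbeat frames without a body
  }
  try {
    return JSON.parse(match[0]);
  } catch (err) {
    return null; // braces were present but the content was not valid JSON
  }
}

console.log(parseFramePayload('42["push",{"code":200,"result":[1,2]}]'));
console.log(parseFramePayload('3'));
```

Inside the Network.webSocketFrameReceived handler this would be called as parseFramePayload(frame.response.payloadData).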

Case 6: how to grab elements in an iframe

A frame contains an execution context, and functions cannot be executed across frames. A page can have multiple frames, mainly produced by embedded iframe tags. Most functions on Page are shorthand for page.mainFrame().xx. Frames form a tree, which we can traverse with frame.childFrames(). To execute functions in another frame, you must first obtain that frame.

The following example logs in to the 188 mailbox. Its login window is actually an embedded iframe, so the code first obtains that iframe and then logs in through it.

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({headless: false, slowMo: 50});
    const page = await browser.newPage();
    await page.goto('https://www.188.com');
    for (const frame of page.mainFrame().childFrames()){
        // Find the iframe corresponding to the login page according to the URL
        if (frame.url().includes('passport.188.com')){
            await frame.type('.dlemail', 'user@example.com'); // placeholder account
            await frame.type('.dlpwd', '123456');
            await Promise.all([
                frame.click('#dologin'), // assumed selector for the login button; adjust to the real page
                page.waitForNavigation()
            ]);
            break;
        }
    }
    await page.close();
    await browser.close();
})();
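
The loop above only inspects direct children of the main frame. Because frames form a tree, a recursive search also covers nested iframes. The two helpers below are hypothetical names for illustration; they rely only on frame.url() and frame.childFrames().

```javascript
// Return the frame tree as indented lines, one URL per frame.
function dumpFrameTree(frame, indent = '') {
  const lines = [indent + frame.url()];
  for (const child of frame.childFrames()) {
    lines.push(...dumpFrameTree(child, indent + '  '));
  }
  return lines;
}

// Depth-first search for the first frame that satisfies the predicate.
// Returns null when no frame matches.
function findFrame(frame, predicate) {
  if (predicate(frame)) {
    return frame;
  }
  for (const child of frame.childFrames()) {
    const found = findFrame(child, predicate);
    if (found) {
      return found;
    }
  }
  return null;
}
```

For example, findFrame(page.mainFrame(), (f) => f.url().includes('passport.188.com')) would locate the login frame at any depth, and console.log(dumpFrameTree(page.mainFrame()).join('\n')) prints the whole tree.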

Case 7: page performance analysis

Puppeteer provides a tool for page performance analysis. Its functionality is still fairly limited: it only captures the raw performance data of a page run, and analyzing that data is up to us. It is said that version 2.0 will revise this significantly.

  • A browser can only run one trace at a time.
  • The resulting JSON file can be loaded in the Performance panel of DevTools to view the analysis results.
  • We can write scripts to parse trace.json and automate the analysis.
  • Through tracing we can learn about page loading speed and script execution performance.

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.tracing.start({path: './files/trace.json'});
    await page.goto('https://www.google.com');
    await page.tracing.stop();
    // Continue the analysis from trace.json
    await browser.close();
})();
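
As one possible starting point for analyzing trace.json by script, the sketch below totals the duration of "complete" events (phase X in the Chrome trace event format, with dur in microseconds) per event name. The function names are hypothetical, and events recorded as begin/end pairs are deliberately ignored to keep the example short.

```javascript
const fs = require('fs');

// Sum the duration of complete ('X') events per event name and return
// the topN names by total time, in milliseconds.
function summarizeTrace(trace, topN = 5) {
  // A trace is either a bare array of events or an object with traceEvents
  const events = Array.isArray(trace) ? trace : trace.traceEvents;
  const totals = new Map();
  for (const event of events) {
    if (event.ph !== 'X' || typeof event.dur !== 'number') {
      continue; // skip instant events, metadata, begin/end pairs
    }
    totals.set(event.name, (totals.get(event.name) || 0) + event.dur);
  }
  return [...totals.entries()]
    .map(([name, micros]) => ({ name, ms: micros / 1000 }))
    .sort((a, b) => b.ms - a.ms)
    .slice(0, topN);
}

// Convenience wrapper that reads the trace from disk.
function summarizeTraceFile(path, topN) {
  return summarizeTrace(JSON.parse(fs.readFileSync(path, 'utf8')), topN);
}
```

After page.tracing.stop() has written the file, console.table(summarizeTraceFile('./files/trace.json')) gives a quick view of where time was spent.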

Case 8: uploading and downloading files

In automated testing we often need to upload and download files. How is that done in Puppeteer?

const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    // Set the download path through a CDP session
    const cdp = await page.target().createCDPSession();
    await cdp.send('Page.setDownloadBehavior', {
        behavior: 'allow', // allow all download requests
        downloadPath: 'path/to/download' // set the download path
    });
    // Click the button to trigger the download
    await (await page.waitForSelector('#someButton')).click();
    // Poll until the file appears. waitForFile is a helper you must define yourself; it is not part of Puppeteer
    await waitForFile('path/to/download/filename');

    // When uploading, the target must be an <input type="file"> element
    let inputElement = await page.waitForXPath('//input[@type="file"]');
    await inputElement.uploadFile('/path/to/file');
    await browser.close();
})();
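
The waitForFile call in the example is not a Puppeteer function. One way it could be written is as a simple polling loop over the file system; the option names timeout and interval below are assumptions made for this sketch.

```javascript
const fs = require('fs');

// Resolve with filePath once the file exists; reject after `timeout` ms.
// The file system is polled every `interval` ms.
function waitForFile(filePath, { timeout = 30000, interval = 200 } = {}) {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + timeout;
    const check = () => {
      if (fs.existsSync(filePath)) {
        resolve(filePath);
      } else if (Date.now() >= deadline) {
        reject(new Error(`Timed out waiting for ${filePath}`));
      } else {
        setTimeout(check, interval);
      }
    };
    check();
  });
}
```

Chrome usually writes a download to a temporary .crdownload file and renames it when the transfer completes, so waiting for the final file name tends to mean the download has finished; treat that as an observation to verify in your own environment rather than a guarantee.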

Case 9: handling a jump to a new tab

When clicking a button opens a new tab, how do we get the Page instance for that new tab? Listen for the targetcreated event on the browser, which indicates that a new page has been created.

let page = await browser.newPage();
await page.goto(url);
let btn = await page.waitForSelector('#btn');
// Before clicking the button, define a promise that resolves to the page object of the new tab
const newPagePromise = new Promise(res =>
    browser.once('targetcreated',
        target => res(target.page())
    )
);
await btn.click();
// After clicking the button, wait for the new tab's page object
let newPage = await newPagePromise;
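
A bare promise like this waits forever if the click never opens a tab, and it also resolves for targets that are not pages. A slightly more defensive variant is sketched below; waitForNewPage is a hypothetical helper, and it assumes browser exposes on and off (as Node-style event emitters do) and that targets provide type() and page().

```javascript
// Resolve with the Page of the next 'page' target created in the browser,
// or reject if none appears within `timeout` ms.
function waitForNewPage(browser, timeout = 10000) {
  return new Promise((resolve, reject) => {
    const onTarget = (target) => {
      if (target.type() !== 'page') {
        return; // ignore workers and other non-tab targets
      }
      clearTimeout(timer);
      browser.off('targetcreated', onTarget);
      target.page().then(resolve, reject);
    };
    const timer = setTimeout(() => {
      browser.off('targetcreated', onTarget);
      reject(new Error('No new tab was opened in time'));
    }, timeout);
    browser.on('targetcreated', onTarget);
  });
}
```

As in the original snippet, create the promise before clicking: const pending = waitForNewPage(browser); await btn.click(); const newPage = await pending;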

Case 10: simulating different devices

Puppeteer can simulate different devices. The puppeteer.devices object defines configuration for many devices, mainly the viewport and userAgent, and the page.emulate function applies one of them to a page.

const puppeteer = require('puppeteer');
const iPhone = puppeteer.devices['iPhone 6'];
puppeteer.launch().then(async browser => {
  const page = await browser.newPage();
  await page.emulate(iPhone);
  await page.goto('https://www.baidu.com');
  await browser.close();
});
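
A device entry is just an object holding a userAgent string and a viewport, so a custom one can be written by hand and passed to page.emulate. The descriptor below is entirely made up for illustration (name, user agent string and sizes are assumptions), and toLandscape is a hypothetical helper.

```javascript
// A hand-written device descriptor with the same shape as the built-in ones.
const customPhone = {
  name: 'Custom Phone (hypothetical)',
  userAgent:
    'Mozilla/5.0 (Linux; Android 10) AppleWebKit/537.36 (KHTML, like Gecko) ' +
    'Chrome/88.0.0.0 Mobile Safari/537.36',
  viewport: {
    width: 360,
    height: 640,
    deviceScaleFactor: 3,
    isMobile: true,
    hasTouch: true,
    isLandscape: false,
  },
};

// Derive a landscape variant without mutating the original descriptor.
function toLandscape(device) {
  const { width, height } = device.viewport;
  return {
    ...device,
    name: `${device.name} landscape`,
    viewport: { ...device.viewport, width: height, height: width, isLandscape: true },
  };
}
```

It would be used the same way as a built-in device: await page.emulate(toLandscape(customPhone));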

Performance and optimization

  • About shared memory:
Chrome uses /dev/shm shared memory by default, but Docker only allocates 64MB to it by default, which is not enough. There are two ways to solve this problem:
- When starting Docker, add the parameter --shm-size=1gb to increase the /dev/shm shared memory. Note that swarm does not currently support the shm-size parameter.
- When starting Chrome, add the parameter --disable-dev-shm-usage to stop it from using /dev/shm shared memory.
  • Reuse the same browser instance as much as possible so that the cache can be shared.
  • Use request interception to block unnecessary resources.
  • Just as Chrome becomes sluggish when we open too many tabs, too many pages will slow Puppeteer down, so the number of open tabs must be kept under control.
  • A Chrome instance that runs for a long time will inevitably leak memory and suffer page crashes, so restart the Chrome instance regularly.
  • To improve performance, turn off unnecessary features with flags such as --no-sandbox (disables the sandbox) and --disable-extensions (disables extensions).
  • Avoid page.waitFor(1000) where possible; it is better to let the program decide when to continue by waiting for a concrete condition.
  • Because a WebSocket is used to connect to the Chrome instance, there will be a WebSocket sticky-session problem.
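
Keeping the number of open tabs under control can be done with a small concurrency limiter. The sketch below is one simple approach, not a Puppeteer feature; runWithLimit is a hypothetical helper that works with any async worker function.

```javascript
// Run worker(item, index) for every item, with at most `limit` calls
// in flight at once. Results are returned in the order of `items`.
async function runWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => runner());
  await Promise.all(runners);
  return results;
}

// Usage sketch (assumes `browser` is a launched Puppeteer Browser
// and `urls` is an array of strings):
// const titles = await runWithLimit(urls, 3, async (url) => {
//   const page = await browser.newPage();
//   try {
//     await page.goto(url);
//     return await page.title();
//   } finally {
//     await page.close(); // always release the tab
//   }
// });
```

With a limit of 3, no more than three tabs are open at any moment, and each tab is closed even when its navigation throws.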