How to raise the cumulative listened-song count on NetEase Cloud Music with Java + Selenium

Time:2019-12-9

Background

Sometime last year, while browsing Zhihu, I came across a question about how to inflate the cumulative number of songs listened to on a personal NetEase Cloud Music account. A highly upvoted answer posted a snippet of JS code to run directly in the browser console. I tried it and added tens of thousands of plays in one go. Unfortunately, the count reverted to its original value the next day, apparently because NetEase Cloud Music had detected and blocked it. NetEase has also since capped the cumulative count at a maximum of 300 songs per day. This article shows a way to play songs automatically with Java + Selenium in order to raise the cumulative count. Building this demo also made me more familiar with Selenium, which is one of the more interesting tools in crawler applications.

Approach

Login: there are two ways to log in.

A. Simulate the web login flow. Advantage: this approach is more general and makes it easy to switch accounts dynamically. Disadvantage: it is somewhat more work than using cookies directly, and there is some chance that an image CAPTCHA will appear, which has to be handled.

B. Set cookies. Advantages: there is no login flow to deal with, so it is simpler, and it is especially convenient when the cookie stays valid for a long time and accounts do not need to be switched often. Disadvantage: switching accounts is cumbersome and cannot be automated. This is the method I chose here.
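As a minimal sketch of option B, the raw cookie string copied from the browser can be split into name/value pairs before they are handed to the driver. The class name RawCookieParser and its parse method are hypothetical helpers for illustration, not part of the original project, and the sample cookie string is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original project.
final class RawCookieParser {

    // Splits a raw "name1=value1; name2=value2" cookie string into name/value pairs.
    static Map<String, String> parse(String rawCookies) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String pair : rawCookies.split(";")) {
            String trimmed = pair.trim();
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                continue; // skip empty fragments and pairs without a name
            }
            // Split on the first '=' only, so values that contain '=' stay intact.
            cookies.put(trimmed.substring(0, eq), trimmed.substring(eq + 1));
        }
        return cookies;
    }

    public static void main(String[] args) {
        // Made-up sample input in the same shape as a copied Cookie header.
        Map<String, String> cookies = parse("cookie1=value1; cookie2=a=b");
        cookies.forEach((name, value) -> System.out.println(name + " -> " + value));
    }
}
```

Each resulting pair could then be turned into a Selenium cookie with new Cookie.Builder(name, value).domain(".163.com").build(), as the complete code does.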

Play: once the login from the previous step succeeds, open the playlist page directly. As the screenshot of the playlist page shows, there are three places that can be clicked to start playback. My first thought was the play button in the bottom player bar, since keeping the bottom player visible would also let me read the live playback status. However, simulated clicks on that button never succeeded, so in the end I click the play button at the top of the playlist to start playback.

Get playback status: to confirm that playback is progressing normally, the cumulative listening count shown on the personal home page can be polled in real time. Because one page is already playing songs, the personal home page is loaded in a new tab so that playback on the original page is not disturbed. The new tab is opened with JS: window.open('about:blank')
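The monitoring step comes down to pulling a few fields out of the page source with regular expressions. The sketch below reuses the two patterns from the complete code in this article; the class name PlayProgressParser and the sample HTML are assumptions made up for illustration, shaped only to match those patterns, and the real page markup may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original project.
final class PlayProgressParser {
    // Elapsed time inside <em>, followed by " / total" in the same span.
    private static final Pattern TIME =
            Pattern.compile("<span class=\"j-flag time\"><em>(.*?)</em>(.*?)</span>");
    // Song title shown in the player bar.
    private static final Pattern SONG_NAME =
            Pattern.compile("class=\"f-thide name fc1 f-fl\" title=\"(.*?)\"");

    // Returns "song-elapsed / total", or null when either pattern does not match.
    static String describe(String pageSource) {
        Matcher time = TIME.matcher(pageSource);
        Matcher name = SONG_NAME.matcher(pageSource);
        if (!time.find() || !name.find()) {
            return null;
        }
        return name.group(1) + "-" + time.group(1) + time.group(2);
    }

    public static void main(String[] args) {
        // Made-up fragment standing in for driver.getPageSource().
        String sample = "<a class=\"f-thide name fc1 f-fl\" title=\"Yili Riverside\">...</a>"
                + "<span class=\"j-flag time\"><em>01:00</em> / 07:19</span>";
        System.out.println(describe(sample)); // prints "Yili Riverside-01:00 / 07:19"
    }
}
```

Returning null on a failed match keeps the caller free to log a parse failure and keep polling, rather than stopping playback monitoring.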

When everything works, logs in the following format appear, which indicates success:

2019-03-26 09:25:10,406 INFO [, main] - [com.github.wycm.Music163] - Yili Riverside - 00:00 / 00:00 --- currently playing song 1, cumulative songs listened: 20572
2019-03-26 09:25:16,817 INFO [, main] - [com.github.wycm.Music163] - Yili Riverside - 01:00 / 07:19 --- currently playing song 1, cumulative songs listened: 20572
2019-03-26 09:25:23,157 INFO [, main] - [com.github.wycm.Music163] - Yili Riverside - 01:06 / 07:19 --- currently playing song 1, cumulative songs listened: 20572
2019-03-26 09:25:29,394 INFO [, main] - [com.github.wycm.Music163] - Yili Riverside - 01:13 / 07:19 --- currently playing song 1, cumulative songs listened: 20572
2019-03-26 09:25:35,592 INFO [, main] - [com.github.wycm.Music163] - Yili Riverside - 01:19 / 07:19 --- currently playing song 1, cumulative songs listened: 20572
2019-03-26 09:25:41,974 INFO [, main] - [com.github.wycm.Music163] - Yili Riverside - 01:25 / 07:19 --- currently playing song 1, cumulative songs listened: 20572

Complete code

package com.github.wycm;

import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.util.*;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Created by wycm
 */
public class Music163 {
 private static Logger logger = LoggerFactory.getLogger(Music163.class);

 // Raw cookie string copied from a browser session that is already logged in
 private final static String RAW_COOKIES = "cookie1=value1; cookie2=value2";
 private final static String CHROME_DRIVER_PATH = "/Users/wangyang/Downloads/chromedriver";
 //Song list ID
 private static String startId = "22336453";
 
 
 private static String userId = null;
 private static Set<String> playListSet = new HashSet<>();
 private static Pattern pattern = Pattern.compile("<span class=\"j-flag time\"><em>(.*?)</em>(.*?)</span>");
 private static Pattern songName = Pattern.compile("class=\"f-thide name fc1 f-fl\" title=\"(.*?)\"");
 private static ChromeOptions chromeOptions = new ChromeOptions();
 private static WebDriver driver = null;
 static {
  System.setProperty("webdriver.chrome.driver", CHROME_DRIVER_PATH);
  chromeOptions.addArguments("--no-sandbox");
 }
 public static void main(String[] args) throws InterruptedException {
  while (true){
   try {
    driver = new ChromeDriver(chromeOptions);
    playListSet.add(startId);
    invoke();
   } catch (Exception e){
    logger.error(e.getMessage(), e);
   } finally {
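    // driver is still null if the ChromeDriver constructor threw on the first attempt
    if (driver != null) {
     driver.quit();
    }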
   }
   Thread.sleep(1000 * 10);
  }
 }

 /**
  *Initialize cookies
  */
 private static void initCookies(){
  Arrays.stream(RAW_COOKIES.split("; ")).forEach(rawCookie -> {
   // Split on the first '=' only, so cookie values that contain '=' stay intact
   String[] ss = rawCookie.split("=", 2);
   Cookie cookie = new Cookie.Builder(ss[0], ss[1]).domain(".163.com").build();
   driver.manage().addCookie(cookie);
  });
 }
 private static void invoke() throws InterruptedException {
  driver.manage().timeouts().implicitlyWait(5, TimeUnit.SECONDS);
  driver.manage().timeouts().pageLoadTimeout(15, TimeUnit.SECONDS);
  String s = null;
  driver.get("http://music.163.com/");
  initCookies();
  driver.get("http://music.163.com/");
  s = driver.getPageSource();
  userId = group(s, "userId:(\\d+)", 1);
  driver.get("https://music.163.com/#/playlist?id=" + startId);
  driver.switchTo().frame("contentFrame");
  WebElement element = driver.findElement(By.cssSelector("[id=content-operation]>a:first-child"));
  element.click();
  ((JavascriptExecutor) driver).executeScript("window.open('about:blank')");
  ArrayList<String> tabs = new ArrayList<String>(driver.getWindowHandles());
  driver.switchTo().window(tabs.get(0));
  driver.switchTo().defaultContent();
  int i = 0;
  String lastSongName = "";
  int count = 0;
  while (true){
   if(i > Integer.MAX_VALUE - 2){
    break;
   }
   i++;
   s = driver.getPageSource();
   driver.switchTo().window(tabs.get(1)); //switches to new tab
   String songs = null;
   try{
    driver.get("https://music.163.com/user/home?id=" + userId);
    driver.switchTo().frame("contentFrame");
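    // NOTE: the label in this pattern was machine-translated; it must match the text
    // that actually appears next to the cumulative count on the profile page.
    songs = group(driver.getPageSource(), "cumulative songs (\\d+)", 1);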
   } catch (TimeoutException e){
    logger.error(e.getMessage(), e);
   }
   driver.switchTo().window(tabs.get(0));
   Matcher matcher = pattern.matcher(s);
   Matcher songNameMatcher = songName.matcher(s);
   if (matcher.find() && songNameMatcher.find()){
    String songNameStr = songNameMatcher.group(1);
    if (!songNameStr.equals(lastSongName)){
     count++;
     lastSongName = songNameStr;
    }
    logger.info(songNameStr + "-" + matcher.group(1) + matcher.group(2) + "---currently playing song " + count + ", cumulative songs listened: " + songs);
   } else {
    logger.info("failed to parse the playback progress or the song name");
   }
   Thread.sleep(1000 * 30);
  }
 }
 public static String group(String str, String regex, int index) {
  Pattern pattern = Pattern.compile(regex);
  Matcher matcher = pattern.matcher(str);
  return matcher.find() ? matcher.group(index) : "";
 }
}

Notes on running the code

  • Set CHROME_DRIVER_PATH to the location of your own chromedriver binary.
  • Log in to the web version of NetEase Cloud Music with your own account: https://music.163.com/
  • Copy the raw cookies from your logged-in session into the RAW_COOKIES field in the code.
  • Switch playlists. Once the default playlist has been played through, search for a playlist with songs you have not played yet, with a URL like https://music.163.com/#/playlist?id=22336453, extract the ID, and replace the startId field in the code with it.
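Extracting the ID from a playlist URL can also be done in code rather than by hand. The following is a small sketch under the assumption that the ID always appears as a numeric id query parameter, as in the example URL above; the class name PlaylistIdExtractor is hypothetical and not part of the original project.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original project.
final class PlaylistIdExtractor {
    // Matches a numeric "id" query parameter, e.g. "...playlist?id=22336453".
    private static final Pattern ID_PARAM = Pattern.compile("[?&]id=(\\d+)");

    // Returns the playlist ID, or an empty string when the URL has no id parameter.
    static String extract(String playlistUrl) {
        Matcher m = ID_PARAM.matcher(playlistUrl);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(extract("https://music.163.com/#/playlist?id=22336453")); // prints 22336453
    }
}
```

The returned value is what would be assigned to startId in the complete code.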

Summary

  • You may want to put this task on your own server and run it in the background. That comes down to setting up a Selenium runtime environment on the server; please refer to my previous article. The lowest-spec servers from Alibaba Cloud and Tencent Cloud are enough to run it.
  • Why use Selenium here? Is there a simpler way to get the same effect with plain HTTP requests? I did try to find the request that increments the listening count so that it could be sent as a plain HTTP request, but NetEase Cloud Music's requests are encrypted and I could not find it in the end. That is why Selenium is used instead.

Finally

See https://github.com/wycm/crawler-set/tree/master/music163 for the complete project code

Copyright notice
By wycm
Source: https://my.oschina.net/wycm/blog/3023967
