Predicting Professional Chess Players' Moves with Deep Learning

Time: 2019-3-22

Abstract: I'm sure many of you can play chess, but have you ever tried to build a chess engine? Let's dive in!


I’m not good at chess.

My father taught me when I was young, but I guess he was one of those fathers who always let their children win. To compensate for my lack of skill in one of the world's most popular games, I did what any data science enthusiast would do: build an artificial intelligence to beat the people I couldn't. Unfortunately, it's nowhere near as good as AlphaZero (or even an average player). But I wanted to see how a chess engine would do without reinforcement learning, and to learn how to deploy a deep learning model to the web.

Here’s the game!

Getting the Data

FICS hosts a database of 300 million games, including the individual moves, results, and ratings of the players involved. I downloaded all games from 2012 in which at least one player was rated above 2000 ELO. That came to roughly 97,000 games comprising 7.3 million positions. The outcome distribution was: 43,000 white wins, 40,000 black wins, and 14,000 draws.
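FICS serves its games as PGN files. As a small sketch (the filename is illustrative), extracting (position, move) training pairs with python-chess might look like this:

```python
import chess.pgn

def positions_from_pgn(path):
    """Yield (board_before_move, move) pairs from every game in a PGN file."""
    with open(path) as pgn:
        while True:
            game = chess.pgn.read_game(pgn)
            if game is None:          # end of file
                break
            board = game.board()
            for move in game.mainline_moves():
                yield board.copy(), move
                board.push(move)

# Usage (hypothetical filename):
# samples = list(positions_from_pgn("ficsgames_2012.pgn"))
```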

The Minimax Algorithm

To understand how to build a deep learning chess AI, I first had to understand how a traditional chess AI works. Enter the minimax algorithm. Minimax is short for "minimizing the maximum loss," a game theory concept for deciding how a zero-sum game should be played.

Minimax is used in two-player games, where one player is the maximizer and the other is the minimizer. A bot using this algorithm assumes it is the maximizer and its opponent is the minimizer. The algorithm also requires a board evaluation function to measure who is winning: a number between −∞ and +∞. The maximizer wants to maximize this value, while the minimizer wants to minimize it. This means that when you, the maximizer, have two choices, you pick the one with the higher evaluation, while the minimizer picks the opposite. The algorithm assumes both players play optimally and that neither makes a mistake.

[Figure: animated minimax game-tree example]

Take the GIF above as an example. You, the maximizer (the circles), have three choices (starting at the top). Which move you pick depends on which move your opponent (the squares) would pick in reply. But which move your opponent picks depends on which move you would pick after that, and so on, until the game ends. Playing every line out to the end of the game takes enormous computing resources and time, so in the example above we search to a fixed depth of 2. If the minimizer (the leftmost square) moves left, you can choose between 1 and −1. You choose 1 because it gives you the higher score. If the minimizer moves right, you choose 0 because it is higher. Now it's the minimizer's turn: they choose 0 because it is lower. This continues until every move has been evaluated or your thinking time is exhausted. For my chess engine, White is the maximizer and Black is the minimizer. If the engine is playing White, the algorithm decides which branch yields the highest score, assuming the opponent picks the lowest-scoring reply at every turn, and vice versa. For better performance, the algorithm can be combined with another algorithm: alpha-beta pruning, a cutoff system for deciding whether the next branch is worth searching at all.
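To make this concrete, here is a minimal sketch of minimax with alpha-beta pruning over python-chess, using a placeholder material evaluation (the function names and piece values here are illustrative, not the engine's actual code):

```python
import chess

# Illustrative material values; the engine's real evaluation is described later.
PIECE_VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
                chess.ROOK: 5, chess.QUEEN: 9}

def evaluate(board: chess.Board) -> float:
    """Positive favors White (the maximizer), negative favors Black."""
    score = 0
    for piece_type, value in PIECE_VALUES.items():
        score += value * len(board.pieces(piece_type, chess.WHITE))
        score -= value * len(board.pieces(piece_type, chess.BLACK))
    return score

def minimax(board: chess.Board, depth: int, alpha: float, beta: float,
            maximizing: bool) -> float:
    if depth == 0 or board.is_game_over():
        return evaluate(board)
    if maximizing:                      # White picks the highest-valued branch
        best = float("-inf")
        for move in board.legal_moves:
            board.push(move)
            best = max(best, minimax(board, depth - 1, alpha, beta, False))
            board.pop()
            alpha = max(alpha, best)
            if alpha >= beta:           # prune: the minimizer avoids this branch
                break
        return best
    else:                               # Black picks the lowest-valued branch
        best = float("inf")
        for move in board.legal_moves:
            board.push(move)
            best = min(best, minimax(board, depth - 1, alpha, beta, True))
            board.pop()
            beta = min(beta, best)
            if alpha >= beta:
                break
        return best

# Usage: score the starting position at depth 2
print(minimax(chess.Board(), 2, float("-inf"), float("inf"), True))
```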

Deep Learning Architecture

My research began with Erik Bernhardsson's excellent article on deep learning for chess. He described how a traditional chess AI is built and how he converted it to use a neural network as the engine.

The first step is to convert the board into a numerical form for the input layer. I borrowed Erik Bernhardsson's encoding strategy, in which the board is one-hot encoded, with one slot for every piece type on every square. That gives an array of 768 elements (8 × 8 × 12, since there are 12 piece types counting both colors).
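Here is a minimal sketch of such an encoding with python-chess and NumPy (the plane ordering is my assumption; any consistent layout works):

```python
import chess
import numpy as np

def encode_board(board: chess.Board) -> np.ndarray:
    """One-hot encode a position into 8 x 8 x 12 = 768 elements.

    Planes 0-5: White pawn/knight/bishop/rook/queen/king;
    planes 6-11: the same piece types for Black.
    """
    planes = np.zeros((12, 64), dtype=np.float32)
    for square, piece in board.piece_map().items():
        offset = 0 if piece.color == chess.WHITE else 6
        planes[piece.piece_type - 1 + offset, square] = 1.0
    return planes.flatten()

print(encode_board(chess.Board()).shape)  # (768,)
```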


Bernhardsson set the output layer to 1 if White won the game, −1 if Black won, and 0 for a draw. He reasoned that every board position in a game is correlated with the final result: if Black won, every position in that game is trained to "favor Black," and if White won, to "favor White." The network then returns a value between −1 and 1 that tells you whether a position is more likely to lead to a white or black win.

I wanted to tackle the problem with a slightly different evaluation function: could the network predict the move a strong player would make, rather than which side would win? My first attempt was to use the 768-element board representation for both layers: one position as the input and the position after the move as the output. Of course this was useless, because it turns the task into a multi-label classification problem. With all 768 output elements free to be 0 or 1, the engine made far too many errors to reliably pick a legal move. So I consulted Barak Oshri and Nishith Khandwala's Stanford paper, "Predicting Moves in Chess Using Convolutional Neural Networks," to see how they solved this problem. They trained seven neural networks, the first of which is a piece-selector network that decides which square holds the piece most likely to move. The other six networks are dedicated to the individual piece types and decide where to move a particular piece. If the piece selector picks a square containing a pawn, only the pawn network is consulted for the most likely destination square.

I borrowed their idea but used two convolutional neural networks. The first, the move-from network, is trained to take the 768-element representation and output the square a professional player would move from (a class between 0 and 63). The second, the move-to network, does the same thing, except its output layer is the square the professional would move to. I ignored who won, reasoning that every move in the training data is relatively strong regardless of the final result.

The architecture I chose was two convolutional layers of 128 filters each with 2×2 kernels, followed by two fully connected layers of 1,024 neurons. I did not apply any pooling, because pooling buys location invariance: a cat in the top-left of an image is still a cat in the bottom-right. In chess, though, a king on a given square means something completely different from a pawn on that square, so positional information must be preserved. The hidden layers use ReLU activations, and I applied softmax to the last layer, so I essentially get a probability distribution in which the probabilities of all the squares add up to 100%.
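A sketch of that architecture in Keras follows; the layer sizes match the text, while the padding, optimizer, and loss are my assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_move_network() -> tf.keras.Model:
    """Two 2x2 conv layers (128 filters each), two 1024-neuron dense
    layers, and a softmax over the 64 squares."""
    model = models.Sequential([
        layers.Reshape((8, 8, 12), input_shape=(768,)),
        layers.Conv2D(128, (2, 2), activation="relu", padding="same"),
        layers.Conv2D(128, (2, 2), activation="relu", padding="same"),
        layers.Flatten(),                        # no pooling: squares must stay distinguishable
        layers.Dense(1024, activation="relu"),
        layers.Dense(1024, activation="relu"),
        layers.Dense(64, activation="softmax"),  # one probability per square
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

move_from_net = build_move_network()  # predicts the origin square
move_to_net = build_move_network()    # predicts the destination square
```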


My training data was split into 6 million positions for the training set, with the remaining 1.3 million positions used for validation. After training, the move-from network reached 34.8% validation accuracy and the move-to network 27.7%. That doesn't mean the engine fails to play a legal move 70% of the time; it just means the AI doesn't pick the same square a professional player did in the validation data. For comparison, Oshri and Khandwala's networks averaged 37% validation accuracy.
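Putting the pieces together, training might look roughly like this, reusing encode_board and build_move_network from the sketches above (the split and hyperparameters are my assumptions):

```python
import numpy as np

# (board, move) pairs from the PGN sketch earlier
samples = list(positions_from_pgn("ficsgames_2012.pgn"))

X = np.stack([encode_board(board) for board, _ in samples])    # (N, 768)
y_from = np.array([move.from_square for _, move in samples])   # labels 0-63

model = build_move_network()
model.fit(X, y_from, validation_split=0.18,  # ~1.3M of 7.3M positions
          epochs=10, batch_size=256)
```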

Combining Deep Learning with Minimax

This is now a classification problem whose output is one of 64 classes, which leaves plenty of room for error. One caveat about training data drawn from high-rated players is that good players rarely play on to checkmate: they know when they have lost, and there's usually no need to play out the whole game. This imbalance leaves the network confused in the endgame. It would pick a rook and try to move it diagonally; failing that, the network might even try to command its opponent's pieces (brazen!).

To solve this, I rank the outputs by probability. Then I use the python-chess library to get the list of all legal moves in a given position and select the legal move with the highest combined probability. Finally, I apply a prediction score that penalizes unlikely moves: 400 − (the sum of the chosen move's indices in the two ranked lists). The further down the ranked lists a legal move sits, the lower its prediction score. For example, if combining the first index (index 0) of the move-from network with the first index of the move-to network yields a legal move, its prediction score is 400 − (0 + 0) = 400, the highest possible score.
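Here is a sketch of that selection step, assuming from_probs and to_probs are the two networks' 64-element softmax outputs (the function name is mine):

```python
import chess
import numpy as np

def pick_move(board: chess.Board, from_probs: np.ndarray,
              to_probs: np.ndarray):
    """Return the legal move with the best prediction score:
    400 - (rank of its from-square + rank of its to-square)."""
    # Rank of each square in the sorted outputs: 0 = most probable
    from_rank = np.argsort(np.argsort(-from_probs))
    to_rank = np.argsort(np.argsort(-to_probs))
    best_move, best_score = None, float("-inf")
    for move in board.legal_moves:
        score = 400 - (from_rank[move.from_square] + to_rank[move.to_square])
        if score > best_score:
            best_move, best_score = move, score
    return best_move, best_score
```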


I settled on 400 as the maximum prediction score after experimenting with how it interacts with the material score. The material score is a number that rewards moves that capture pieces: a move's overall score is boosted according to the piece it captures. The material values I chose are as follows:

Pawn: 10, Knight: 500, Bishop: 500, Rook: 900, Queen: 5,000, King: 50,000.

This helps especially in the endgame. When capturing the king would otherwise be only the second most likely legal move, with a low prediction score, the king's material value overrides it. Pawns score so low because the network already considers them adequately in the early game, so it will still push a pawn when it is the strategic move.

I then combine these scores to produce an evaluation of the board for any candidate move. I fed this into a depth-3 minimax algorithm (with alpha-beta pruning) and got a working chess engine, ready for the kill!
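As a sketch, the combined evaluation of a candidate move might look like this (the function and the capture handling are my reconstruction, using the material values above):

```python
import chess

PIECE_VALUES = {chess.PAWN: 10, chess.KNIGHT: 500, chess.BISHOP: 500,
                chess.ROOK: 900, chess.QUEEN: 5000, chess.KING: 50000}

def evaluate_move(board: chess.Board, move: chess.Move,
                  prediction_score: float) -> float:
    """Prediction score plus a material bonus for captures."""
    material = 0
    if board.is_capture(move):
        captured = board.piece_at(move.to_square)
        # En passant leaves the target square empty, so default to a pawn
        piece_type = captured.piece_type if captured else chess.PAWN
        material = PIECE_VALUES[piece_type]
    return prediction_score + material
```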

Deployment using Flask and Heroku

I followed Bluefever Software's YouTube guide to build a JavaScript chess UI that issues AJAX requests to a Flask server, and routed my engine through it. I used Heroku to deploy the Python script to the web and connected it to my custom domain, Sayonb.com.
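A minimal sketch of such a Flask endpoint (the route, payload format, and engine_search stub are all hypothetical, not the actual server code):

```python
import chess
from flask import Flask, jsonify, request

app = Flask(__name__)

def engine_search(board: chess.Board) -> chess.Move:
    # Stub: the real engine would run the minimax + network search here
    return next(iter(board.legal_moves))

@app.route("/move", methods=["POST"])
def move():
    """The JS chess UI posts the position as FEN; we reply with a UCI move."""
    board = chess.Board(request.json["fen"])
    return jsonify({"move": engine_search(board).uci()})

if __name__ == "__main__":
    app.run()
```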

Conclusion

Although the engine doesn't perform as well as I'd hoped, I learned a lot about the basics of AI, about deploying machine learning models to the web, and about why AlphaZero doesn't just use convolutional neural networks to play games!

Improvements can be made in the following ways:

  1. Use an LSTM with a bigram model to treat the move-from and move-to outputs as a sequence. This may help the move-from and move-to decisions inform each other, as each is currently made independently.
  2. Improve the material score by accounting for where a capture happens (capturing a piece in the center of the board is more advantageous than capturing one at the edge).
  3. Use a neural network to switch between the prediction score and the material score, rather than computing both at every node. This would allow a higher search depth for the minimax algorithm.
  4. Handle positional edge cases, such as reducing the likelihood of isolating one's own pawns and increasing the likelihood of moving a knight toward the center of the board.

Check out the code, or train a new network on your own training data, in the GitHub repo!

