Statistical analysis of game log (1)

Time: 2020-06-05

Taking game settlement logs as an example, this article walks through the process of statistically analyzing logs with KSQL.

Start Confluent

cd ~/Documents/install/confluent-5.0.1/

bin/confluent start
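To confirm that all the services (ZooKeeper, Kafka, Schema Registry, the KSQL server, and so on) actually came up, the bundled Confluent CLI can report their state; a quick check, assuming the same installation directory:

bin/confluent status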

View the list of Kafka topics

bin/kafka-topics --list --zookeeper localhost:2181

Create a topic to receive the game settlement logs

bin/kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 4 --topic score-normalized
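To verify that the topic was created with the expected partition count and replication factor, it can be described (a standard kafka-topics option, added here as a sanity check):

bin/kafka-topics --describe --zookeeper localhost:2181 --topic score-normalized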

Use the producer command-line tool to write logs to the topic

bin/kafka-console-producer --broker-list localhost:9092 --topic score-normalized

> 

{"cost":7, "epoch":1512342568296,"gameId":"2017-12-04_ 07:09:28_ Zone 1_ 200_ 015_ 185175","gameType":"situan","gamers": [{"balance":4405682,"delta":-60,"username":"0791754000"}, {"balance":69532,"delta":-60,"username":"70837999"}, {"balance":972120,"delta":-60,"username":"abc6378303"}, {"balance":23129,"delta":180,"username":"a137671268"}],"reason":"xiayu"}

Use the consumer command-line tool to check whether the logs were written correctly

bin/kafka-console-consumer --bootstrap-server localhost:9092 --topic score-normalized --from-beginning

You can see:

{"cost":7, "epoch":1512342568296,"gameId":"2017-12-04_ 07:09:28_ Zone 1_ 200_ 015_ 185175","gameType":"situan","gamers": [{"balance":4405682,"delta":-60,"username":"0791754000"}, {"balance":69532,"delta":-60,"username":"70837999"}, {"balance":972120,"delta":-60,"username":"abc6378303"}, {"balance":23129,"delta":180,"username":"a137671268"}],"reason":"xiayu"}

Start the KSQL client

bin/ksql http://localhost:8088

After KSQL starts, you can see its startup banner and an interactive terminal.

View the Kafka topic list from the KSQL terminal

ksql> show topics;

Print the messages in the topic

PRINT 'score-normalized';

You can see:

Format:STRING
19-1-5 11:59:31 p.m., null, {"cost":7,"epoch":1512342568296,"gameId":"2017-12-04_07:09:28_\xE9\xAB\x98\xE6\x89\x8B1\xE5\x8C\xBA_200_015_185175","gameType":"situan","gamers":[{"balance":4405682,"delta":-60,"username":"0791754000"},{"balance":69532,"delta":-60,"username":"70837999"},{"balance":972120,"delta":-60,"username":"abc6378303"},{"balance":23129,"delta":180,"username":"a137671268"}],"reason":"xiayu"}

Among them:

  • Before the first comma, 19-1-5 11:59:31 p.m. is the message timestamp.
  • Before the second comma, null is the message key; messages pushed from kafka-console-producer have a NULL key by default (see the sketch after this list).
  • After that comes the pushed message content.
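If you want messages to carry a non-NULL key, kafka-console-producer can parse one from each input line; a minimal sketch using its standard parse.key and key.separator properties, with : as the assumed separator:

bin/kafka-console-producer --broker-list localhost:9092 --topic score-normalized --property "parse.key=true" --property "key.separator=:"

An input line such as somekey:{"cost":7,...} is then split into the key somekey and the JSON value.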

Create a stream from the topic score-normalized

CREATE STREAM SCORE_EVENT \
 (epoch BIGINT, \
  gameType VARCHAR, \
  cost INTEGER, \
  gamers ARRAY< \
              STRUCT< \
                      username VARCHAR, \
                      balance BIGINT, \
                      delta BIGINT \
                      > \
               >, \
  gameId VARCHAR, \
  tax BIGINT, \
  reason VARCHAR) \
  WITH ( KAFKA_TOPIC='score-normalized', \
         VALUE_FORMAT='JSON', \
         TIMESTAMP='epoch');

Here TIMESTAMP='epoch' means the event timestamp is taken from the epoch field of the message.
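To check that the stream picked up the schema and the event-time mapping, you can describe it and compare the ROWTIME pseudo-column with the epoch field (DESCRIBE, ROWTIME, and LIMIT are standard KSQL features; LIMIT just stops the otherwise continuous query):

ksql> DESCRIBE SCORE_EVENT;
ksql> SELECT ROWTIME, epoch FROM SCORE_EVENT LIMIT 3;

With TIMESTAMP='epoch' the two columns should match on every row.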

Delete a stream

DROP STREAM stream_name;

If a running query is still reading from or writing to the stream, an error occurs:

Cannot drop USER_SCORE_EVENT. 
The following queries read from this source: []. 
The following queries write into this source: [CSAS_USER_SCORE_EVENT_2, InsertQuery_4, InsertQuery_5, InsertQuery_3]. 
You need to terminate them before dropping USER_SCORE_EVENT.

These queries need to be stopped with the TERMINATE command before the stream can be dropped.
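The IDs of the running queries can be listed first (SHOW QUERIES is a standard KSQL statement):

ksql> SHOW QUERIES;

Then terminate them one by one: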

TERMINATE CSAS_USER_SCORE_EVENT_2;
TERMINATE InsertQuery_4;

Query from the earliest record

ksql> SET 'auto.offset.reset' = 'earliest';

Query all the data from the stream

ksql> SELECT * FROM SCORE_EVENT;

You can see:

1546702389664 | null | 1512342568296 | situan | 7 | [{USERNAME=0791754000, BALANCE=4405682, DELTA=-60}, {USERNAME=70837999, BALANCE=69532, DELTA=-60}, {USERNAME=abc6378303, BALANCE=972120, DELTA=-60}, {USERNAME=a137671268, BALANCE=23129, DELTA=180}] | 2017-12-04_07:09:28_Zone 1_200_015_185175 | null | xiayu

Among them:

  • Column 1 is the timestamp of the record.
  • Column 2 is the key of the record.
  • Columns 3 onward are the values of the message fields, in the order in which they were declared when the stream was created.
  • The second-to-last column is null because the tax field does not exist in the message.
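Note that a SELECT on a stream is a continuous query and does not terminate on its own; to take only a sample of rows you can add a LIMIT clause (a standard KSQL option):

ksql> SELECT * FROM SCORE_EVENT LIMIT 5;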

Count the total number of matches on 2017-12-04

-- add a game_date field for the statistics
CREATE STREAM SCORE_EVENT_WITH_DATE AS \
    SELECT SUBSTRING(gameId, 0, 10) AS game_date, * \
    FROM SCORE_EVENT;
    
SELECT game_date, COUNT(*) \
    FROM SCORE_EVENT_WITH_DATE \
    WHERE game_date = '2017-12-04' AND reason = 'game' \
    GROUP BY game_date;

At present, KSQL does not support queries like the following:

SELECT COUNT(*) \
  FROM SCORE_EVENT \
  WHERE gameId LIKE '2017-12-04_%';
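Besides materializing a game_date column as above, one possible workaround is to filter on the same SUBSTRING expression directly in the WHERE clause (a sketch; scalar functions in WHERE are supported even though LIKE is not):

SELECT gameId \
  FROM SCORE_EVENT \
  WHERE SUBSTRING(gameId, 0, 10) = '2017-12-04';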

Count the total number of players participating in matches (deduplicated)

Because one log record contains the game information of multiple players, the idea is to split each record into separate per-player events:

  • Merge each player's events into one unified stream USER_SCORE_EVENT:
CREATE STREAM USER_SCORE_EVENT AS \
    SELECT epoch, gameType, cost, gameId, tax, reason, gamers[0]->username AS username, gamers[0]->balance AS balance, gamers[0]->delta AS delta \
    FROM SCORE_EVENT;
    
INSERT INTO USER_SCORE_EVENT \
    SELECT epoch, gameType, cost, gameId, tax, reason, gamers[1]->username AS username, gamers[1]->balance AS balance, gamers[1]->delta AS delta \
    FROM SCORE_EVENT;
    
INSERT INTO USER_SCORE_EVENT \
    SELECT epoch, gameType, cost, gameId, tax, reason, gamers[2]->username AS username, gamers[2]->balance AS balance, gamers[2]->delta AS delta \
    FROM SCORE_EVENT;
    
INSERT INTO USER_SCORE_EVENT \
    SELECT epoch, gameType, cost, gameId, tax, reason, gamers[3]->username AS username, gamers[3]->balance AS balance, gamers[3]->delta AS delta \
    FROM SCORE_EVENT;
  • To JOIN on the player name username later, you need to re-key the stream:
CREATE STREAM USER_SCORE_EVENT_REKEY AS \
SELECT * FROM USER_SCORE_EVENT \
PARTITION BY username;

Output:

ksql> SELECT * FROM USER_SCORE_EVENT_REKEY;


4000 | lzc | 4000 | situan | 7 | 2017-12-04_07:09:28_Zone 2_500_015_185175 | null | game | lzc | 972120 | -60
4000 | lzb | 4000 | situan | 7 | 2017-12-04_07:09:28_Zone 2_500_015_185175 | null | game | lzb | 69532 | -60

Note:

In practice, it was found that PARTITION BY only takes effect on an existing field of the stream.
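The workaround is the pattern already used above: first materialize the expression as a real field with CREATE STREAM ... AS SELECT, then PARTITION BY that field. A sketch with a hypothetical lower-cased key (LCASE is a standard KSQL function; the stream names here are made up):

CREATE STREAM USER_SCORE_EVENT_LOWER AS \
    SELECT LCASE(username) AS username_lower, * \
    FROM USER_SCORE_EVENT;

CREATE STREAM USER_SCORE_EVENT_LOWER_REKEY AS \
    SELECT * FROM USER_SCORE_EVENT_LOWER \
    PARTITION BY username_lower;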

  • Create a table USER_SCORE_TABLE that aggregates, per player, the total number of games played, the total win/loss delta, and the total tax contributed:
CREATE TABLE USER_SCORE_TABLE AS \
    SELECT username, COUNT(*) AS game_count, SUM(delta) AS delta_sum, SUM(tax) AS tax_sum \
    FROM USER_SCORE_EVENT_REKEY \
    WHERE reason = 'game' \
    GROUP BY username;

View all the data in USER_SCORE_TABLE:

ksql> SELECT * FROM USER_SCORE_TABLE;
1546709338711 | 70837999 | 70837999 | 4 | -240 | 0
1546709352758 | 0791754000 | 0791754000 | 4 | -240 | 0
1546709338711 | a137671268 | a137671268 | 4 | 720 | 0
1546709352758 | abc6378303 | abc6378303 | 4 | -240 | 0
  • Query the number of matches, the total win/loss delta, and the total tax contributed for a single player:
ksql> SELECT * FROM USER_SCORE_TABLE WHERE username = '70837999';

Output:

1546709338711 | 70837999 | 70837999 | 4 | -240 | 0

Count the total number of players (deduplicated)

  • Add a dummy column tag for the statistics (KSQL aggregations need a GROUP BY key, so giving every row the same constant tag allows counting across the whole table):
CREATE TABLE USER_SCORE_WITH_TAG AS \
    SELECT 1 AS tag, * FROM USER_SCORE_TABLE;
  • Count the total number of distinct players:
SELECT tag, COUNT(username) \
FROM USER_SCORE_WITH_TAG \
GROUP BY tag;

To be continued

KSQL window functions.
