Building a 100-million-store search system based on Tablestore search index

Time:2020-12-3

1、 Solution background

For a geo management system, both the core and the bottleneck lie in the storage performance and query capability of the database. On the one hand, the storage service must write and read massive amounts of data with low latency; on the other hand, it must provide efficient retrieval that combines geo conditions with other dimensions. Tablestore, a serverless distributed NoSQL database, meets both requirements.
Next, we will build a [100-million-record geo management system] based on Tablestore.

Requirement scenario

A store search platform holds store information on the order of 100 million records. Through the PC and mobile web pages the platform provides, users can search for the shops they want by combining whichever dimensions matter to them. The platform needs to show each store's exact location on the map, its details, and a link to the store homepage.
Combination 1: [within 1 km] [per capita consumption within 100] [highest score] [milk tea shop];
Combination 2: shops in Hangzhou, [highest score], [Shen family];
……
The core function of a geo management solution is fast, multi-dimensional geo query. Examples are as follows:
Note: the sample provides 100 million store records. Sample address in the official website console: Project sample

The store search system pages built on Tablestore are shown above. The sample is embedded in the Tablestore console, and users can log in to the console to try the system. (New Tablestore users must activate the service first, which is free of charge. The sample data is stored in a public instance, so trying it does not consume the user's storage, traffic or CU.)

Tablestore solution

With the search index (SearchIndex) feature developed by Tablestore, a store search system at the 100-million-record scale is easy to build. A search index can contain geo fields, tokenized string fields and other field types, which gives users geo retrieval and multi-dimensional combined retrieval. An index can be created at any time; existing and incremental data are synchronized to it automatically.

Tablestore is a fully managed distributed NoSQL data storage service from Alibaba Cloud that requires no operation and maintenance. It offers [massive data storage], [automatic splitting of hot data], [multi-dimensional retrieval over massive data] and more, which addresses the challenge of rapidly growing geo data.

Users only need to create and enable an index when they need it. Tablestore guarantees the consistency of data synchronization, which greatly reduces the user's workload in solution design, service operation and maintenance, and code development.

2、 Preparation

If you have tried the [100-million-store search system] built on Tablestore and want to build your own, follow the steps below:

1. Activate Tablestore

Activate the Tablestore service in the console; it is available as soon as it is activated (postpaid). Billing is pay-as-you-go, and the free quota is large enough for functional testing. See: Tablestore official website console, free quota description.

2. Create instance

Create a Tablestore instance in the console and select a region that supports search index. (At this stage, the SearchIndex feature has not been commercialized. It is available in Beijing, Shanghai, Hangzhou and Shenzhen for now, and the remaining regions will be opened gradually.)

After the instance is created, submit a ticket to apply for the search index beta test. (The search index feature has since been commercialized, so no application is required.)

3. Download the SDK

Use an SDK that supports search index (SearchIndex); see the official website for the address. For now, the Java, Go and Node.js SDKs have added the new feature.

Java SDK

<dependency>
    <groupId>com.aliyun.openservices</groupId>
    <artifactId>tablestore</artifactId>
    <version>4.8.0</version>
</dependency>

Go SDK

$ go get github.com/aliyun/aliyun-tablestore-go-sdk

Node.js SDK

$ npm install tablestore

4. Table design

The store search sample uses a single store table. Its main fields are: store type, store name, store location, average store score, per capita consumption, and so on. The table is designed as follows:
Table name: Geo_ positon
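
As a rough illustration of the fields listed above, one row of the store table could be modeled as below. This is only a sketch: the column names "type" and "pos" come from the query code later in this article, while the class name, the "id" primary key and the remaining field names are assumptions made for the example. The location is kept as a "lat,lon" string, the same shape the article later passes to setCenterPoint.

```java
// Hypothetical row model for the store table. Only "type" and "pos" are
// column names taken from this article's query code; the rest are assumed.
final class StoreRow {
    final String id;        // assumed primary key
    final String type;      // store type, e.g. "milk tea"
    final String name;      // store name
    final double lat;       // latitude of the store location
    final double lon;       // longitude of the store location
    final double score;     // average score
    final int perCapita;    // per capita consumption

    StoreRow(String id, String type, String name,
             double lat, double lon, double score, int perCapita) {
        this.id = id;
        this.type = type;
        this.name = name;
        this.lat = lat;
        this.lon = lon;
        this.score = score;
        this.perCapita = perCapita;
    }

    // Location rendered as a "lat,lon" string for the "pos" column.
    String pos() {
        return lat + "," + lon;
    }

    public static void main(String[] args) {
        StoreRow row = new StoreRow("store-1", "milk tea", "Sample Tea",
                36.76613, 111.41461, 4.5, 30);
        System.out.println(row.type + " @ " + row.pos());
    }
}
```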

3、 Start building (core code)

1. Create data table

Users only need to create a "store information table" in the instance that has been approved for the beta test. The data table can be created and managed in the console (it can also be created directly through the SDK).

2. Create the data table index

Tablestore automatically synchronizes both existing and incremental data to the index. Indexes can be created and managed in the console (or created through the SDK).

3. Data import

Insert test data. (The console sample holds 100 million records; users can insert a small amount of test data themselves.)
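
A small batch of test rows can be generated locally before writing them to the table. The sketch below only builds the rows in memory and does not call the Tablestore SDK. Apart from "type" and "pos", which appear in this article's query code, the column names, the store types and the value ranges are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class TestDataSketch {
    // Hypothetical store types; the sample's real categories are not listed here.
    static final String[] TYPES = {"milk tea", "hot pot", "bakery"};

    // Builds n rows scattered within +/- 0.05 degrees of a center point.
    static List<Map<String, Object>> generate(int n, double centerLat,
                                              double centerLon, long seed) {
        Random random = new Random(seed); // fixed seed keeps the data reproducible
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            double lat = centerLat + (random.nextDouble() - 0.5) * 0.1;
            double lon = centerLon + (random.nextDouble() - 0.5) * 0.1;
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("id", "store-" + i);                         // assumed primary key
            row.put("type", TYPES[random.nextInt(TYPES.length)]);
            row.put("pos", lat + "," + lon);                     // "lat,lon" string
            row.put("score", 1 + random.nextInt(5));             // assumed 1..5 scale
            row.put("per_capita", 10 + random.nextInt(191));     // assumed 10..200
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        generate(5, 36.76613, 111.41461, 42L).forEach(System.out::println);
    }
}
```

Each generated map can then be turned into one row write with whichever SDK is in use.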

4. Data reading

There are two types of data reading:

Primary key read

Primary key reads use the native Tablestore interfaces: GetRow, GetRange, BatchGetRow, and so on. They are used for the (automatic) lookup of full rows after an index query. Users can also offer a single-row lookup page keyed by the primary key (order MD5), and this lookup stays extremely fast at the 100-million-row scale. Lookups by primary key do not support multi-dimensional retrieval.
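
The text mentions an MD5 value as the primary key. A minimal sketch of deriving such a key with the standard library follows. Which business ID is hashed in the sample, and why, is not stated in this article, so the input string and the class and method names are assumptions. A common reason for hashing a key is to spread rows evenly across partitions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class PrimaryKeySketch {
    // Returns the lowercase hex MD5 digest of a business ID (32 characters).
    static String md5Hex(String businessId) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(businessId.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 must be available on every Java platform implementation.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // "store-0001" is a made-up ID used only for this example.
        System.out.println(md5Hex("store-0001"));
    }
}
```

The resulting string would be used as the primary key value of a GetRow-style lookup.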

Search index read

Queries on the new SearchIndex feature go through the Search interface. Users can freely design multi-dimensional condition combinations over the indexed fields; different query parameters produce different query conditions and sort orders. Currently supported: exact query, range query, prefix query, match query, wildcard query, phrase match query and tokenized string query, combined with Boolean AND and OR.
For example, to find [milk tea shops within 1 km of "36.76613,111.41461"], the query conditions are as follows:

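import com.alicloud.openservices.tablestore.model.ColumnValue;
import com.alicloud.openservices.tablestore.model.search.query.BoolQuery;
import com.alicloud.openservices.tablestore.model.search.query.GeoDistanceQuery;
import com.alicloud.openservices.tablestore.model.search.query.Query;
import com.alicloud.openservices.tablestore.model.search.query.TermQuery;
import java.util.ArrayList;
import java.util.List;

// Every query added to this list must match (Boolean AND).
List<Query> mustQueries = new ArrayList<Query>();

// Condition 1: the "type" field equals "milk tea" exactly.
TermQuery termQuery = new TermQuery();
termQuery.setFieldName("type");
termQuery.setTerm(ColumnValue.fromString("milk tea"));
mustQueries.add(termQuery);

// Condition 2: the "pos" geo field lies within 1000 m of the center point.
GeoDistanceQuery geoDistanceQuery = new GeoDistanceQuery();
geoDistanceQuery.setFieldName("pos");
geoDistanceQuery.setCenterPoint("36.76613,111.41461");
geoDistanceQuery.setDistanceInMeter(1000);
mustQueries.add(geoDistanceQuery);

// Combine both conditions into one Boolean query.
BoolQuery boolQuery = new BoolQuery();
boolQuery.setMustQueries(mustQueries);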
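
To make the meaning of the two combined conditions concrete, the sketch below applies the same rule to a row on the client side: the type must equal the term, and the location must lie within the radius of the center point. It uses the haversine great-circle formula with a mean Earth radius. How Tablestore computes distance internally is not stated in this article, so this illustrates the semantics only, and the class and method names are made up for the example.

```java
class GeoFilterSketch {
    static final double EARTH_RADIUS_M = 6_371_000.0; // mean Earth radius in meters

    // Great-circle distance in meters between two points (haversine formula).
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }

    // Mirrors the must-queries: type equals the term AND pos is within the radius.
    // Both pos and center are "lat,lon" strings.
    static boolean matches(String type, String pos, String term,
                           String center, double radiusMeters) {
        String[] p = pos.split(",");
        String[] c = center.split(",");
        double distance = haversineMeters(
                Double.parseDouble(c[0]), Double.parseDouble(c[1]),
                Double.parseDouble(p[0]), Double.parseDouble(p[1]));
        return type.equals(term) && distance <= radiusMeters;
    }

    public static void main(String[] args) {
        String center = "36.76613,111.41461";
        // About 556 m north of the center, right type: prints true
        System.out.println(matches("milk tea", "36.77113,111.41461", "milk tea", center, 1000));
        // About 2.2 km north of the center: prints false
        System.out.println(matches("milk tea", "36.78613,111.41461", "milk tea", center, 1000));
        // Close enough, but the wrong type: prints false
        System.out.println(matches("bakery", "36.77113,111.41461", "milk tea", center, 1000));
    }
}
```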

Author: Tan Tan

This article is original content of the Yunqi Community and may not be reproduced without permission.