Ten questions and ten answers will give you a comprehensive understanding of the core advantages of tdsql-a


In the series sharing of “domestic database hard core technology salon tdsql-a technology disclosure”, five Tencent cloud technology masters gave an in-depth interpretation of tdsql-a from the aspects of overall technical architecture, column storage and related execution optimization, cluster data interactive bus, distributed execution framework and vector execution engine.

In the last issue of this series, we sorted out the ten most concerned questions about tdsql-a. Tencent cloud technology experts will answer these questions one by one.

Tdsql-a is Tencent’s first distributed analytical database engine. It adopts a fully parallel and shareless architecture, has a self-developed column storage engine, supports row column hybrid storage, and is suitable for massive OLAP association analysis and query scenarios. It can support the cluster size of more than 2000 physical servers, and the storage capacity can reach 100 P level of a single database instance.

Q1: which users are currently using tdsql-a?

Tdsql-a is a product derived from Tencent’s own business development. It is widely used in Tencent, such as Tencent advertising and QQ music core businesses.

Q2: what are the main advantages of tdsql-a?

The main advantage of tdsql-a is that it is a distributed super large-scale data warehouse cluster. Its application scenario is mainly for high-speed computing of super large-scale data sets. While reaching the super large scale of a single database cluster of thousands of servers, it can also achieve efficient query execution.

Q3: what has tdsql-a done to support localized hardware and operating system?

In recent years, with the promotion of national core innovation and the large-scale application of independently developed hardware and software, tdsql-a has also been doing relevant adaptation work. In terms of domestic hardware, we have made good support for mainstream domestic chips such as Kunpeng and haiguang. In terms of domestic operating system, we have also passed the certification of a series of mainstream manufacturers.

Q4: does tdsql-a support strong synchronous replication?

In the PostgreSQL process, the wal log is recorded before the transaction is committed. For row save tables and column save tables, we have corresponding wal logs to support streaming replication between primary and standby tables. Users can also select specific replication levels or replication synchronization configurations. Users can choose whether to use high-level or low-level active / standby synchronization according to different scenarios. In addition, tdsql-a also supports hot standby, that is, the standby plane is readable. These can be easily configured through the control platform of tdsql-a.

Q5: how does tdsql-a perform efficient distributed join?

This mainly involves optimizer, actuator, vectorization and so on. Let’s take the optimizer as an example. If it is a complex query with CTE or many subqueries, the tdsql-a optimizer will first rewrite the query, inline the CTE, and then try to promote the subquery to join for optimization. This can reduce the execution complexity of queries. If it is a specific query, it may even increase hundreds of times.

After rewrite, the optimizer will automatically adjust the join order, which is equivalent to a comprehensive traversal, and then select a plan with the best execution cost. In the distributed scenario, we have done a lot of optimization. For example, tdsql-a will intelligently adjust whether the column associated with the join is a data distribution column, whether redistribution is required, and whether a delay materialization operator is added.

In addition, when selecting physical operators, tdsql-a will select different association algorithms according to different scenarios, selection rates and cost evaluation, such as nestloop join, hashjoin and mergejoin. At the same time, tdsql-a also supports parallel execution and vectorization execution of the whole process, so as to ensure the overall efficient distributed join computing power.

Q6: what operators does tdsql-a currently support? In what aspects has its operator level parallelism been enhanced?

We have done a lot of self research and function enhancement on tdsql-a based on PG 10. Because PG 10 itself lacks parallelism. For example, hash join supports apparent parallelism, but build hash table does not support parallelism. In order to make up for the deficiency, we developed the completed hash join parallelism based on PG 10 to support the inner / outer plan parallelism at the same time.

The other is the aggregation operator. Because many distributed scenarios require two-stage aggregation computing. This two-stage aggregation may redistribute data in the middle. In this case, we also need to support complete parallelism. Tdsql-a will perform a hash calculation according to the aggregated columns to select which work to calculate on, and bind between different execution fragments to ensure that correct data can be received between different parallel processes during different fragment execution, and then perform parallel calculation. Tdsql-a also supports some other operators, which will be updated later.

Q7: can tdsql-a directly access external data sources including oracle?

The docking of big data projects such as Oracle and hive / spark is a wide range of application scenarios. For Oracle compatibility and domestic database replacement, we have done a lot of work to enhance compatibility. In addition, users can directly use dblink in tdsql-a to directly access Oracle external data sources to make data changes or queries. Like hive or other third-party data sources, PG has an interface such as foreign data wrapper. Users can use our supported plug-ins or adapt rich third-party plug-ins to access heterogeneous external data sources.

If you need to pull the data to tdsql-a, we can use TDx, the supporting data import tool of tdsql-a, to import the data. In fact, we have rich protocols, which can lead the data flow to TDx through different external data sources. The database will extract data from TDx to dn node in parallel, so that you can access external data sources and enjoy the convenient service of data loading.

Q8: can tdsql-a quickly import data into the database system? Can I interact with other systems?

Tdsql-a supports copying on PG data sources. It can copy data in and from, and also supports appearance definition. We can create external tables and directly define external data files in the database, so as to carry out some operations. At the same time, we also support DB bridge, a data synchronization tool on the cloud, which can synchronize data from other systems in real time and then out through Kafka. In addition, we also developed a special data import and export tool TDx, which can import and export data in parallel and efficiently with the help of multiple DN.

Q9: how many nodes can tdsql-a currently support?

In tdsql-a, the number of deployment nodes is a configuration parameter, which can be configured at the beginning of initial installation. The default value is 2048, which can be adjusted according to user needs.

Q10: how can FN ensure more efficient communication during large-scale deployment?

Let’s assume that without FN, because the scale of the system cluster is relatively large and connections are constantly created in the cluster, the number of sockets in the whole system network will become a bottleneck. This is also the original intention of introducing FN data interaction bus to effectively reduce the number of connections. Through the FN node, the specific computing process does not need direct full link, but data routing through the FN node, which greatly reduces the number of socket connections.

How to ensure the efficiency of FN can be answered from two aspects. On the one hand, within the same server, FN obtains data directly through shared memory, which actually saves the network layer, so its communication is more efficient. On the other hand, between different servers, FN is mainly responsible for sending data. Its interior is a multi-threaded model, and there can be many sending threads. There is also a merge thread inside FN. Its function is to check which sending thread’s queue is idle before sending data, and then schedule the data to the idle sending thread, so as to achieve the effect of complex and average performance.