On June 7, TiDB DevCon 2020, the annual technical conference of the TiDB community, came to a successful conclusion. The conference was held as an online livestream and brought together 80+ developers, TiDB users, and partners from around the world to share their first-hand development and production experience. Topics covered finance, telecommunications, e-commerce, logistics, video, information, education, healthcare, and many other industries. At the conference, we officially released the landmark TiDB 4.0 GA version, shared its technical details and its actual results in production environments, and awarded honorary trophies and certificates to the contributors, committers, and maintainers who have made outstanding contributions to the TiDB community over the past years.
The conference lasted two days, with seven forums and a cumulative 29 hours of talks, and the popularity score of the livestream room reached as high as 23,000. If you missed it, keep watching the upcoming posts from our official account; we will select and compile some of the highlights. Stay tuned!
The following is a transcript of the live talk by our co-founder and CEO, Liu Qi.
There is one time every year when I am especially excited: the release of a major version of the product, which usually coincides with TiDB DevCon, the community’s annual technical conference. Last year, at TiDB DevCon 2019, we released TiDB 3.0 Beta. This year, TiDB 4.0 GA has arrived.
For a long time, TiDB users have been running fairly large clusters, and sooner or later they ask, “How can I reduce my costs?” Now that TiDB 4.0 has serverless capability, it automatically scales elastically on Kubernetes (K8s) according to the user’s actual business load.
Once upon a time, the first step in launching a system was capacity planning: estimating how many servers we would need. We might prepare, say, 50 servers in advance, only to find after a month in production that five were enough. That wastes a lot of resources. If the whole system can scale elastically and automatically in the cloud, this kind of waste can be avoided.
More importantly, TiDB’s elasticity means you never need to provision system resources for the peak of your business. Suppose your business has two obvious peaks, one in the morning and one in the evening, and each usually lasts only about two hours. In other words, for four hours of peak load, we provision peak-level resources for all 24 hours and pay for every one of them. With elastic scaling, we can release those resources during off-peak hours and save perhaps 70% of the cost, or even more.
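As a rough illustration of the arithmetic behind that estimate, the sketch below compares paying for peak capacity around the clock with paying only for what each hour needs. The server counts (50 at peak, 5 off-peak) and the flat price per server-hour are assumptions chosen for illustration, not figures from a real deployment.

```python
def daily_cost(servers_by_hour, price_per_server_hour=1.0):
    """Total cost of one day, given the server count used in each hour."""
    return sum(servers_by_hour) * price_per_server_hour


def savings_ratio(peak_hours, peak_servers, offpeak_servers, hours_per_day=24):
    """Fraction of cost saved by elastic scaling versus fixed peak provisioning."""
    # Fixed provisioning: pay for peak capacity in every hour of the day.
    fixed = daily_cost([peak_servers] * hours_per_day)
    # Elastic provisioning: peak capacity only during the peak hours.
    elastic = daily_cost(
        [peak_servers] * peak_hours
        + [offpeak_servers] * (hours_per_day - peak_hours)
    )
    return 1 - elastic / fixed


if __name__ == "__main__":
    # Hypothetical workload: two 2-hour peaks per day, 50 servers at peak, 5 otherwise.
    ratio = savings_ratio(peak_hours=4, peak_servers=50, offpeak_servers=5)
    print(f"Estimated saving: {ratio:.0%}")
```

Under these assumed numbers the saving comes out at 75%; the real figure depends on how far the off-peak load actually falls below the peak.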
In addition, an elastic TiDB can cope with unpredictable workloads. No one knows when a product will suddenly sell well, and no one knows when a fund I am selling will become popular. If we give the system permission, it can add servers automatically according to the actual state of the business at that moment. For an enterprise or a line of business this can be a lifesaver; in a situation like the one in the figure above, human intervention is often too slow and too late.
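One common way to turn “scale with the actual load” into a concrete rule is the proportional formula used by the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The sketch below applies that rule with hypothetical bounds and CPU figures. It illustrates the general idea only and is not TiDB 4.0’s actual autoscaling logic.

```python
import math


def desired_replicas(current, current_metric, target_metric,
                     min_replicas=1, max_replicas=100):
    """Proportional scaling rule, clamped to [min_replicas, max_replicas].

    current_metric and target_metric are in the same unit, for example
    average CPU utilization in percent.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # Multiply before dividing so whole-number inputs stay exact where possible.
    raw = math.ceil(current * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Hypothetical spike: 4 nodes running at 90% CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
    # Hypothetical quiet period: 6 nodes at 20% CPU, never fewer than 3 nodes.
    print(desired_replicas(6, 20, 60, min_replicas=3))
```

The clamp matters in practice: the lower bound keeps enough nodes for availability, and the upper bound caps the bill if a metric misbehaves.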
In today’s world, we want everything to be faster and simpler. But if we keep using a database in the traditional way, we cannot meet this demand for “faster and simpler,” because the traditional way requires a series of very complex processes to extract change information and events from the database and then analyze them. This process often introduces a fairly long delay, and those delays cost us a great deal of direct economic value.
In TiDB 4.0, we officially launched TiFlash. TiFlash is a columnar storage engine that works alongside the TiDB system and is seamlessly integrated with it. TiFlash inherits TiDB’s operations-friendly features such as online DDL, seamless scale-out, and automatic fault tolerance. At the same time, TiFlash stays in sync with the row store in real time.
With TiFlash, TiDB 4.0 can be at least 10 times faster than the previous version in many complex computing scenarios, and we never need to worry about consistency. Whether the workload is simple OLTP or complex OLAP, the data is always consistent and real-time, and the system can scale out or in elastically and automatically.
You are probably very familiar with the architecture diagram above; almost every company with a certain scale of data has been through it. One user once built a complex system like the one in the figure for a scenario with only a few dozen terabytes of data, just to be able to run OLTP plus one report query. They had to bring in Kafka and ETL, and then write the results of that report query back into a storage system such as HBase. Is there a way to simplify the whole system?
Looked at from the perspective of TiDB 4.0, users have already given their answer by going into production. As shown in the figure above, when we put TiDB in the middle layer, the complexity of the whole system drops dramatically. Later, users will share their experience with TiFlash and the architectural simplifications they have made.
At the infrastructure level, users do not really want to care whether a workload is a long query or a short query. From the user’s perspective, we just want to get results as quickly as possible and reduce process complexity as far as possible, so as to save costs, speed up development, and create more value.
I know everyone has been looking forward to PingCAP offering TiDB as a cloud service. Today I am very happy to announce TiDB Cloud. TiDB Cloud is managed, maintained, and optimized by PingCAP.
We started preparing for this four years ago, and today TiDB can dance seamlessly in the cloud.
If you say, “I don’t want to install TiDB, and I don’t want to maintain TiDB,” you can hand it over to PingCAP. We currently support the AWS and GCP cloud platforms (support for other cloud platforms is progressing steadily). If you are on either of these platforms, you do not need to set anything up yourself: with a few mouse clicks you can start using TiDB, truly “out of the box.”
TiDB 4.0 provides more than 70 new features. For details, you can read the article “TiDB 4.0: The Leading Real-Time HTAP Database Is Ready for Cloud”.
The built-in Dashboard in TiDB 4.0 is well suited to people like me who have not written SQL for a long time. Through its graphical interface you can solve most problems: observe hot data and slow queries across the whole system, see what the business looks like inside the database, and understand the business load through a variety of views. Our goal is to help users locate most faults and problems within 10 seconds. The Dashboard below satisfies all my “fantasies.”
Performance: Faster and Faster
Performance is always an exciting topic. Compared with version 3.0, the overall performance of TiDB 4.0 is about 50% higher; aggregate queries can be 10 times faster or more in many scenarios, and TPC-H performance has also doubled. This achievement also comes from the contributions of the entire TiDB open source community. At the end of last year, we held the first season of the TiDB Challenge Program, the “Performance Challenge.” A total of 165 community developers took part, including 23 teams and 122 individual participants, and their work has all landed in the TiDB product.
TiUP: One-Click Installation and Deployment
Some people have complained that installing TiDB is too much trouble: deploying the whole system could take dozens of minutes or even a whole day.
For TiDB 4.0, we wrote a dedicated tool called TiUP, which is a package manager. With TiUP, you can get TiDB running and try it out within one minute, and deploying a 15-node production cluster takes only 45 seconds. TiUP is a huge improvement in ease of use, and you are welcome to try it.
TiUP: A component manager for the TiDB eco-system
Try TiDB (playground) within 1 minute with 1 command
$ curl https://tiup-mirrors.pingcap…. | sh && tiup playground nightly --monitor
Deploy a production cluster in 45 seconds
As TiDB is adopted more widely around the world, more and more users are running it in more critical scenarios. We therefore also provide the security features everyone cares about, to meet security and privacy compliance requirements in different countries. Currently, communication between all TiDB components is fully encrypted, and all stored data supports transparent encryption, so no one, including PingCAP and any cloud vendor, can violate the data privacy and security of TiDB users. When TiDB runs in the cloud, no one can see into the database, and no one can intercept data in transit.
How Does It Perform in Production?
I believe some people will still have doubts. After all this talk, is TiDB 4.0 really ready? Can it go into production? Is there any real data to share?
The figure above shows Zhihu’s read service, its largest internal cluster, which was upgraded to TiDB 4.0 a few days ago. The whole cluster has a capacity of 1 PB and currently stores 471 TB of data.
When I first saw these numbers I was shocked, not only by the scale of the data but also by the confidence that Sun Xiaoguang (Zhihu, TiKV maintainer) has in 4.0: they upgraded on the fourth day after 4.0 GA was officially released on May 28. Seeing this result, of course, makes me even more confident. TiDB not only supports data at this scale but also greatly improves the computing power and latency of the whole system.
As the figure above shows, TiDB 4.0 reduces latency by 40% compared with the previous version. In other words, at the same latency, the cost can be reduced by about 40%.
Why Is TiDB So Popular?
Over the past year, I have often been asked: Why is TiDB so popular? Why can TiDB travel all over the world? What has won the adoption and appreciation of so many users? (Of course there are complaints too. It is precisely because users complain that we have the motivation to improve and can iterate quickly.)
In fact, the credit belongs not only to PingCAP but to the whole open source community; PingCAP is only one part of it. It is precisely because developers from all over the world, from Square, Azure, and Zoom in the United States, Dailymotion in France, PayPay in Japan, and many others, have contributed feedback, requirements, and pull requests that we have together polished TiDB into what it is today and built today’s large TiDB open source community.
When 4.0 was released, we also made a word cloud: we looked at the organizations that TiDB code contributors belong to and drew a diagram weighted by each organization’s contributions. We found that a great many organizations keep contributing to the TiDB community.
At the same time, the creativity of the community often surprises me. For example, TiDB contributor Liu Dongpo made a visualization of the top 100 contributors.
This contributor also shared the experience of participating in the community at the developer community forum of TiDB DevCon. We hope that more people will, like this contributor, gain something from the TiDB community and find a sense of belonging through their contributions.
In addition, whoever you are, as long as you want to take part in building TiDB or want to use TiDB, we are ready for you:
- If you run into problems with TiDB, you can ask questions on AskTUG (https://asktug.com). It has more than 2,700 members who share their practical experience and the pitfalls they have hit, so you may well find your answer just by searching there.
- If you want to learn more about TiDB, we have also launched online and offline training courses at PingCAP University (https://university.pingcap.com). At the end, you can check what you have learned and also take the certification exam (as shown in the figure below).
In the past few months, members of the TiDB community have also done some crazy things. For example, we spent 48 hours writing a book, “TiDB 4.0 in Action” (book.tidb.io). It may sound incredible at first: how can you write a book in 48 hours? But it makes sense once you look at the number of authors: the book has more than 100. If each person writes one section, that makes 100 sections, which is quite achievable in 48 hours. Of course, this did not come easily; it was made possible by the knowledge and spirit the whole community has accumulated over a long time.
If, having read this far, you are ambitious and want to go a step further and write your own distributed database, no problem: we have also prepared the Talent Plan courses, in which you build the computing layer and storage layer of a distributed database step by step by following the curriculum. The courses also have mentors from around the world who help review your code and homework. For now, they are available in Chinese and English.
Bonus: Chaos Mesh™!
Finally, let’s talk about our practice in chaos engineering. In software there is a piece of common wisdom: “every failure that can be foreseen will eventually happen.” System complexity is unavoidable; we have to face it and we have to deal with it. Today, the complexity of a system is no longer confined to the database but extends along the entire chain of the business, and it ultimately shows up in the quality of service the system delivers to users.
As shown in the figure above, the visualized microservice graphs of Amazon and Netflix cannot simply be compared to a spider web; they are actually far more complicated. So we need a system that simulates every fault that could occur in reality and makes those faults happen constantly, so that we can strengthen the robustness of the system in advance.
That is why, over the course of developing TiDB, we built a system called Chaos Mesh. What does it do?
For example, it can simulate a disk failure. In our test environment, a disk breaks every minute and the network is partitioned every minute. Such situations are rare in the real world, but when they do occur they cause catastrophic failures. Simulating disk failures is just one of the many things Chaos Mesh can do.
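The idea of deliberately injecting faults can be sketched in a few lines. The example below is a toy, in-process illustration; it is not Chaos Mesh and does not use its API. A hypothetical `FlakyDisk` wrapper fails a configurable fraction of writes, and a retrying writer is exercised against it to check that every write still lands.

```python
import random


class DiskFailure(Exception):
    """Raised when the simulated disk rejects a write."""


class FlakyDisk:
    """Hypothetical in-memory disk that fails a fraction of its writes."""

    def __init__(self, failure_rate, seed=0):
        self.failure_rate = failure_rate
        self.rng = random.Random(seed)  # seeded so that runs are reproducible
        self.blocks = {}

    def write(self, key, value):
        # Inject a fault with probability failure_rate before touching storage.
        if self.rng.random() < self.failure_rate:
            raise DiskFailure(f"injected failure writing {key!r}")
        self.blocks[key] = value


def write_with_retry(disk, key, value, max_attempts=50):
    """Retry a write until it succeeds; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            disk.write(key, value)
            return attempt
        except DiskFailure:
            continue
    raise DiskFailure(f"gave up on {key!r} after {max_attempts} attempts")


if __name__ == "__main__":
    disk = FlakyDisk(failure_rate=0.5, seed=42)
    attempts = [write_with_retry(disk, f"k{i}", i) for i in range(100)]
    print("all writes durable:", len(disk.blocks) == 100)
    print("max attempts needed:", max(attempts))
```

A real chaos experiment injects faults at the infrastructure level (disks, network, processes) rather than inside the application, but the testing loop is the same: break something on purpose, then assert that the system’s guarantees still hold.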
Chaos Mesh helps you test all the faults that could occur along the entire business chain. In the past, when we said from experience that “there is a 99.99% or 99.999% probability that the system operates normally,” there was some luck involved. Once we use Chaos Mesh to test all kinds of fault conditions, we find that it is very rare and extremely difficult for a system to actually achieve “99.99% or 99.999% normal operation.” Throughout the development of TiDB, we used Chaos Mesh to test TiDB in parallel. The feedback from test users on TiDB 4.0 has been very good, and part of the credit goes to the “crazily destructive” testing done by Chaos Mesh. Of course, you are welcome to use Chaos Mesh to test and harden your own systems.
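To put those figures in perspective, an availability target translates directly into a downtime budget. The sketch below does that arithmetic, assuming a 365-day year; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def downtime_budget_minutes(availability_percent):
    """Minutes of downtime per year allowed by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


if __name__ == "__main__":
    for target in (99.99, 99.999):
        budget = downtime_budget_minutes(target)
        print(f"{target}% -> {budget:.2f} minutes per year")
```

Four nines leaves roughly 52.6 minutes of downtime per year, and five nines roughly 5.3 minutes, which is why a single slow manual recovery can use up the whole budget.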
In fact, TiDB is not only a database product; as infrastructure, it is also the cornerstone of many systems. Before adopting it, you can refer to other people’s mature experience and solutions. At TiDB DevCon 2020, 80+ TiDB users and partners shared first-hand practical experience. Whichever industry you come from, such as finance, e-commerce, logistics, new retail, travel, telecommunications, healthcare, energy, manufacturing, high tech, education, video, or information, and whichever usage scenario you care about, such as real-time analytics, data aggregation, data marts, metadata storage, log auditing, log statistics and analysis, or IM, we have the industry references and scenario practices you want to see. Stay tuned for the video and text recaps of TiDB DevCon 2020.
Thank you to all our community partners for your enthusiastic participation and support. We will keep working hand in hand and set out for the sea of stars of the open source world! Follow the PingCAP WeChat official account (pingcap2015) and reply “DevCon2020” to get some of the slides, compiled with the speakers’ permission.
TiDB DevCon is a technical conference that the PingCAP team hosts for the TiDB community; it is held in Beijing every year. This year’s DevCon took place on June 6 and 7 as an online livestream. It presented the product and technology evolution of TiDB over the past year, shared the latest ecosystem progress at home and abroad, and invited 80+ developers, users, and partners from around the world to share their practical experience and thinking on open source, covering finance, manufacturing, telecommunications, e-commerce, logistics, energy, FMCG, new retail, cloud computing, and other fields.
This article is based on the blog column of the PingCAP official website.
If you have any questions about using TiDB, you can also log in to asktug.com to search or post.