About the author: TJ (Tang Jianfa) is CTO of Tapdata, chairman of the MongoDB Chinese Community, former chief architect for MongoDB Greater China, and instructor of the MongoDB video course on Geek Time.
“How can we build a data center?” In the data processing industry, customers often ask this question.
What is a data center: a product, a technology, or an architecture?
With the data center concept being talked about everywhere, let’s look at its architecture, its technical implementation, and how it solves problems in enterprises.
1. Modern enterprise data architecture and pain points
- Data islands: the root cause of inefficient and difficult data use
- Application bottlenecks: shortcomings of traditional solutions such as data warehouses and data lakes
Let’s take the airline scenario as an example:
The marketing department of an airline plans to launch a new product or a customer campaign and wants to know which channel a certain type of customer uses most. Looking into the question, it finds that the airline has too many customer touchpoints:
PSDP travel orders, complaints, the baggage system, the frequent flyer system, the mobile app, and so on. These systems were built by the airline at different stages and by different business departments. Each was deployed to serve only its own business, without considering how it would connect to the rest of the enterprise. If the data in these applications is not unified, getting an answer takes days or weeks, and you may not even know where to find the data. Even when you do know, you still have to coordinate with other business departments to get the right data delivered.
Let’s look at another case: a policy loan mini program.
When a customer applies for a cash loan through this mini program, and the customer has already bought critical illness, life or property insurance from the insurance company, the system can determine within one minute which type of cash loan to offer, based on the customer’s policy amount.
When it came to going live, the team found that the mini program itself was quick to develop, but the data sits in separate systems for life insurance, critical illness insurance and property insurance, and some of it also depends on a recommendation system and a tagging system. Data integration therefore takes a long time, weeks or even months, because it involves not only the data but also permissions.
These situations are typical of the data islands found in enterprises, and the problem will become more and more common as IT systems continue to be built out.
The cause of data islands is that, when building IT services, business departments treat their own business functions as the core, rather than treating data as the goal.
In addition, common databases such as Oracle, SQL Server, DB2 and Sybase have long had scalability bottlenecks. When a system grows large or the number of customers increases, the data has to be split across multiple databases and tables (sharding), because a single database cannot carry that much business. This also creates a large number of data islands.
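To make the sharding point concrete, here is a minimal sketch of hash-based routing across sub-databases and sub-tables. The shard counts, the `route` and `locations` helpers, and the customer ID format are all hypothetical and chosen only for illustration; real deployments often shard by range, date or business key instead.

```python
import zlib

N_DATABASES = 4      # hypothetical number of sub-databases
TABLES_PER_DB = 8    # hypothetical number of sub-tables in each database


def route(customer_id: str) -> tuple[int, int]:
    """Map a customer ID to a (database index, table index) pair."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    slot = zlib.crc32(customer_id.encode("utf-8")) % (N_DATABASES * TABLES_PER_DB)
    return slot // TABLES_PER_DB, slot % TABLES_PER_DB


def locations(customer_ids: list[str]) -> set[tuple[int, int]]:
    """Every (database, table) pair that must be queried to cover these customers."""
    return {route(cid) for cid in customer_ids}


if __name__ == "__main__":
    ids = [f"C{n:05d}" for n in range(1000)]
    print(route("C00042"))       # the same ID always maps to the same pair
    print(len(locations(ids)))   # physical tables touched by a cross-customer query
```

A lookup for one customer touches a single table, but any question that spans customers has to visit many physical tables and merge the results. That is how sharding turns one logical data set into many small islands.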
Data islands seriously hinder the reuse of existing data by new businesses:
- Integration and synchronization take a lot of time;
- The user experience suffers, because the data is incomplete and not real-time;
- Work is duplicated and the reuse rate is low.
To solve the data island problem, current solutions at the application level include ESB (enterprise service bus), MQ, and so on; at the storage level, there are data warehouses such as Teradata and Greenplum, and data lakes. These solutions can solve the problem at a certain level, but they have limitations:
First of all, these solutions are oriented to analysis scenarios. Data extraction is usually not timely: most of them run in T+1 mode, meaning that the data the business sees was produced by the source systems the day before. The data is processed in the data warehouse or data lake into a large number of reports and result sets, which are then delivered in crude ways such as downloads and exports. Most big data platforms on the market today focus on analysis and are used mainly for BI, reporting and dashboards, to gain insight into enterprise operations and customers.
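The delay implied by T+1 can be illustrated with a small sketch. It assumes a single nightly batch that loads the previous day’s data and finishes at 02:00; the finish time and the function names are assumptions for illustration, not details from any particular warehouse product.

```python
from datetime import date, datetime, time, timedelta

BATCH_FINISH = time(2, 0)  # assumption: the nightly load completes at 02:00


def visible_at(event_time: datetime) -> datetime:
    """Moment at which a record created at event_time becomes queryable under T+1."""
    return datetime.combine(event_time.date() + timedelta(days=1), BATCH_FINISH)


def latest_loaded_day(query_time: datetime) -> date:
    """Newest business day whose data is fully loaded at query_time."""
    if query_time.time() >= BATCH_FINISH:
        return query_time.date() - timedelta(days=1)
    return query_time.date() - timedelta(days=2)


if __name__ == "__main__":
    order = datetime(2024, 5, 9, 9, 30)
    print(visible_at(order))                               # 2024-05-10 02:00:00
    print(visible_at(order) - order)                       # 16:30:00
    print(latest_loaded_day(datetime(2024, 5, 10, 1, 0)))  # 2024-05-08
```

Under these assumptions, an order placed in the morning cannot be seen by downstream consumers until the next day, which is acceptable for reports but not for applications that interact with customers in real time.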
For enterprise operation, the key and core capability is not back-end analysis, but interaction with customers, business and processes at the front end.
Based on the above situation, the data center came into being.