What are big data applications? Three key points


The three key points of big data applications are data sources, productization and value creation. Because data resources are unevenly distributed, big data applications are more likely to achieve breakthroughs in data-intensive fields. Promoting big data applications in existing industries requires reforming inappropriate industry management practices.

The value of big data lies in its application. At the national level, the State Council has issued the Action Outline for Promoting the Development of Big Data; at the local level, big data is being used as an engine of regional development strategy; at the enterprise level, companies of all kinds built around the big data concept are on the rise. We focus only on big data applications: where the data comes from, how the data is used, and who pays for the results. These correspond to the three key points of data source, productization and value creation. A good big data application may be technically very complex, but its business model should be simple, direct and effective. We also ask whether there are “data-intensive” industries or fields in which big data applications develop more easily. On industrial policy, we ask whether, for a new business form such as big data, the measures that have worked time and again in the past, such as granting land, money and projects, will continue to be effective.

Three key points of big data application

The State Council’s Action Outline for Promoting the Development of Big Data (hereinafter the “Big Data Outline”) defines big data as “a new generation of information technology and service formats”, assigns it the strategic functions of “promoting economic transformation and development”, “reshaping national competitive advantage” and “improving government governance ability”, and defines data as “national basic strategic resources”. In terms of application, the Big Data Outline sets out many development directions in the public domain, such as scientific macro-control, precise government governance, convenient commercial services, efficient security, and services for people’s livelihood. At the industrial level, it covers mainly industrial big data, emerging-industry big data, agricultural and rural big data, big data for innovation, the big data product system, and the big data industry chain. These directions only describe the potential and scope of big data applications. Whether they can actually be applied and make a difference depends on whether feasible models and practical results exist. Whether in the public domain or at the industrial level, big data applications cannot be separated from data sources, processing technologies and methods, and the way value is created, and this is our focus. In summary, three seemingly simple but key questions need to be answered.

(I) Where does the data come from

As for data sources, it is generally believed that the Internet and the Internet of Things are the bases that generate and carry big data. Internet companies are natural big data companies: they have accumulated, and continue to generate, massive amounts of data in core business areas such as search, social networking, media and trading. Internet of Things devices collect data around the clock, and both the number of devices and the volume of data grow by the day. These two kinds of data resources are the gold mines of big data and keep giving rise to new applications. Most of the successful big data experience abroad consists of classic cases built on these kinds of resources. Other enterprises have accumulated a great deal of data in the course of their business, such as real estate transactions, commodity prices, and the consumption information of specific groups. Strictly speaking, these data resources are not big data, but for commercial applications they are the easiest to obtain and process, and they are currently the most common application resources in China.

In China there is another kind of data resource, held by government departments, which is generally considered to be of good quality and high value but low openness. The “Big Data Outline” takes the interconnection and open sharing of public data as its direction of effort and holds that big data technology can achieve this goal. In fact, information and data have long been closed off and fragmented among government departments, and this is a governance problem rather than a technical one. The wish to open public data to society is a good one, but it is unlikely to be realized soon. As far as data resources are concerned, China has not yet made sufficient use of “small data” and “medium data”. Trying to step straight into the era of big data, and to use the opportunity to solve problems left over from the earlier informatization process, does not have optimistic prospects. In addition, because Chinese Internet companies operate mainly in China, their big data resources are not global.

Where the data comes from is the first thing we look at when evaluating a big data application. First, we ask whether the application really has data behind it, whether the data resources are sustainable, whether the source channels are controllable, and whether there are hidden risks in data security and privacy protection. Second, we look at the quality of the application’s data resources, whether they are “rich” or “poor”, and whether they can guarantee the effectiveness of the application. Data from an enterprise’s own business is more controllable and its quality is generally assured, but its coverage may be limited and other channels may be needed to supplement it. For data captured from the Internet, technical capability is the key: one must be able both to obtain large volumes of data and to screen out the useful content. For data obtained from third parties, special attention should be paid to the stability of the data transactions. Where the data comes from is the starting point for analyzing any big data application. Without a reliable data source, an application has no foundation, however good or advanced its data analysis technology may be.

(II) How is the data used

How the data is used is the second thing we look at when evaluating a big data application. Big data is only a means and cannot do everything, so we care about what it can and cannot do. At present, big data appears to have the following common functions.

Track. The Internet and the Internet of Things are recording all the time. Big data can track and trace any record to form a true historical trajectory. Tracking is the starting point of many big data applications and covers consumers’ purchase behavior, purchase preferences, means of payment, search and browsing history, location information, and so on.

Identify. On the basis of comprehensive tracking of many factors, accurate identification can be achieved through positioning, comparison and screening. This applies especially to voice, image and video recognition, which greatly enriches the content that can be analyzed and makes the results more accurate.

Profile. By tracking, identifying and matching different data sources about the same subject, a more rounded description and a more comprehensive understanding can be formed. With consumer profiles, advertisements and products can be pushed precisely; with enterprise profiles, credit and risk can be judged accurately.

Alert. On the basis of historical trajectories, identification and profiles, big data can forecast future trends and the likelihood of recurrence, and issue prompts and early warnings when certain indicators change as expected or change beyond expectations. Statistics-based prediction existed before, but big data has greatly enriched the means of prediction, which matters a great deal for building risk control models.

Match. With accurate tracking and identification across massive amounts of information, screening and comparison by correlation, proximity and similar criteria make product bundling and supply-demand matching more effective. The matching function of big data is the basis of new sharing-economy business models such as online car hailing, house renting and finance.

Optimize. Given principles such as shortest distance and lowest cost, paths and resources are optimized by various algorithms. For enterprises, this improves service levels and internal efficiency; for the public sector, it saves public resources and improves public service capacity.

At present, many seemingly complex applications can be broken down into the types above. Take the “big data targeted poverty alleviation project” implemented in Guizhou. Seen as a big data application, it uses identification and profiling to screen and define poor households accurately and to pinpoint the targets of poverty alleviation; it uses tracking and alerting to monitor and evaluate poverty alleviation funds, actions and results; and it uses matching and optimization to put poverty alleviation resources to better use. These functions are not unique to big data, but big data goes far beyond earlier technology and performs them more powerfully, more accurately and faster.
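The prompting and early-warning function described above can be illustrated with a minimal sketch. It assumes the simplest possible rule, namely flagging a new reading that deviates from the historical mean by more than k standard deviations. The function name early_warning and the threshold k are hypothetical choices made for illustration; real risk control models are considerably richer than this.

```python
from statistics import mean, stdev


def early_warning(history, latest, k=2.0):
    """Return True when `latest` lies more than k sample standard
    deviations away from the mean of `history`.

    `history` is the tracked record of an indicator and `latest` is
    the newest observation. Both the rule and the default k are
    illustrative assumptions, not a method taken from the article.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical observations")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is unexpected.
        return latest != mu
    return abs(latest - mu) > k * sigma
```

With a history of [10, 11, 9, 10, 10], a new reading of 10.5 stays within the expected range, while a reading of 15 triggers the warning.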

(III) Who pays for the results

Who pays for the results is the third and last thing we look at when evaluating a big data application. The reason is simple: an application that creates no value is not a good application. We care about whether the application of big data really improves capability and performance. If big data is used for product design, marketing and resource allocation, the test is whether the enterprise becomes more competitive and ultimately makes more money than before. If big data is used to provide services to third parties, the test is whether someone is willing to pay, and to keep paying. If it is used in the public domain, the test is whether the payment by the government or the public sector is worth it, judged not only from the standpoint of the funder but also from the standpoint of the people.

When we are faced with a big data application, simply asking the three questions above (where the data comes from, how the data is used, and who pays for the results) is enough to strip away many “disguises”. Of course, an application that withstands these three questions is not necessarily excellent, but it is not far from being an excellent big data application.
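The three questions can be treated as a simple screening checklist. The sketch below is only an illustration: the class ApplicationReview, its field names and the function open_questions are hypothetical, and the yes/no answers would in practice come from human judgment rather than from code.

```python
from dataclasses import dataclass


@dataclass
class ApplicationReview:
    """A reviewer's yes/no answers about one big data application."""
    reliable_data_source: bool  # is there a dependable, sustainable source?
    clear_data_use: bool        # is it clear what the data is used for?
    willing_payer: bool         # will someone pay, and keep paying?


# Maps each (hypothetical) field name to the question it answers.
QUESTIONS = {
    "reliable_data_source": "Where does the data come from?",
    "clear_data_use": "How is the data used?",
    "willing_payer": "Who pays for the results?",
}


def open_questions(review):
    """Return the questions the application has not yet answered."""
    return [q for field, q in QUESTIONS.items() if not getattr(review, field)]
```

An empty list means the application has passed the screen, which, as noted above, makes it promising rather than necessarily excellent.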

Looking for data-intensive areas

Since big data is regarded as a resource, the question of how that resource is distributed has to be considered. Generally speaking, resources are distributed very unevenly. This is true of natural resources such as water, minerals, arable land and energy, and even more true of human resources and knowledge. Is big data also unevenly distributed? Can a region developing the big data industry really leapfrog its competitors, “overtaking on the bend”? These questions deserve careful thought.

Unlike natural resources, which can be surveyed, the distribution of data resources is difficult to locate and describe. However, the distribution of big data human resources can serve as an indirect indicator of how big data application differs across regions and industries, and the industries and regions where big data human resources are concentrated can be regarded as data intensive.

We screened the recruitment information published since the second half of 2014 by two mainstream recruitment websites, “51job” and “Zhilian Recruitment”. Over the past two years the two websites have published relevant postings involving about 227,000 enterprises and 1.07 million positions, so the data volume is indeed “large”. Summarizing and analyzing the postings by region and by industry shows that big data human resources are distributed very unevenly, with great differences among regions and industries. Strictly speaking, recruitment websites reflect the demand for talent rather than the distribution of human resources, but the two are closely related.

In terms of big data related jobs, Beijing, Guangdong and Shanghai are highly concentrated and far ahead of other regions. Together, these three regions account for 52.35% and 47.48% of the enterprises posting such jobs on the two websites, and for 61.23% and 56.74% of the positions. It can be inferred that about half of big data human resources are concentrated in these three places, which is highly consistent with everyday intuition. Beyond these three regions, we asked whether local governments that attach importance to the big data industry and treat big data as an engine of regional economic development have promoted the concentration of human resources and pulled ahead of other regions at a similar level of economic development. Judging from the data, no such result can be seen, at least for now. This shows that the structure of human resources is the weakest link, and the hardest obstacle to overcome, in developing the big data industry in late-developing regions. Changing the composition of a place’s human resources is far harder than changing the appearance of its buildings; it requires either a long process or a distinctive institutional arrangement.
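The concentration figures above are shares of a total. A minimal sketch of that calculation follows, assuming the postings have already been counted per region. The function name top_region_share is hypothetical, and the counts in the usage note are made-up placeholders, not the figures from the two recruitment websites.

```python
def top_region_share(counts, top_n=3):
    """Return the names of the top_n regions by count and their
    combined share of the overall total.

    `counts` maps a region name to its number of postings
    (or enterprises, or positions).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must contain at least one posting")
    # Sort regions from largest to smallest and keep the leaders.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    leaders = ranked[:top_n]
    share = sum(value for _, value in leaders) / total
    return [name for name, _ in leaders], share
```

For the placeholder counts {"Region A": 50, "Region B": 30, "Region C": 10, "Region D": 10}, the three leading regions are A, B and C, and their combined share is 0.9.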

Even within a single province, big data human resources are distributed extremely unevenly. In Guangdong, for example, Shenzhen alone accounts for about half of the province, and together with Guangzhou the share reaches 90%. Other places in the province, even those with good economic strength, lag far behind Shenzhen and Guangzhou in big data human resources. Clearly, areas rich in big data human resources have a better basis for developing the big data industry than areas that are poor in them.

In terms of city ranking, Beijing, Shanghai, Shenzhen and Guangzhou form the first tier of cities with dense demand for big data human resources, while Hangzhou, Nanjing, Chengdu, Wuhan, Xi’an and others form the second tier. The distribution of big data human resources is broadly consistent with a city’s economic strength, vitality and even housing price level.

In terms of industry distribution, the demand for big data human resources is even more uneven and is concentrated mainly in the Internet, information technology and computer related industries. This shows that big data is part of the Internet or IT industry and a new development built on that existing base. These industries are typical “data-intensive” industries and the cradle of the big data industry.

Finance is another particularly important “data-intensive” field. The financial industry is not only a base that generates data, especially valuable data, but also a source of demand and a place of application for data analysis services. More importantly, because the financial industry has ample ability to pay, it will be an important battlefield in the competition of the big data industry. Many big data applications spread to other industries by way of their use in finance.

In addition, telecommunications, professional services (such as consulting, human resources and accounting), education and training, film and television media, and online games are also relatively data-intensive industries.

The “Big Data Outline” lays out broad prospects for big data application in almost all industries and fields, but data resources are distributed extremely unevenly, and big data applications in “data-intensive” fields are more likely to succeed in the market.

What kind of industrial policy does big data need

What kind of industrial policy do big data applications need? From the application point of view, big data is not a brand-new industry. It integrates with existing industries and transforms, upgrades or replaces existing models. What restricts the development of big data is not big data itself but the existing problems in the industries and fields where it is applied, such as industry regulation, administrative monopoly, and factors of production that cannot flow freely. Promoting big data by granting land, money and projects therefore cannot solve the fundamental problems. Instead, starting from the fields where big data is applied, we should reform inappropriate industry management practices and adjust the existing pattern of interests so that big data applications have the conditions they need. Even within an enterprise, applying big data is not only a technical problem. It also involves reorganizing business processes and changing management models, which tests the enterprise’s management ability.

“Data-intensive” industries such as finance, telecommunications, education, and film and television media are not only the fields with the greatest potential for big data application but also the key fields where industry reform is most urgently needed. Conversely, big data applications can provide technical support for industry reform and help achieve industry development goals along more effective technical routes.

The industrial policies that big data applications need are in fact the policies that any industry needs to develop under a market economy: open access, fair competition, a lighter burden on enterprises, and an end to discrimination based on enterprise ownership or enterprise size. Only in an open industrial environment can big data be used effectively in these industries. If a local government wants to vigorously promote the use of big data in finance, health care, education and other fields, the most effective policy is to carry out vigorous reform in those industries.