1、 Why data mining?
We know that BI can assist decision-making. By increasing depth, BI applications can be divided into status quo analysis, cause analysis, and predictive analysis.
Status quo analysis looks at what has happened. For example: is the business doing well or badly? What are the performance indicators? What is the business structure, and how have the various lines of business been composed, developed, and changed?
Cause analysis goes further and asks why. For example: why did profit fall by 10% month on month last year? Why was the annual sales target not achieved?
Predictive analysis looks at what is going to happen. For example: how will the company perform next year? Which customers are likely to churn?
Both status quo analysis and cause analysis can be done with OLAP. OLAP, however, cannot do predictive analysis, and prediction is exactly what data mining is good at.
2、 What is data mining?
Data mining analyzes the existing data in databases and data warehouses according to predetermined rules, identifies and extracts hidden patterns and interesting knowledge, and gives decision makers a basis for their decisions. The task of data mining is to discover patterns in data. There are many kinds of patterns, which fall into two categories by function: predictive patterns and descriptive patterns.
A predictive pattern determines a particular outcome from the values of data items. The data used to mine predictive patterns has known outcomes. A descriptive pattern describes the regularities that exist in the data, or groups data by similarity; it cannot be used directly for prediction. In practice, patterns are divided by their role into six types: classification, regression, time series, clustering, association, and sequence patterns. Specific algorithms include market basket analysis, cluster detection, neural networks, decision trees, genetic algorithms, link analysis, case-based reasoning, rough sets, and various statistical models.
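To make the idea of a descriptive pattern concrete, here is a minimal sketch of clustering in plain Python. It is a toy one-dimensional k-means, not the method of any particular tool; the function name `kmeans_1d`, the starting centers, and the spend figures are all assumptions made up for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group 1-D values around the given starting centers (toy k-means)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments are stable, so stop early
        centers = new_centers
    return centers, clusters


# Hypothetical monthly spend per customer.
monthly_spend = [12, 15, 14, 90, 95, 88, 13, 92]
centers, clusters = kmeans_1d(monthly_spend, centers=[0, 100])
```

The result groups the customers into a low-spend and a high-spend segment. It describes the data but does not, by itself, predict anything about a new customer.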
3、 What is the difference between OLAP and data mining?
OLAP and data mining differ in emphasis. OLAP focuses on interaction with users, fast response, and multidimensional views of data, while data mining focuses on automatically discovering hidden patterns and useful information, although users are allowed to guide the process. The two complement each other: OLAP results can serve as the starting point for mining, and data mining can deepen OLAP analysis by finding more complex and detailed information that OLAP cannot. Research on data mining focuses on mining algorithms and on the new problems that arise when the technology is applied to new data types and environments, such as mining unstructured data, standardizing data mining languages, and visual data mining.
In short, OLAP reveals known relationships in past data, while data mining reveals unknown relationships and future ones. That is why data mining can be used to make predictions.
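The contrast can be sketched in a few lines of Python. The first half rolls up a small fact table along one dimension, which is the kind of question OLAP answers; the second half fits a straight line to the monthly totals and extrapolates one month ahead, standing in for a real mining model. The fact rows and the helper names `roll_up` and `fit_line` are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, month, sales).
facts = [
    ("north", 1, 100.0), ("north", 2, 110.0), ("north", 3, 120.0),
    ("south", 1, 80.0), ("south", 2, 85.0), ("south", 3, 90.0),
]


def roll_up(rows, key_index):
    """OLAP-style aggregation: total sales along one dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Known, past relationships: what was sold per region and per month.
by_region = roll_up(facts, 0)
by_month = roll_up(facts, 1)

# Unknown, future relationship: a simple trend extrapolated to month 4.
months = sorted(by_month)
slope, intercept = fit_line(months, [by_month[m] for m in months])
forecast_month_4 = slope * 4 + intercept
```

A straight-line trend is the simplest possible regression pattern; a real project would likely use a richer model and validate it on held-out data.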
4、 How does data mining make predictions?
How does data mining predict? It follows a standard process that prepares and tests the data systematically in order to find the regularities hidden in the data itself. This process can be summarized in four steps: business understanding, data preparation, model building, and model evaluation.
Step 1: business understanding
Set a clear goal and analyze the requirements.
Example: predict which retail banking customers are likely to churn, so that retention marketing can start in advance.
Step 2: data preparation
Collect the raw data, check its quality, integrate it, and format it.
Example: make a preliminary judgment about which signals point to churn, such as bank card transaction volume declining month after month or repeated customer complaints, then collect and format the data related to those signals.
Step 3: build a model
Select a modeling technique, tune the parameters, generate a test plan, and build the model.
Example: whether a customer will churn is a classification problem, so choose a classification algorithm to build and train the model.
Step 4: evaluate the model
Evaluate the model comprehensively, assess the results, and review the process.
Example: evaluate the trained model and keep adjusting its parameters based on the prediction results until the model performs well enough.
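The four steps above can be sketched end to end for the churn example. This is a minimal illustration under stated assumptions, not a production workflow: the rows are invented, the two features (`transaction_trend_z`, `complaints_z`) are assumed to be standardized already, and logistic regression trained by plain gradient descent stands in for whatever classification algorithm a real project would choose.

```python
import math

# Step 1: business understanding. Goal: flag customers likely to churn.

# Step 2: data preparation. Hypothetical rows of
# (transaction_trend_z, complaints_z, churned); None marks a missing value.
raw_rows = [
    (-1.5, 1.2, 1), (1.3, -1.2, 0), (-1.1, 1.6, 1), (1.6, -1.1, 0),
    (-1.8, 1.4, 1), (None, -1.0, 0), (1.1, -1.5, 0), (-1.2, 1.1, 1),
    (1.4, -1.3, 0), (-1.4, 1.3, 1), (1.2, -1.4, 0),
]
clean_rows = [row for row in raw_rows if None not in row]  # quality check
train_rows, test_rows = clean_rows[:8], clean_rows[8:]     # hold out the last rows


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, features):
    """Linear score before the sigmoid is applied."""
    return sum(w * x for w, x in zip(weights, features)) + bias


# Step 3: build a model. Batch gradient descent on the log loss.
def train_logistic(rows, learning_rate=0.5, epochs=200):
    n_features = len(rows[0]) - 1
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for *features, label in rows:
            error = sigmoid(score(weights, bias, features)) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


def predict(model, features):
    weights, bias = model
    return 1 if sigmoid(score(weights, bias, features)) >= 0.5 else 0


model = train_logistic(train_rows)

# Step 4: evaluate the model on rows it has not seen during training.
correct = sum(predict(model, row[:2]) == row[2] for row in test_rows)
accuracy = correct / len(test_rows)
```

With data this cleanly separated the held-out rows are classified correctly; real customer data is noisier, which is why the evaluation step usually feeds back into further parameter tuning.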
5、 The use of data mining tools
The key to the whole data mining process is iterative optimization of the model. Model algorithms include classification, regression, and clustering algorithms, among others, and each type contains many different algorithms. Classification, for example, includes logistic regression, naive Bayes, and decision trees. The programming languages used include Java, Python, and R. Data mining requires not only solid computer skills but also statistics, modeling algorithms, and related techniques. The learning threshold is high, so the work is generally done by specialists.
However, the data mining tools available on the market can greatly simplify the process, so that ordinary analysts can also pick it up quickly. For example, Smartbi Mining, a data mining tool launched by Smart Software, is designed around an Internet-style user experience, uses minimalist flow-based modeling to build various types of data mining applications quickly, and provides predictive analysis to support decisions made by individuals, teams, and enterprises.
Smartbi Mining has a process-oriented, visual modeling interface with built-in classic statistical mining algorithms and deep learning algorithms. These algorithms are easy to configure, which lowers the threshold for machine learning and saves cost. Business users can drag and drop components to model visually, build the model workflow, and publish and manage the model.
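As a rough picture of what iterative optimization means when coded by hand, the sketch below tries several settings of a one-rule classifier (a decision stump, the simplest form of decision tree) and keeps the one that scores best on validation data. The rule, the candidate thresholds, and the validation rows are all hypothetical.

```python
# Hypothetical validation rows: (complaints_last_quarter, churned).
validation = [
    (0, 0), (1, 0), (1, 0), (4, 0), (3, 1),
    (2, 1), (4, 1), (5, 1), (0, 0), (3, 1),
]


def stump_predict(complaints, threshold):
    """One-rule model: predict churn when complaints reach the threshold."""
    return 1 if complaints >= threshold else 0


def accuracy(rows, threshold):
    hits = sum(stump_predict(c, threshold) == label for c, label in rows)
    return hits / len(rows)


# Iterate: propose a setting, evaluate it, keep it only if it is better.
best_threshold, best_accuracy = None, -1.0
history = []
for threshold in range(1, 6):
    acc = accuracy(validation, threshold)
    history.append((threshold, acc))
    if acc > best_accuracy:
        best_threshold, best_accuracy = threshold, acc
```

Real models have many more parameters than one threshold, but the loop has the same shape: change a setting, measure on data held back from training, and keep the improvement.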
6、 Application scenarios of data mining
Data mining can be applied in many fields, including enterprise operations, production control, market analysis, engineering design, urban planning, and scientific exploration.
1. Precision marketing
Analyze customer attributes and consumption behavior, recommend the most suitable products to each customer, and improve marketing effectiveness.
2. Customer retention
Analyze changes in customers' purchasing behavior and satisfaction, predict which customers are likely to churn, and take retention measures in advance.
3. Sales forecast
Analyze historical product sales data, forecast future sales volume, and plan production and inventory in advance.
4. Price forecast
Collect and analyze the market data that affects product prices, forecast price trends, and seize market opportunities.
5. Credit score
Analyze the customer's basic information together with consumption, loan repayment, and other records; score the customer's credit to guard against credit risk and reduce losses.
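As one illustration of the last scenario, a credit score can be pictured as a points-based scorecard. In the sketch below every number (base score, points per attribute, cutoff) and every field name is a made-up assumption; in practice such points would typically be derived from a model fitted on historical repayment data rather than chosen by hand.

```python
BASE_SCORE = 600  # hypothetical starting score


def credit_score(customer):
    """Add or subtract illustrative points per attribute, then clamp."""
    score = BASE_SCORE
    score -= 40 * customer["late_payments"]             # repayment record
    score += 5 * min(customer["years_as_customer"], 10)  # basic information
    if customer["monthly_spend"] > 0.8 * customer["monthly_income"]:
        score -= 50                                      # consumption record
    return max(300, min(850, score))


def risk_band(score, cutoff=580):
    """Route low scores to manual review instead of automatic approval."""
    return "review" if score < cutoff else "approve"


reliable = {"late_payments": 0, "years_as_customer": 6,
            "monthly_spend": 2000, "monthly_income": 5000}
risky = {"late_payments": 3, "years_as_customer": 1,
         "monthly_spend": 4500, "monthly_income": 5000}

reliable_score = credit_score(reliable)  # 600 + 30
risky_score = credit_score(risky)        # 600 - 120 + 5 - 50
```

Here `risk_band(reliable_score)` gives "approve" and `risk_band(risky_score)` gives "review".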
Both OLAP and data mining continue to evolve. As BI and AI develop, data analysis will become more intelligent and easier to use. We need to choose the analysis tool that best fits the actual application scenario in order to work more efficiently, guide business decisions better, and get more value out of BI.