• ## Multi-factor analysis of variance in statistics

Time：2021-7-24

01. Preface: We covered simple one-way ANOVA earlier. This article covers two-way ANOVA and multi-factor ANOVA; two-way ANOVA is the simplest multi-factor case. One-way ANOVA assumes that only one factor affects the mean being compared, while multi-factor ANOVA assumes that multiple factors affect […]

• ## What is heteroscedasticity

Time：2021-7-23

Today, let's talk about heteroscedasticity. Before that, let's look at a closely related concept: homoscedasticity. What is homoscedasticity? As the name suggests, it means equal variance. And what is variance? Variance measures how much the data fluctuate. Equal variance means that the fluctuation […]

• ## Univariate linear regression analysis in statistics

Time：2021-7-22

1. Introduction to the regression model: Let's take a look at the regression model. The following explanation comes from Baidu Encyclopedia: a regression model is a predictive modeling technique that studies the relationship between a dependent variable (target) and an independent variable (predictor). This technique is usually used for forecasting, time series modeling, and finding causal relationships between variables. The […]

• ## Multiple regression analysis of Statistical Science

Time：2021-7-21

01. Preface: Previously, we covered univariate linear regression. If you haven't read it, take a look at it first: [univariate linear regression analysis]. In this article, let's talk about multiple linear regression. Univariate linear regression has only one independent variable X, while multiple linear […]

• ## Analysis of variance in Statistical Science

Time：2021-3-2

The previous article covered hypothesis testing; this one covers analysis of variance. 1. Background: Suppose you have proposed three strategies, A, B and C, to increase the average spend per customer. How can you tell whether their effects differ? The simplest way is to run an experiment. We […]

• ## The lecture of Statistical Science

Time：2021-3-1

01. Preface: We covered multiple linear regression earlier. In this article, let's talk about stepwise regression. What is stepwise regression? Literally, it is regression carried out step by step. We know that the "multiple" in multiple regression refers to the independent variables: there are several of them, that is, several X. One of the questions we need to […]

• ## Finding prime numbers

Time：2021-2-17

Eratosthenes, a contemporary of the First Emperor of China and the first person to measure the circumference of the Earth, developed the sieve of Eratosthenes. Idea: in the given table of numbers from 2 to N, erase every multiple of each element (excluding 0, 1, and the element itself); the numbers left are all the primes in that range. Sieve of Eratosthenes […]
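The erasing procedure described in this excerpt can be sketched in Python as follows. This is a minimal illustration, not the original post's code: the function name `primes_up_to` is hypothetical, and starting the erasure at `p * p` is a common optimization assumed here rather than something the excerpt states.

```python
def primes_up_to(n: int) -> list[int]:
    """Return all primes p with 2 <= p <= n (sieve of Eratosthenes)."""
    if n < 2:
        return []
    # is_prime[i] stays True until i is erased as a multiple of a smaller number.
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Erase the multiples of p. Multiples below p * p were already
            # erased when processing smaller primes.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```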

• ## Blue Bridge Cup – sum of squares problem

Time：2021-1-13

For more articles, follow the official account “BLOG of the sea”. Question: The four-square theorem, also known as Lagrange's theorem, states that every positive integer can be expressed as the sum of the squares of at most four positive integers. If 0 is included, it can be expressed as the sum of the squares […]
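The theorem quoted in this excerpt can be checked with a small brute-force search. The sketch below is an illustration under assumptions, not the post's solution: the function name `four_squares` is hypothetical, 0 is allowed as a term, and the choice to return the first tuple found with a <= b <= c <= d is an assumption, since the excerpt is truncated before the problem's exact output rule.

```python
from math import isqrt


def four_squares(n: int):
    """Return (a, b, c, d) with a <= b <= c <= d and a^2 + b^2 + c^2 + d^2 == n."""
    # Because a <= b <= c <= d, we have 4*a^2 <= n, 3*b^2 <= n - a^2, and so on,
    # which bounds each loop.
    for a in range(isqrt(n // 4) + 1):
        for b in range(a, isqrt((n - a * a) // 3) + 1):
            for c in range(b, isqrt((n - a * a - b * b) // 2) + 1):
                rest = n - a * a - b * b - c * c
                d = isqrt(rest)
                if d * d == rest:  # rest >= c^2 here, so d >= c holds
                    return (a, b, c, d)
    return None  # not reached for n >= 0, by Lagrange's theorem


print(four_squares(5))   # (0, 0, 1, 2)
print(four_squares(12))  # (0, 2, 2, 2)
```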

• ## The difference between R-squared and adjusted R-squared in regression analysis

Time：2020-11-19

By Aniruddha Bhandari | Compiled by VK | Source: Analytics Vidhya. Summary: Understand the concepts of R-squared and adjusted R-squared. Understand the key differences between R-squared and adjusted R-squared. Introduction: When I started my journey into data science, the first algorithm I explored was linear regression. After understanding the concept of linear regression and […]

• ## Squared Euclidean distance and sum of squared errors in Spark KMeans, with source code analysis

Time：2020-7-3

1. Euclidean distance: d(x,y) = √( (x[1]-y[1])^2 + (x[2]-y[2])^2 + … + (x[n]-y[n])^2 ). 2. Squared Euclidean distance: The distance formula in Spark KMeans uses the squared Euclidean distance, which is the Euclidean distance squared (that is, without taking the square root): d(x,y) = (x[1]-y[1])^2 + (x[2]-y[2])^2 + … + (x[n]-y[n])^2. 3. Sum of squared error […]
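The two distance formulas in this excerpt can be written out in plain Python as a rough sketch. This is not Spark's source code: the function names are hypothetical, and the `sum_of_squared_errors` helper assumes the usual k-means definition of SSE (each point's squared distance to its nearest center, summed over all points), since the excerpt is cut off before it defines the term.

```python
from math import sqrt


def squared_euclidean(x, y):
    """Sum of squared coordinate differences; no square root is taken."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, y))


def euclidean(x, y):
    """Ordinary Euclidean distance: the square root of the squared distance."""
    return sqrt(squared_euclidean(x, y))


def sum_of_squared_errors(points, centers):
    """Assumed SSE definition: squared distance from each point to its nearest center."""
    return sum(min(squared_euclidean(p, c) for c in centers) for p in points)


print(squared_euclidean([1, 2], [4, 6]))  # 25
print(euclidean([1, 2], [4, 6]))          # 5.0
print(sum_of_squared_errors([[0, 0], [1, 0], [10, 0]], [[0, 0], [10, 0]]))  # 1
```

Skipping the square root does not change which center is nearest to a point, because the square root is monotonic, so the squared form is enough for assigning points to clusters.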

• ## Squared Euclidean distance and error sum in Spark KMeans, with source code analysis

Time：2020-5-21

1. Euclidean distance: d(x,y) = √( (x[1]-y[1])^2 + (x[2]-y[2])^2 + … + (x[n]-y[n])^2 ). 2. Squared Euclidean distance: The distance formula in Spark KMeans uses the squared Euclidean distance, which is the Euclidean distance squared (that is, without taking the square root): d(x,y) = (x[1]-y[1])^2 + (x[2]-y[2])^2 + … + (x[n]-y[n])^2. 3. Sum of squared error (SSE): The […]