Author | Soner Yıldırım
Source: Towards Data Science
Machine learning and deep learning models need a lot of data. Their performance depends largely on the amount of data. Therefore, we tend to collect as much data as possible to establish a robust and accurate model. Data is collected in many different formats, from numbers to images, from text to sound waves. However, we need to convert data into numbers for analysis and modeling.
Converting data to scalars (single numbers) is not enough. As the amount of data grows, operating on scalars one at a time becomes inefficient. We need vectorized or matrix operations to compute efficiently. This is where linear algebra comes in.
Linear algebra is one of the most important topics in data science. In this article, we introduce the basic concepts of linear algebra through examples in NumPy.
NumPy is a scientific computing library for Python and the basis of many other libraries (such as pandas).
Object types in linear algebra
The object (or data structure) types in linear algebra are:
Scalar: a single number
Vector: a one-dimensional array of numbers
Matrix: a two-dimensional array of numbers
Tensor: an n-dimensional array of numbers with n > 2
A scalar is a single number. As we will see in a later example, it can be used in vectorized operations.
A vector is an ordered array of numbers. For example, a NumPy array holding 5 numbers is a vector of 5 elements.
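As a minimal sketch, a vector of 5 elements can be created as a one-dimensional NumPy array. The values and the name v are illustrative assumptions, since the article's own numbers are not shown.

```python
import numpy as np

# Illustrative values (assumed); any 5 numbers would do.
v = np.array([1, 3, 5, 7, 9])

print(v)       # the five elements
print(v.ndim)  # 1, because a vector has a single dimension
print(len(v))  # 5
```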
We can use scalars in vectorized operations: the specified operation is applied to each element of the vector. For example, adding a scalar to a vector adds it to every element.
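A small sketch of a scalar applied across a vector follows. The vector values are assumed for illustration.

```python
import numpy as np

v = np.array([1, 3, 5, 7, 9])  # illustrative values (assumed)

added = v + 2    # 2 is added to every element: 3, 5, 7, 9, 11
scaled = v * 2   # every element is doubled: 2, 6, 10, 14, 18

print(added)
print(scaled)
```

No explicit loop is needed; NumPy applies the scalar to each element for us.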
A matrix is a two-dimensional array of numbers.
It looks like a pandas data frame with rows and columns. In fact, a pandas data frame is typically converted into a matrix before it is fed into a machine learning model.
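A matrix can be sketched as a two-dimensional NumPy array. The name A and the values below are assumptions chosen for illustration; the article's original matrix is not shown.

```python
import numpy as np

# A 3x3 matrix with illustrative values (assumed).
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

print(A)       # printed as rows and columns
print(A.ndim)  # 2, because a matrix has two dimensions
```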
A tensor is an N-dimensional array where N is greater than 2. Tensors are mainly used in deep learning models with three-dimensional input data.
It is hard to picture a tensor as a grid of numbers, but you can think of the tensor t as three 3×2 matrices.
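One way to build such a tensor is sketched below. The name t follows the text, but filling it with np.arange is an assumption made only so the example has concrete values.

```python
import numpy as np

# 18 consecutive integers arranged as three 3x2 matrices (values assumed).
t = np.arange(18).reshape(3, 3, 2)

print(t.ndim)  # 3
print(t[0])    # the first of the three 3x2 matrices
```

Indexing the first axis (t[0], t[1], t[2]) returns each 3×2 matrix in turn.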
The shape attribute can be used to check the shape of a NumPy array.
The size of an array is its total number of elements, calculated by multiplying the sizes of its dimensions.
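A short sketch of both attributes, using placeholder arrays of zeros (an assumption; only the shapes matter here):

```python
import numpy as np

A = np.zeros((3, 3))     # a 3x3 matrix
t = np.zeros((3, 3, 2))  # a tensor made of three 3x2 matrices

print(A.shape, A.size)   # (3, 3) 9
print(t.shape, t.size)   # (3, 3, 2) 18
```

Note that shape and size are attributes, so they are read without parentheses.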
Common matrix terms
If the number of rows equals the number of columns, the matrix is called a square matrix. Therefore, the above matrix A is a square matrix.
The identity matrix, denoted I, is a square matrix with ones on the diagonal and zeros in all other positions. NumPy's identity function can be used to create identity matrices of any size.
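For instance, a 3×3 identity matrix can be created like this (the size 3 is an arbitrary choice):

```python
import numpy as np

I = np.identity(3)  # ones on the diagonal, zeros elsewhere

print(I)
```

By default the result holds floating-point values, so the entries print as 1. and 0.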
What is special about the identity matrix is that a matrix does not change when it is multiplied by it. In this sense, it plays the same role as the number 1 among real numbers. We will use the identity matrix as an example in the matrix multiplication section of this article.
The inverse of a matrix is the matrix that, when multiplied by the original matrix, gives the identity matrix.
Not every matrix has an inverse. If matrix A has an inverse, it is called invertible or nonsingular.
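As a hedged illustration of a matrix without an inverse, the sketch below uses a hypothetical matrix S whose second row is twice its first. NumPy's np.linalg.inv raises np.linalg.LinAlgError for such a singular matrix.

```python
import numpy as np

# Hypothetical singular matrix: the second row is 2x the first row.
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])

try:
    np.linalg.inv(S)
    invertible = True
except np.linalg.LinAlgError:
    # Raised by NumPy when the matrix is singular.
    invertible = False

print(invertible)  # False
```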
Dot product and matrix multiplication
The dot product and matrix multiplication are building blocks of complex machine learning and deep learning models, so a thorough understanding of them is very valuable.
The dot product of two vectors is the sum of the products of the elements at matching positions. The first element of the first vector is multiplied by the first element of the second vector, and so on. The sum of these products is the dot product. The NumPy function for calculating the dot product is dot().
Let's first create two simple vectors as NumPy arrays and calculate their dot product.
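A minimal sketch of this step follows. The vector values are inferred from the worked sum given in the text, and the names a and b are assumptions.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([2, 4, 6])

result = np.dot(a, b)
print(result)  # 28
```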
The dot product is calculated as (1 * 2) + (2 * 4) + (3 * 6), i.e. 28.
Because we multiply at the same position, the length of these two vectors must be the same to get the dot product.
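To see the length requirement in practice, here is a small sketch with two vectors of different lengths (values assumed). NumPy raises a ValueError because the shapes are not aligned.

```python
import numpy as np

a = np.array([1, 2, 3])  # length 3
c = np.array([1, 2])     # length 2

try:
    np.dot(a, c)
    failed = False
except ValueError as err:
    failed = True
    print(err)  # NumPy reports that the shapes are not aligned

print(failed)  # True
```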
In the field of data science, we mainly deal with matrices. A matrix is a set of row and column vectors combined in a structured manner. Therefore, the multiplication of two matrices involves many dot product operations of vectors. It will be clearer if we look at some more examples. Let’s first create two 2×2 matrices with numpy.
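The two matrices can be sketched as below. Their values are reconstructed from the rows, columns, and arithmetic quoted in this section, so treat them as an inference rather than the article's verbatim code.

```python
import numpy as np

A = np.array([[4, 2],
              [0, 3]])
B = np.array([[4, 0],
              [1, 4]])

print(A)
print(B)
```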
A 2×2 matrix has 2 rows and 2 columns. Row and column indices start at 0. For example, the first row of A (the row with index 0) is the array [4, 2]. The first column of A is the array [4, 0]. The element in the first row and first column is 4.
We can access a single row, column, or element as follows:
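A sketch of the three kinds of access, using the matrix A described in the text:

```python
import numpy as np

A = np.array([[4, 2],
              [0, 3]])

first_row = A[0]      # row with index 0: 4, 2
first_col = A[:, 0]   # column with index 0: 4, 0
element = A[0, 0]     # first row, first column: 4

print(first_row, first_col, element)
```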
These are important concepts for understanding matrix multiplication.
The multiplication of two matrices involves dot products between the rows of the first matrix and the columns of the second matrix. The first step is the dot product between the first row of A and the first column of B. The result of this dot product is the element of the resulting matrix at position [0,0] (i.e. first row, first column).
Therefore, the resulting matrix C will have (4 * 4) + (2 * 1) in the first row and the first column. C[0,0] = 18.
The next step is the dot product of the first row of A and the second column of B.
C has (4 * 0) + (2 * 4) in the first row and the second column. C[0,1] = 8.
The first row of A is now done, so we move on to the second row of A and follow the same steps.
C has (0 * 4) + (3 * 1) in the second row and the first column. C[1,0] = 3.
The last step is the dot product between the second row of A and the second column of B.
C has (0 * 0) + (3 * 4) in the second row and the second column. C[1,1] = 12.
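The four steps above can be sketched as explicit row-by-column dot products. The values of A and B are the ones implied by the arithmetic in the text; the loop is only an illustration of the procedure, not how one would multiply matrices in practice.

```python
import numpy as np

A = np.array([[4, 2],
              [0, 3]])
B = np.array([[4, 0],
              [1, 4]])

C = np.zeros((2, 2), dtype=int)
for i in range(2):          # rows of A
    for j in range(2):      # columns of B
        # Dot product of row i of A with column j of B.
        C[i, j] = np.dot(A[i], B[:, j])

print(C)  # rows: 18, 8 and 3, 12
```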
We have seen how it is done step by step. All of these operations are performed at once by the np.dot function.
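With the same A and B (values implied by the arithmetic in the text), a single call gives the full result:

```python
import numpy as np

A = np.array([[4, 2],
              [0, 3]])
B = np.array([[4, 0],
              [1, 4]])

C = np.dot(A, B)
print(C)  # rows: 18, 8 and 3, 12
```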
As you may remember, we mentioned that a matrix does not change when it is multiplied by the identity matrix. Let's take an example.
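A minimal sketch, assuming the 2×2 matrix A used earlier in this section:

```python
import numpy as np

A = np.array([[4, 2],
              [0, 3]])
I = np.identity(2)

right = np.dot(A, I)  # A multiplied by I
left = np.dot(I, A)   # I multiplied by A

print(right)
print(np.array_equal(right, A), np.array_equal(left, A))  # True True
```

The result holds floating-point values because np.identity returns floats, but the entries are the same as those of A.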
We also mentioned that when a matrix is multiplied by its inverse, the result is the identity matrix. Let's first create a matrix and then find its inverse. We can use the NumPy function linalg.inv() to find the inverse of a matrix.
Next, multiply C, the inverse of B, by B:
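The sketch below walks through both steps. The values of B are an assumption chosen so the inverse is easy to check by hand; the names B and C follow the text.

```python
import numpy as np

# Illustrative invertible matrix (values assumed).
B = np.array([[2.0, 1.0],
              [1.0, 1.0]])

C = np.linalg.inv(B)     # inverse of B
product = np.dot(C, B)   # should be the 2x2 identity matrix

print(C)
print(product)
# Compare with a tolerance, since floating-point rounding can leave tiny errors.
print(np.allclose(product, np.identity(2)))  # True
```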
We get the identity matrix.
Recall from the vector dot product that two vectors must have the same length to have a dot product. Every dot product in a matrix multiplication must follow this rule. The dot products are taken between the rows of the first matrix and the columns of the second matrix, so the rows of the first matrix and the columns of the second matrix must have the same length.
The requirement of matrix multiplication is that the number of columns of the first matrix must be equal to the number of rows of the second matrix.
For example, we can multiply a 3×2 matrix by a 2×3 matrix.
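Here is a small sketch of such a product. The matrices are filled with np.arange purely for illustration; only their shapes matter.

```python
import numpy as np

A = np.arange(6).reshape(3, 2)  # 3 rows, 2 columns
B = np.arange(6).reshape(2, 3)  # 2 rows, 3 columns

C = np.dot(A, B)

print(C)
print(C.shape)  # (3, 3)
```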
The shape of the resulting matrix will be 3×3, because we compute 3 dot products for each row of A (one per column of B), and A has 3 rows. A simple way to determine the shape of the result is to take the number of rows from the first matrix and the number of columns from the second matrix:
- 3×2 and 2×3 multiply to return 3×3
- 3×2 and 2×2 multiply to return 3×2
- 2×4 and 4×3 multiply to return 2×3
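The three cases above can be checked with a short sketch. The helper product_shape is hypothetical (it is not part of NumPy) and simply encodes the rule; the matrices of ones are placeholders.

```python
import numpy as np

def product_shape(first, second):
    """Hypothetical helper: shape of first x second, or None if incompatible."""
    if first[1] != second[0]:
        return None  # columns of first must equal rows of second
    return (first[0], second[1])

cases = [((3, 2), (2, 3)), ((3, 2), (2, 2)), ((2, 4), (4, 3))]
shapes = []
for first, second in cases:
    result = np.dot(np.ones(first), np.ones(second))
    shapes.append(result.shape)
    print(first, second, result.shape, product_shape(first, second))
```

Calling product_shape((3, 2), (3, 2)) returns None, matching the fact that NumPy would refuse that multiplication.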
We have discussed the basic operations of linear algebra. These operations are the foundation on which complex machine learning and deep learning models are built. Model optimization involves a large number of matrix multiplications, so it is important to understand these basics.
Thank you for reading. If you have any feedback, please let me know.