Tag: Data analysis
-
Data science distribution – beta distribution
Contents: beta distribution concept; parameter influence; quantity proportion; randomly generated data; probability density function; cumulative distribution function. Concept: The beta distribution is the conjugate prior distribution of the Bernoulli distribution and the binomial distribution. It has important applications in machine learning and mathematical statistics. In probability theory, the beta distribution, also known as the Β distribution […]
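The conjugate-prior property mentioned in the excerpt can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the article: the `Beta` class, the uniform `Beta(1, 1)` prior, and the counts of 7 successes and 3 failures are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beta:
    """Beta(alpha, beta) distribution over a success probability in [0, 1]."""

    alpha: float
    beta: float

    def mean(self) -> float:
        # Mean of a Beta(alpha, beta) distribution.
        return self.alpha / (self.alpha + self.beta)

    def update(self, successes: int, failures: int) -> "Beta":
        # Conjugacy: observing Bernoulli/binomial data turns Beta(a, b)
        # into Beta(a + successes, b + failures).
        return Beta(self.alpha + successes, self.beta + failures)


prior = Beta(1.0, 1.0)           # uniform prior (assumed for illustration)
posterior = prior.update(7, 3)   # hypothetical data: 7 successes, 3 failures
print(posterior, posterior.mean())
```

Because the posterior is again a beta distribution, the update reduces to adding the observed counts to the two parameters.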
-
Redis transaction (optimistic lock and pessimistic lock)
Opening: The previous articles got you through the door. Today we will talk about Redis transactions. We have studied relational databases before and know how important transactions are. In fact, Redis transactions are a little different from those of relational databases. Let us look at them in detail. Redis transaction: Concept: Redis transaction […]
-
Seaborn, the Python data visualization powerhouse: learn it and you can handle 90% of data analysis plotting
Python data visualization. Personal home page: JOJO’s data analysis Adventure. Personal introduction: I am a senior majoring in statistics and have been recommended for admission to a postgraduate statistics program at a top-3 statistics university. If the article is helpful to you, you are welcome to follow, like, collect, and subscribe to the column. This series mainly introduces the application […]
-
Python picture character recognition — installation and use of Tesseract OCR under Windows
Contents: preface; installation and configuration of Tesseract OCR under Windows; introduction and version selection of Tesseract OCR; Tesseract OCR installation; Tesseract OCR configuration; installing the dependencies required for Python to call the Tesseract API; Tesseract OCR test and use; command line mode; Calling […]
-
Fundamentals of Python data analysis 005 – detailed explanation of pandas: this introduction to pandas is all you need
Table of contents: preface. (1) Introduction to pandas basics: 1. What is pandas 2. Why learn pandas 3. Installing pandas 4. Importing the pandas library. (2) Common pandas data types: 1. Series (one-dimensional labeled array) 1.1 Creating an index 1.2 Creating a Series from a dictionary 1.3 Slicing and indexing a Series 1.3.1 Displaying a value 1.3.2 display […]
-
Machine learning series (1): Data analysis: Kaggle Titanic disaster
This blog analyzes the passenger information from the Titanic accident to draw some conclusions about correlations, and uses Python visualization to present them more concretely. Note: references for this blog: 1. Introduction to Kaggle – the disaster of the Titanic (a Book) 2. Machine learning series (3): Application of logistic regression: Kaggle, the disaster of the Titanic 3. Several […]
-
Does GPU acceleration bring new challenges and opportunities to vector big data analysis? (case sharing)
Recently, I have often visited foreign websites in my spare time to learn about new technologies. I saw an article on PostGIS vs GPU in the PostGIS community. I also saw an article on spatial data joins using the GPU, which was very interesting too. After reading it, I built an environment and ran through […]
-
Decision tree algorithm
Contents: 1. Decision tree principle 2. Decision tree API 3. Case: Titanic passenger survival prediction 4. Decision tree summary. 1. Decision tree principle. Understanding decision trees: the idea behind the decision tree has a very simple origin, namely the conditional branch structure in programming, the if-then structure. The earliest decision tree is a classification learning method […]
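The link between decision trees and the if-then structure can be made concrete with a hand-written toy tree. This is only a sketch: the function name `predict_survival`, the chosen features, and every threshold below are hypothetical and are not learned from the Titanic data or taken from the article.

```python
def predict_survival(sex: str, age: float, pclass: int) -> bool:
    """Toy decision tree written as nested if-then branches.

    All split rules here are hypothetical placeholders; a real decision
    tree would learn its splits from training data.
    """
    if sex == "female":
        # First split on sex, then on passenger class.
        return pclass <= 2
    # Male branch: split on age.
    return age < 10


print(predict_survival("female", 30.0, 1))
print(predict_survival("male", 40.0, 3))
```

Each internal node of a tree corresponds to one `if` test, and each leaf corresponds to a returned class label.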
-
Data visualization – draw a simple line chart
✅ About the author: Hello, I’m hacker 707. You can call me hacker. Personal homepage: CSDN blog of hacker 707. Series column: python. If you think the blogger’s article is good, please give the blogger your support. Data visualization – draw line chart: Draw a simple line chart; Change label text and line thickness; Correcting the plot; Use scatter() to […]
-
Detailed explanation of the usage of iloc and loc in Python pandas data analysis
Pandas is a fast and efficient data analysis tool for Python. It can be used for data mining and data analysis, and also provides data cleaning functions. The contents of this part are as follows: 1. iloc 1. Definition: the iloc indexer selects data purely by integer position. 2. […]
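A short sketch of position-based selection with `iloc`, with `loc` shown for contrast. The DataFrame contents and the labels `x`, `y`, `z` are made up for illustration and do not come from the article.

```python
import pandas as pd

# Small illustrative frame with string row labels (hypothetical data).
df = pd.DataFrame(
    {"name": ["a", "b", "c"], "score": [90, 80, 70]},
    index=["x", "y", "z"],
)

first_row = df.iloc[0]            # row at integer position 0 (label "x")
cell = df.iloc[1, 1]              # row position 1, column position 1
last_two = df.iloc[-2:]           # last two rows by position
by_label = df.loc["y", "score"]   # loc selects by label instead of position

print(first_row["name"], cell, list(last_two.index), by_label)
```

`iloc` ignores the index labels entirely, so `df.iloc[1, 1]` and `df.loc["y", "score"]` address the same cell here only because row `y` happens to sit at position 1.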
-
Data cleaning in Python data analysis (using motorcycle sales data as an example)
Table of contents: 1. Get the dataset and look for problems: 1. Read the dataset description 2. View the data and find problems. 2. Cleaning steps: 1. Data format conversion 2. Deduplication 3. Missing value handling 4. Outlier handling 5. Data discretization. References. 1. Get the dataset and look for problems 1. Read the dataset description 2. […]
-
[Python] Use pandas to merge all Excel files in a folder
Source code: import pandas as pd import os # Target folder you want to merge target_dir = 'C:/Users/Kinglake/Desktop/666/' # Get the list of file names in this directory print("merge", target_dir, "path file:") for root, dirs, files in os.walk(target_dir): print("total", len(files), "PCs") # Generate absolute path for index in range(len(files)): files[index] = target_dir + files[index] # print(files) # Read in files and merge for index in […]