A complete collection of Python methods for working with CSV files

Time:2021-12-8
Contents
  • (1) CSV format file
  • (2) Working with CSV text using the csv library
  • (3) Working with CSV files using the pandas library
  • Summary

(1) CSV format file

1. Description

CSV (comma-separated values) is a plain-text file format in which each line is one record and the fields within a record are separated by commas. It is a common import and export format for databases and spreadsheets.
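
To make the format concrete, here is a minimal sketch that parses a small piece of CSV text held in memory. The sample rows are invented for illustration and are not the article's test.csv. Note that a quoted field may itself contain a comma.

```python
import csv
import io

# Hypothetical sample data, not taken from the article's test.csv.
sample_text = 'name,age,city\nAlice,30,"Paris, France"\nBob,25,Berlin\n'

# csv.reader accepts any iterable of lines, so an in-memory buffer works.
rows = list(csv.reader(io.StringIO(sample_text)))
for row in rows:
    print(row)
```

Every field comes back as a string, so numeric columns such as age still need an explicit int() conversion.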

(2) Working with CSV text using the csv library

The examples in this section operate on a table stored in D:\test.csv with the columns name, age, occupation, home address, and salary.

1. How to read the header

# Method 1
import csv
with open("D:\\test.csv") as f:
    reader = csv.reader(f)
    rows = [row for row in reader]
    print(rows[0])


----------
# Method 2
import csv
with open("D:\\test.csv") as f:
    # 1. Create the reader object
    reader = csv.reader(f)
    # 2. Read the first line of the file (the header)
    head_row = next(reader)
    print(head_row)

Result demonstration: ['name', 'age', 'occupation', 'home address', 'salary']

2. Read a column of data in the file

# 1. Get the first column of the file (header included)
import csv
with open("D:\\test.csv") as f:
    reader = csv.reader(f)
    column = [row[0] for row in reader]
    print(column)

Result demonstration: ['name', 'Zhang San', 'Li Si', 'Wang Wu', 'Kaina']
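
If the column is easier to identify by name than by position, csv.DictReader is one possible alternative. The sketch below is not the article's method; it uses invented in-memory data in place of D:\test.csv.

```python
import csv
import io

# Hypothetical data standing in for a file on disk.
sample_text = "name,age,salary\nZhang San,22,6000\nLi Si,26,8000\n"

# DictReader uses the first row as keys, so each data row becomes a dict.
reader = csv.DictReader(io.StringIO(sample_text))
names = [row["name"] for row in reader]
print(names)
```

Unlike the row[0] approach, the header row is consumed as the keys and does not appear in the resulting list.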

3. Write data to CSV file

# 1. Append a row to the CSV file
import csv
# newline='' prevents blank lines between rows on Windows
with open("D:\\test.csv", 'a', newline='') as f:
    row = ['Cao Cao', '23', 'student', 'Heilongjiang', '5000']
    write = csv.writer(f)
    write.writerow(row)
    print("write completed!")

Result demonstration: because the file is opened in 'a' (append) mode, the new row is added at the end of test.csv.
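
To append several rows at once, writerows can be used in place of repeated writerow calls. The following is a sketch under assumptions: the rows are invented, and a temporary file stands in for D:\test.csv so that the example can run anywhere.

```python
import csv
import os
import tempfile

# Hypothetical rows; a temporary file stands in for D:\test.csv.
new_rows = [
    ["Cao Cao", "23", "student", "Heilongjiang", "5000"],
    ["Liu Bei", "30", "teacher", "Chengdu", "7000"],
]
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# newline='' lets the csv module control the line endings itself.
with open(path, "a", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(new_rows)

# Read the file back to confirm what was written.
with open(path, newline="", encoding="utf-8") as f:
    written = list(csv.reader(f))
print(written)
```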

4. Get the file header and its index

import csv
with open("D:\\test.csv") as f:
    #1. Create reader object
    reader = csv.reader(f)
    #2. Read the data in the first line of the file
    head_row=next(reader)
    print(head_row)
    # 3. Print each column header together with its index
    for index,column_header in enumerate(head_row):
        print(index,column_header)

Result demonstration:
['name', 'age', 'occupation', 'home address', 'salary']
0 name
1 age
2 occupation
3 home address
4 salary
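
Once the header and its indexes are known, they can be stored in a dictionary so that later code looks up columns by name. A minimal sketch, assuming invented in-memory data and a hypothetical variable name index_of:

```python
import csv
import io

# Hypothetical data standing in for test.csv.
sample_text = "name,age,salary\nZhang San,22,6000\nLi Si,26,8000\n"

reader = csv.reader(io.StringIO(sample_text))
header = next(reader)
# Map each column name to its position in the row.
index_of = {name: i for i, name in enumerate(header)}
# Use the mapping instead of a hard-coded column number.
salaries = [int(row[index_of["salary"]]) for row in reader]
print(index_of)
print(salaries)
```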

5. Get the maximum value of a column

# ['name', 'age', 'occupation', 'home address', 'salary']
import csv
with open("D:\\test.csv") as f:
    reader = csv.reader(f)
    header_row = next(reader)
    # print(header_row)
    salary = []
    for row in reader:
        # Save the fifth column to the list salary, converted to int
        salary.append(int(row[4]))
    print(salary)
    print("the maximum salary of employees is:" + str(max(salary)))

Result demonstration: the maximum salary of employees is:10000
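
The maximum alone does not say which employee earns it. One way to keep the whole row is to pass a key function to max. This is a sketch on invented in-memory rows, not on the article's file.

```python
import csv
import io

# Hypothetical rows with the same column layout as the article's header.
sample_text = (
    "name,age,occupation,home address,salary\n"
    "Zhang San,22,cook,Beijing,6000\n"
    "Wang Wu,28,programmer,Shenzhen,10000\n"
)
reader = csv.reader(io.StringIO(sample_text))
next(reader)  # skip the header row
# Compare rows by their fifth column, converted to int.
top_row = max(reader, key=lambda row: int(row[4]))
print(top_row[0], top_row[4])
```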

6. Copy CSV format file

The original file is test.csv; the copy is written to Aim.csv.

import csv
# 1. newline='' eliminates the blank lines between rows
with open('test.csv', newline='') as f, \
        open('Aim.csv', 'w', newline='') as aim_file:
    reader = csv.reader(f)
    write = csv.writer(aim_file)
    # 2. Traverse the rows and write each one to Aim.csv
    for row in reader:
        write.writerow(row)

01. Without the keyword argument newline='', Aim.csv has a blank line after every row (this happens on Windows).

02. With the keyword argument newline='', Aim.csv contains the rows with no blank lines between them.
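
The reason for the blank lines can be seen by looking at what csv.writer actually emits. By default it ends every row with \r\n. When a file is opened in text mode on Windows without newline='', the \n is translated to \r\n again, giving \r\r\n, which shows up as an empty line. A small sketch that writes to an in-memory buffer to show the raw characters:

```python
import csv
import io

buffer = io.StringIO()
csv.writer(buffer).writerow(["a", "b"])
raw = buffer.getvalue()
# repr() makes the line terminator visible.
print(repr(raw))
```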

(3) Working with CSV files using the pandas library

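CSV file content: the file has the columns name, age, occupation, home address, and salary, with five rows of data (printed in full in step 2 below).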

1. Install the pandas library: pip install pandas

2. Read all data from CSV file


import pandas as pd
path = 'D:\\test.csv'
with open(path) as file:
    data = pd.read_csv(file)
    print(data)

Result demonstration:
        name  age    occupation     home address  salary
0  Zhang San   22          cook          Beijing    6000
1      Li Si   26  photographer  Changsha, Hunan    8000
2    Wang Wu   28    programmer         Shenzhen   10000
3      Kaina   22       student     Heilongjiang    2000
4    Cao Cao   28         sales         Shanghai    6000

3. Summary statistics with the describe() method

import pandas as pd
path = 'D:\\test.csv'
with open(path) as file:
    data = pd.read_csv(file)
    # To learn more about describe(), Ctrl + left-click it in the IDE
    print(data.describe())

Result demonstration:
            age        salary
count   5.00000      5.000000
mean   25.20000   6400.000000
std     3.03315   2966.479395
min    22.00000   2000.000000
25%    22.00000   6000.000000
50%    26.00000   6000.000000
75%    28.00000   8000.000000
max    28.00000  10000.000000

4. Read the first few lines of the file

import pandas as pd
path = 'D:\\test.csv'
with open(path) as file:
    data = pd.read_csv(file)
    # Read the first 2 rows of data
    # head_datas = data.head(0)
    head_datas = data.head(2)
    print(head_datas)

Result demonstration:
        name  age    occupation     home address  salary
0  Zhang San   22          cook          Beijing    6000
1      Li Si   26  photographer  Changsha, Hunan    8000
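
When only the first few rows are needed, a possible alternative is to stop reading early instead of loading the whole file and then calling head(). The sketch below uses the nrows parameter of read_csv. The in-memory sample rows are an assumption for illustration.

```python
import io

import pandas as pd

# Hypothetical data standing in for D:\test.csv.
sample_text = (
    "name,age,salary\n"
    "Zhang San,22,6000\n"
    "Li Si,26,8000\n"
    "Wang Wu,28,10000\n"
)
# nrows=2 tells read_csv to parse only the first two data rows.
first_two = pd.read_csv(io.StringIO(sample_text), nrows=2)
print(first_two)
```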

5. Read all data in a row

import pandas as pd
path = 'D:\\test.csv'
with open(path) as file:
    data = pd.read_csv(file)
    # Read all data in the first row.
    # .ix was removed in pandas 1.0; .iloc selects by integer position.
    print(data.iloc[0])

Result demonstration:
name            Zhang San
age                    22
occupation           cook
home address      Beijing
salary               6000

6. Read some rows of data

import pandas as pd
path = 'D:\\test.csv'
with open(path) as file:
    data = pd.read_csv(file)
    # Read all data of the first, second and fourth rows
    # (.ix was removed in pandas 1.0; .iloc selects by position)
    print(data.iloc[[0, 1, 3], :])

Result demonstration:
        name  age    occupation     home address  salary
0  Zhang San   22          cook          Beijing    6000
1      Li Si   26  photographer  Changsha, Hunan    8000
3      Kaina   22       student     Heilongjiang    2000

7. Read all row and column data

import pandas as pd
path = 'D:\\test.csv'
with open(path) as file:
    data = pd.read_csv(file)
    # Read all row and column data
    # (.ix was removed in pandas 1.0; .iloc selects by position)
    print(data.iloc[:, :])

Result demonstration:
        name  age    occupation     home address  salary
0  Zhang San   22          cook          Beijing    6000
1      Li Si   26  photographer  Changsha, Hunan    8000
2    Wang Wu   28    programmer         Shenzhen   10000
3      Kaina   22       student     Heilongjiang    2000
4    Cao Cao   28         sales         Shanghai    6000

8. Read all row data of a column

import pandas as pd
path = 'D:\\test.csv'
with open(path) as file:
    data = pd.read_csv(file)
    # By position: print(data.iloc[:, 4])
    # By label (.ix was removed in pandas 1.0):
    print(data.loc[:, 'salary'])

Result demonstration:
0     6000
1     8000
2    10000
3     2000
4     6000
Name: salary, dtype: int64

9. Read some rows of some columns

import pandas as pd
path = 'D:\\test.csv'
with open(path) as file:
    data = pd.read_csv(file)
    # Rows 0, 1 and 3 of the named columns (.ix was removed in pandas 1.0)
    print(data.loc[[0, 1, 3], ['name', 'occupation', 'salary']])

Result demonstration:
        name    occupation  salary
0  Zhang San          cook    6000
1      Li Si  photographer    8000
3      Kaina       student    2000

10. Read the data corresponding to a row and a column

import pandas as pd
path = 'D:\\test.csv'
with open(path) as file:
    data = pd.read_csv(file)
    # Read the third column of the third row
    # (.ix was removed in pandas 1.0; .iloc selects by position)
    print("occupation --" + data.iloc[2, 2])

Result demonstration: occupation --programmer
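
The row and column selections in steps 5 to 10 can be made either by integer position or by label. A minimal sketch of the difference, using a small invented DataFrame whose index is deliberately not 0, 1, 2 so that the two kinds of lookup cannot be confused:

```python
import pandas as pd

# Hypothetical data; the index labels 10, 20, 30 are chosen for illustration.
data = pd.DataFrame(
    {"name": ["Zhang San", "Li Si", "Wang Wu"], "salary": [6000, 8000, 10000]},
    index=[10, 20, 30],
)
by_position = data.iloc[0, 1]      # first row, second column
by_label = data.loc[10, "salary"]  # row labelled 10, column "salary"
print(by_position, by_label)
```

With the default index produced by read_csv, the row labels happen to equal the row positions, which is why both styles return the same rows in the examples above.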

11. Import and export of CSV data (copy CSV file)

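01. Read the data:

import pandas as pd
# 1. Read in the data; file is a path or an open file object
data = pd.read_csv(file)

02. Write the data:

import pandas as pd
# 1. Write out the data. The target file is Aim.csv.
# index=False leaves the row index out, so the columns match the source.
data.to_csv('Aim.csv', index=False)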

Other:

01. Read data from a URL:
import pandas as pd
data_url = "https://raw.githubusercontent.com/mwaskom/seaborn-data/master/tips.csv"
# Pass the URL directly to read_csv
df = pd.read_csv(data_url)


----------
02. Read data from an Excel file:
import pandas as pd
# filepath is the path to the Excel file
data = pd.read_excel(filepath)

Example demonstration:

1. Start from the original file test.csv.

2. Copy the contents of test.csv to Aim.csv:

import pandas as pd
# 1. Read the data in the file
data = pd.read_csv('test.csv')
# 2. Write the data to the target file Aim.csv (index=False omits the row index)
data.to_csv('Aim.csv', index=False)
print(data)

Result demonstration: Aim.csv now contains the data read from test.csv, and the DataFrame is printed to the console.
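
Whether the copy matches the source column for column depends on the index argument of to_csv. The sketch below compares the two settings on a small invented DataFrame. Calling to_csv without a path returns the CSV text as a string, which keeps the example free of files on disk.

```python
import io

import pandas as pd

# Hypothetical data for illustration.
data = pd.DataFrame({"name": ["Zhang San", "Li Si"], "salary": [6000, 8000]})

with_index = data.to_csv()                # default: row index is the first column
without_index = data.to_csv(index=False)  # only the data columns
print(with_index)
print(without_index)

# Reading the index-free text back gives the original columns again.
copy = pd.read_csv(io.StringIO(without_index))
```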

Note: the pandas module handles Excel files in much the same way as CSV files.

Reference documents: https://docs.python.org/3.6/library/csv.html

summary

This concludes the article on working with CSV files in Python.
