In the last section we learned the basic add, delete, modify, and query operations on the Series structure. With those mastered, doing the same on a DataFrame in this section will be very easy~
First, let's construct a DataFrame:
import pandas as pd
data = [[1,2,3], [4,5,6], [7,8,9]]
index = ['a', 'b', 'c']
columns = ['A', 'B', 'C']
df = pd.DataFrame(data=data, index=index, columns=columns)
df

Query
Query specified columns:
>> df['A']
a 1
b 4
c 7
Name: A, dtype: int64
>> df[['A','C']]
A C
a 1 3
b 4 6
c 7 9
Use loc and iloc to query specified rows:
>> df.loc['a']
A 1
B 2
C 3
Name: a, dtype: int64
>> df.iloc[0]
A 1
B 2
C 3
Name: a, dtype: int64
>> df.loc['a':'b']
A B C
a 1 2 3
b 4 5 6
>> df.iloc[:2]
A B C
a 1 2 3
b 4 5 6
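The two slices above return the same rows even though their end points look different. A minimal sketch of why, assuming the same three-row DataFrame as above: a label slice with loc includes its end label, while a position slice with iloc stops before its end position.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                  index=['a', 'b', 'c'], columns=['A', 'B', 'C'])

# Label-based slice: the end label 'b' is included.
by_label = df.loc['a':'b']

# Position-based slice: position 2 is excluded, so rows 0 and 1 are returned.
by_position = df.iloc[0:2]

# Both selections contain rows 'a' and 'b'.
print(by_label.equals(by_position))
```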
In addition, iloc and loc can also take a coordinate (row, column) to query a specific value or region of the DataFrame:
>> df.loc['b','B']
5
>> df.loc['a':'b',['A','C']]
A C
a 1 3
b 4 6
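The examples above use loc with labels. As a sketch of the iloc counterpart, assuming the same DataFrame, the same value and region can be reached with integer positions instead:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                  index=['a', 'b', 'c'], columns=['A', 'B', 'C'])

# Row position 1, column position 1: the same cell as df.loc['b', 'B'].
value = df.iloc[1, 1]

# First two rows, columns at positions 0 and 2:
# the same region as df.loc['a':'b', ['A', 'C']].
region = df.iloc[:2, [0, 2]]

print(value)
print(region)
```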
Finally, there is the frequently used Boolean indexing, which keeps the rows whose mask entry is True:
>> df[[True, False, True]]
A B C
a 1 2 3
c 7 8 9
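In practice the Boolean list is rarely written by hand; it usually comes from a comparison on a column. A small sketch, assuming the same DataFrame (the thresholds are chosen only for illustration):

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                  index=['a', 'b', 'c'], columns=['A', 'B', 'C'])

# Comparing a column with a scalar yields a Boolean Series, one entry per row.
mask = df['A'] > 1

# Rows where the mask is True are kept.
filtered = df[mask]

# Conditions are combined with & (and) or | (or); each one needs parentheses.
combined = df[(df['A'] > 1) & (df['C'] < 9)]

print(filtered)
print(combined)
```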
Modify

Modify the specified value:
>> df.loc['a', 'A']
1
>> df.loc['a', 'A'] = 1000
>> df

Modify index and column names:
>> df.index = ['aa','bb','cc']
>> df.columns = ['AA','BB','CC']
>> df
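Assigning to df.index or df.columns replaces every label at once. When only some labels need to change, the rename method is an alternative. The sketch below assumes a fresh copy of the original DataFrame and is not part of the walkthrough above:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                  index=['a', 'b', 'c'], columns=['A', 'B', 'C'])

# rename takes old -> new mappings and returns a new DataFrame;
# labels that are not mentioned stay unchanged.
renamed = df.rename(index={'a': 'aa'}, columns={'A': 'AA'})

print(renamed.index.tolist())
print(renamed.columns.tolist())
print(df.index.tolist())  # the original DataFrame keeps its labels
```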

Add
Add a row:
>> df.loc['dd'] = [0,0,0]
>> df

Add multiple rows (concatenating two DataFrames vertically). First, construct a new DataFrame df2:
>> df2 = pd.DataFrame(data=[[10,10,10], [100, 100, 100]],
                      index=['dd', 'ee'],
                      columns=['AA', 'BB', 'CC'])
>> df2

Concatenate the two DataFrames:
>> df3 = pd.concat([df, df2])
>> df3

pd.concat only does a simple concatenation; even duplicate index labels are kept rather than overwritten:
>> df3.loc['dd']
AA BB CC
dd 0 0 0
dd 10 10 10
Usually, we use ignore_index=True to regenerate a numeric index:
>> df3 = pd.concat([df, df2], ignore_index=True)
>> df3

Add a column DD to df2:
>> df2['DD'] = [1000, 1000]
>> df2
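Two details of column assignment are worth a quick sketch: a single scalar is broadcast to every row, and a list must have exactly one value per row. The column name XX below is hypothetical and used only to trigger the error:

```python
import pandas as pd

df2 = pd.DataFrame([[10, 10, 10], [100, 100, 100]],
                   index=['dd', 'ee'], columns=['AA', 'BB', 'CC'])

# A scalar fills the whole new column.
df2['DD'] = 1000

# A list whose length differs from the number of rows raises ValueError.
length_error = None
try:
    df2['XX'] = [1, 2, 3]  # three values for two rows
except ValueError as exc:
    length_error = exc

print(df2)
print(length_error)
```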

What about adding multiple columns? We still use pd.concat, but set the parameter axis=1. Let's construct a DataFrame with two rows and two columns, df4:
>> df4 = pd.DataFrame([[1,2],[3,4]], index=['dd','ee'], columns=['E','F'])
>> df4
E F
dd 1 2
ee 3 4
Concatenate df2 and df4:
>> df5 = pd.concat([df2,df4], axis=1)
>> df5

Delete
Delete columns E and F from df5 above:
>> del df5['E']
>> del df5['F']
>> df5

To delete multiple columns, you can also use the drop method, but you must specify axis=1:
>> df5.drop(['CC','DD'], axis=1, inplace=True)
>> df5

You can also use the drop method to delete multiple rows. When deleting rows, the default parameter axis=0 is enough:
>> df5.drop(['dd'], inplace=True)
>> df5
