Adding new data columns in pandas

Time:2021-11-30
There are four ways to add a data column:
  1. Direct assignment
  2. The df.apply method
  3. The df.assign method
  4. Select groups of rows by condition and assign a value to each group
1. Direct assignment

Note that df["bWendu"] is actually a Series, and subtracting one Series from another also returns a Series.

df.loc[:,"wencha"] = df["bWendu"] - df["yWendu"]
print(df.head())

The output now has one more column, wencha.
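
As a minimal, self-contained sketch of direct assignment, the snippet below builds a small DataFrame and adds the difference column. The sample values are made up for illustration and are not taken from the original dataset; only the column names bWendu, yWendu and wencha come from the text above.

```python
import pandas as pd

# Hypothetical sample data (values are illustrative only).
df = pd.DataFrame(
    {
        "bWendu": [3, 2, 2],     # daily maximum temperature
        "yWendu": [-6, -5, -5],  # daily minimum temperature
    }
)

# Subtracting one Series from another gives a new Series on the same index.
diff = df["bWendu"] - df["yWendu"]

# Assigning to a column label that does not exist yet creates the column.
df.loc[:, "wencha"] = diff

print(type(diff).__name__)
print(df.head())
```

Writing df["wencha"] = diff should have the same effect here; the .loc form is kept to match the code above.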
2. df.apply method

Example: add a column of temperature types, using these rules:

  1. If the maximum temperature is above 33 degrees, it is high temperature
  2. If the minimum temperature is below -10 degrees, it is low temperature
  3. Otherwise it is normal temperature

Note that you need to set axis=1, so that the function receives each row as a Series.
def get_wendu_type(x):
    # x is one row of df (a Series indexed by column name)
    if x["bWendu"] > 33:
        return "high temperature"
    if x["yWendu"] < -10:
        return "low temperature"
    return "normal temperature"

df.loc[:,"wendu_type"] = df.apply(get_wendu_type,axis = 1)
print(df)
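
The following is a runnable sketch of the same idea on made-up data, chosen so that each of the three rules is hit once. The sample values and the helper variable row_types are assumptions added for illustration; row_types only serves to show what the function receives when axis=1.

```python
import pandas as pd

# Hypothetical sample data: one hot day, one cold day, one ordinary day.
df = pd.DataFrame({"bWendu": [35, 3, 20], "yWendu": [25, -12, 10]})


def get_wendu_type(row):
    # With axis=1, `row` is a Series whose index is the column names.
    if row["bWendu"] > 33:
        return "high temperature"
    if row["yWendu"] < -10:
        return "low temperature"
    return "normal temperature"


# Confirm the type of object passed to the function for each row.
row_types = df.apply(lambda row: type(row).__name__, axis=1)

df.loc[:, "wendu_type"] = df.apply(get_wendu_type, axis=1)
print(row_types.unique())
print(df)
```

Because apply with axis=1 calls a Python function once per row, it can be slow on large tables; it is used here because it mirrors the rules as written.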

Count how many rows fall into each temperature type:

wendu_count = df["wendu_type"].value_counts()
print(wendu_count)
[Figure: counts for each temperature type]
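
To show what value_counts returns, here is a small sketch on a hand-written Series of labels. The labels and their frequencies are invented for illustration and do not reflect the original data.

```python
import pandas as pd

# Hypothetical labels standing in for the wendu_type column.
wendu_type = pd.Series(
    [
        "normal temperature",
        "high temperature",
        "normal temperature",
        "low temperature",
        "normal temperature",
        "high temperature",
    ],
    name="wendu_type",
)

# value_counts returns a Series: index = distinct values, values = counts,
# sorted from most to least frequent by default.
counts = wendu_type.value_counts()

# normalize=True returns proportions instead of raw counts.
shares = wendu_type.value_counts(normalize=True)

print(counts)
print(shares)
```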
3. df.assign method

You can add several new columns at once. df.assign does not modify df itself; it returns a new DataFrame.

c = df.assign(
    # Celsius to Fahrenheit
    yWendu_huashi=lambda x: x["yWendu"] * 9 / 5 + 32,
    bWendu_huashi=lambda x: x["bWendu"] * 9 / 5 + 32,
)

print(c.head())
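
The sketch below, on made-up values, checks the point that assign leaves the original DataFrame untouched. The sample data is an assumption for illustration; the column names follow the code above.

```python
import pandas as pd

# Hypothetical sample data in degrees Celsius.
df = pd.DataFrame({"bWendu": [5, 35, 20], "yWendu": [-5, 25, 10]})

# Each keyword becomes a new column; the lambda receives the DataFrame.
c = df.assign(
    yWendu_huashi=lambda x: x["yWendu"] * 9 / 5 + 32,
    bWendu_huashi=lambda x: x["bWendu"] * 9 / 5 + 32,
)

print(list(df.columns))  # the original keeps its two columns
print(list(c.columns))   # the returned copy has four
print(c.head())
```

To keep the new columns under the original name, one option is to rebind it: df = df.assign(...).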
4. Select groups according to conditions and assign values respectively

Select the rows that meet a condition, then assign a value in the new column for just those rows.
Example: if the difference between the maximum and minimum temperature is greater than 10 degrees, the temperature difference is considered large.

# Create an empty column first (this uses the first method, direct assignment)
df["wencha_type"] = ""

# Fill the column separately for each group of rows
df.loc[df["bWendu"] - df["yWendu"] > 10, "wencha_type"] = "large temperature difference"
df.loc[df["bWendu"] - df["yWendu"] <= 10, "wencha_type"] = "normal temperature difference"
wencha_count = df["wencha_type"].value_counts()
print(wencha_count)
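
A self-contained sketch of the same pattern on made-up data follows. The values are assumptions for illustration, and the boolean mask is stored in a variable named large (a name introduced here) so that the two groups are guaranteed to be complementary.

```python
import pandas as pd

# Hypothetical sample data; the differences are 9, 10 and 15 degrees.
df = pd.DataFrame({"bWendu": [3, 35, 20], "yWendu": [-6, 25, 5]})

# Start with an empty column, then fill it group by group.
df["wencha_type"] = ""

# Boolean Series: True where the daily difference exceeds 10 degrees.
large = df["bWendu"] - df["yWendu"] > 10

df.loc[large, "wencha_type"] = "large temperature difference"
df.loc[~large, "wencha_type"] = "normal temperature difference"

wencha_count = df["wencha_type"].value_counts()
print(df)
print(wencha_count)
```

Reusing one mask and its negation avoids writing the condition twice, which is one way to make sure no row is left with the empty placeholder.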