How to write simple and beautiful Python code

Time:2020-11-23

By Aniruddha Bhandari
Compile | VK
Source | Analytics Vidhya

Summary

  • This Python style guide will help you write neat and beautiful Python code

  • Learn about the different Python conventions and other nuances of Python programming in this style guide

Introduction

Have you ever come across a badly written piece of Python code? I know a lot of you will nod.

Writing code is part of a data scientist's or analyst's role. Writing beautiful, neat Python code, on the other hand, is a different matter altogether. It can shape how others see you as a proficient programmer in analytics or data science (or even in software development).

So how do we write this so-called beautiful Python code?

Welcome to the Python style guide

Many people in data science and analytics come from a non-programming background. We start by learning the basics of programming, move on to the theory behind machine learning, and then set out to conquer datasets.

In the process, we often don't practice core programming or pay attention to programming conventions.

That is the gap this Python style guide addresses. We'll go over the Python programming conventions described in the PEP 8 document, and you'll come out a better programmer!

Table of contents

  • Why is this Python style guide important for data science?

  • What is PEP 8?

  • Understanding Python naming conventions

  • Code layout in the Python style guide

  • Getting familiar with proper Python comments

  • Whitespace in Python code

  • General Python advice for programming

  • Autoformatting your Python code

Why is this Python style guide important for data science?

There are several reasons why formatting is an important aspect of programming, especially for data science projects:

  • Readability

Good code formatting inevitably improves the readability of your code. It not only makes your code look more organized but also makes it easier for readers to understand what is going on in the program. This is especially helpful when your program runs to thousands of lines.

You will have plenty of dataframes, lists, functions, plots, and so on. If you don't follow proper formatting guidelines, you can easily lose track of your own code!

  • Collaboration

If you are collaborating on a team project, as most data scientists do, good formatting becomes essential.

It ensures that the code is understood correctly and without hassle. Following a common formatting pattern also keeps the program consistent throughout the project life cycle.

  • Bug fixes

Well-formatted code also helps when you need to fix bugs in your program. A badly chosen name or sloppy formatting can easily turn debugging into a nightmare!

So it is best to start writing your programs in the right style from the beginning!

With that in mind, let's take a quick look at the PEP 8 style guide that this article covers!

What is PEP 8?

PEP 8, short for Python Enhancement Proposal 8, is a style guide for Python programming. It was written by Guido van Rossum, Barry Warsaw, and Nick Coghlan. It describes the rules for writing beautiful and readable Python code.

Following the PEP 8 coding style keeps Python code consistent, making it easier for other readers, contributors, or yourself to understand.

This article covers the most important aspects of the PEP 8 guidelines: how to name Python objects, how to structure your code, when to include comments and whitespace, and finally some general programming recommendations that are important but easily overlooked by most Python programmers.

Let’s learn to write better code!

The official PEP 8 documentation can be found here:

https://www.python.org/dev/peps/pep-0008/

Understanding Python naming conventions

Shakespeare famously asked, "What's in a name?" Had he met a programmer back then, he would have got a quick reply: "A lot!"

Yes, when you write a piece of code, the names you choose for variables, functions, and so on have a great impact on how understandable the code is. Take a look at the following code:

# Function 1
def func(x):
    a = x.split()[0]
    b = x.split()[1]
    return a, b
print(func('Analytics Vidhya'))

# Function 2
def name_split(full_name):
    first_name = full_name.split()[0]
    last_name = full_name.split()[1]
    return first_name, last_name
print(name_split('Analytics Vidhya'))

# Output
('Analytics', 'Vidhya')
('Analytics', 'Vidhya')

The two functions work the same, but the latter provides a better sense of what’s going on, even without any comments!

That’s why choosing the right name and following the right naming convention can make a huge difference when writing a program. Having said that, let’s take a look at how to name objects in Python!

General naming tips

These tips apply to naming any entity and should be followed consistently.

  • Follow the same pattern
thisVariable, ThatVariable, some_other_variable, BIG_NO
  • Avoid overly long names, but don't be too frugal either
this_could_be_a_bad_name = "Avoid this!"
t = "This isn't good either"
  • Use sensible and descriptive names. This will help you remember the purpose of the code later
x = "my name"  # Avoid this
full_name = "my name"  # This is better
  • Avoid names that start with numbers
1_name = "This is bad!"
  • Avoid special characters such as @, !, #, and $ in names
phone_@  # Not good
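The last two tips are more than style: names that start with a digit or contain special characters are rejected by the interpreter. As a small sketch, the built-in str.isidentifier() method can be used to check a few candidate names. The list of candidates is made up for illustration.

```python
# Candidate names, chosen only for illustration.
candidates = ["full_name", "some_other_variable", "1_name", "phone_@"]

# str.isidentifier() reports whether a string is a syntactically valid name.
valid_names = [name for name in candidates if name.isidentifier()]
invalid_names = [name for name in candidates if not name.isidentifier()]

print(valid_names)    # ['full_name', 'some_other_variable']
print(invalid_names)  # ['1_name', 'phone_@']
```

Note that isidentifier() only checks syntax. It says nothing about whether a name is descriptive, which is still up to you.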

Variable naming

  • Variable names should always be lowercase
blog = "Analytics Vidhya"
  • For longer variable names, separate words with underscores. This improves readability
awesome_blog = "Analytics Vidhya"
  • Try not to use single-character variable names such as "I" (uppercase i), "O" (uppercase o), or "l" (lowercase L). They can be indistinguishable from the numerals 1 and 0. Have a look:
O = 0 + l + I + 1
  • Global variables follow the same convention

Function naming

  • Follow the lowercase-with-underscores naming convention

  • Use expressive names

# Avoid
def con():
    ...

# This is better
def connect():
    ...
  • If a function parameter name clashes with a keyword, use a trailing underscore instead of an abbreviation. For example, turn break into break_ rather than brk
# Avoid name clashes
def break_time(break_):
    print("Your break time is", break_, "long")
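To illustrate the trailing-underscore tip, here is a minimal sketch built on the standard keyword module. The helper safe_param_name is hypothetical and not part of PEP 8. It only shows how the convention could be applied mechanically.

```python
import keyword


def safe_param_name(name):
    """Append a trailing underscore if `name` is a reserved keyword.

    Hypothetical helper, written only to illustrate the convention.
    """
    return name + "_" if keyword.iskeyword(name) else name


print(safe_param_name("break"))     # break_
print(safe_param_name("class"))     # class_
print(safe_param_name("duration"))  # duration
```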

Class naming

  • Follow the CapWords convention (also known as CamelCase or StudlyCaps). Start each word with a capital letter and do not put underscores between words
# Follow CapWords
class MySampleClass:
    pass
  • If a class and its subclasses use the same attribute name, consider adding a double leading underscore to the class attribute

This ensures that the attribute __age in class Person is accessed as _Person__age. This is Python's name mangling, and it ensures that there are no name collisions

class Person:
    def __init__(self):
        self.__age = 18

obj = Person()
obj.__age  # Error
obj._Person__age  # Correct
  • Use the suffix "Error" for exception classes
class CustomError(Exception):
    """Custom exception class"""
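To see why the double underscore matters when subclasses are involved, here is a short sketch. The Student subclass is an assumption added for illustration. Each class gets its own mangled attribute, so the two values do not overwrite each other.

```python
class Person:
    def __init__(self):
        self.__age = 18  # Stored on the instance as _Person__age


class Student(Person):  # Hypothetical subclass, for illustration only
    def __init__(self):
        super().__init__()
        self.__age = 21  # Stored as _Student__age, so there is no clash


student = Student()
attribute_names = sorted(vars(student))

print(attribute_names)       # ['_Person__age', '_Student__age']
print(student._Person__age)  # 18
```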

Class method naming

  • The first parameter of an instance method (an ordinary method with no decorator attached) should always be self. It refers to the calling object

  • The first parameter of a class method should always be cls. It refers to the class, not the object instance

class SampleClass:
    def instance_method(self, del_):
        print("Instance method")

    @classmethod
    def class_method(cls):
        print("Class method")
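A common use of cls is an alternative constructor. The sketch below uses a made-up Temperature class, which is not from the original article, to show what self and cls each refer to.

```python
class Temperature:
    """Hypothetical class used only to illustrate self versus cls."""

    def __init__(self, celsius):
        self.celsius = celsius

    def to_fahrenheit(self):
        # self is the instance the method was called on.
        return self.celsius * 9 / 5 + 32

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # cls is the class itself, so a subclass gets an instance of its own type.
        return cls((fahrenheit - 32) * 5 / 9)


boiling = Temperature.from_fahrenheit(212)
print(boiling.celsius)          # 100.0
print(boiling.to_fahrenheit())  # 212.0
```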

Package and module naming

  • Try to keep names short and simple

  • Follow the lowercase naming convention

  • For long module names, underscores are preferred

  • Avoid underscores in package names

testpackage  # Package name
sample_module.py  # Module name

Constant Names

  • Constants are usually declared and assigned at the module level

  • Constant names should be in all capital letters

  • Use underscores to separate words in longer names

# The following constants live in the global.py module
PI = 3.14
GRAVITY = 9.8
SPEED_OF_LIGHT = 3*10**8

Code layout in the Python style guide

Now that you know how to name entities in Python, the next question is how to structure your code!

Honestly, this matters a great deal: without proper structure, your code can go haywire, and that is the biggest turn-off for any reviewer.

So, without further ado, let's get to the basics of code layout in Python!

Indentation

Indentation is the most important aspect of code layout and plays a vital role in Python. It tells the interpreter which lines of code belong to a block. Missing indentation can be a serious error.

Indentation determines which code block a statement belongs to. Imagine writing a nested for loop. Placing a line of code outside its intended loop may not raise a syntax error, but you are bound to end up with a logical error that can take time to debug.
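As a small sketch of such a logical error, the two functions below differ only in the indentation of one line. The function names and data are made up for illustration. Both run without any error, but only the first returns one total per row.

```python
def row_totals_correct(matrix):
    totals = []
    for row in matrix:
        total = 0
        for value in row:
            total += value
        totals.append(total)  # Belongs to the outer loop: runs once per row
    return totals


def row_totals_misindented(matrix):
    totals = []
    for row in matrix:
        total = 0
        for value in row:
            total += value
            totals.append(total)  # One level too deep: runs once per value
    return totals


data = [[1, 2], [3, 4]]
print(row_totals_correct(data))      # [3, 7]
print(row_totals_misindented(data))  # [1, 3, 3, 7]
```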

Follow the indentation style below to keep your Python scripts consistent.

  • Always follow the 4-space indentation rule
# Example
if value < 0:
    print("negative value")

# Another example
for i in range(5):
    print("Follow this rule religiously!")
  • Prefer spaces over tabs

Spaces are the recommended way to indent. However, you can keep using tabs when the code is already indented with tabs.

if True:
    print('4 spaces of indentation used!')
  • Split long expressions across several lines

There are several ways to handle this. One way is to align the continuation lines with the opening delimiter.

# Align with the opening delimiter.
def name_split(first_name,
               middle_name,
               last_name):
    print(first_name, middle_name, last_name)

# Another example.
ans = solution(value_one, value_two,
               value_three, value_four)

The second way is to use the 4-space indentation rule. This requires an extra level of indentation to distinguish the parameters from the rest of the code inside the block.

# Use extra indentation.
def name_split(
        first_name,
        middle_name,
        last_name):
    print(first_name, middle_name, last_name)

Finally, you can use hanging indents. A hanging indent, in the context of Python, is a style where the line containing the opening bracket ends with that bracket, and the following lines are indented until the closing bracket.

# Hanging indent
ans = solution(
    value_one, value_two,
    value_three, value_four)
  • Indenting if statements can be a problem

An if statement with multiple conditions naturally gets a 4-character offset from "if (". As you can see below, this can be a problem: the continuation line is indented by the same amount as the body, so the condition cannot be told apart from the block of code it controls. So what should we do?

# This is a problem.
if (condition_one and
    condition_two):
    print("Implement this")

Well, there are a couple of ways to get around it. One way is to use extra indentation!

# Use extra indentation.
if (condition_one and
        condition_two):
    print("Implement this")

Another way is to add a comment between the if statement's conditions and the code block to separate the two.

# Add a comment.
if (condition_one and
    condition_two):
    # This condition is valid
    print("Implement this")
  • Closing brackets

Suppose you have a very long dictionary. You put each key-value pair on its own line, but where do you put the closing bracket? On the last line? Right after the last key-value pair? And if it goes on its own line, how far should it be indented?

There are a couple of ways to solve this.

One way is to align the closing bracket with the first non-whitespace character of the previous line.

learning_path = {
    'Step 1': 'Learn programming',
    'Step 2': 'Learn machine learning',
    'Step 3': 'Crack on the hackathons'
    }

The second way is to make it the first character of a new line.

learning_path = {
    'Step 1': 'Learn programming',
    'Step 2': 'Learn machine learning',
    'Step 3': 'Crack on the hackathons'
}
  • Break lines before binary operators

If you try to fit too many operators and operands on a single line, it is bound to get messy. Instead, break the expression across several lines for better readability.

Now the obvious question: break before or after the operator? The convention is to break before the operator. This makes it easy to see each operator and the operand it acts on.

# Break lines before the operator
gdp = (consumption
       + government_spending
       + investment
       + net_exports
       )

Use blank lines

Cramming lines of code together only makes it harder for readers to understand your code. A good way to make your code look neater and more pleasing is to add an appropriate number of blank lines.

  • Top-level functions and classes should be separated by two blank lines
# Separating classes and top-level functions
class SampleClass:
    pass


def sample_function():
    print("Top level function")
  • Methods inside a class should be separated by a single blank line
# Separating methods within a class
class MyClass:
    def method_one(self):
        print("First method")

    def method_two(self):
        print("Second method")
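  • Try not to put blank lines between pieces of code that share related logic and function
# Assumes NLTK is installed; it provides stopwords and word_tokenize
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

def remove_stopwords(text):
    stop_words = stopwords.words("english")
    tokens = word_tokenize(text)
    clean_text = [word for word in tokens if word not in stop_words]

    return clean_text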
  • Use blank lines sparingly to separate logical sections inside a function. This makes the code easier to follow
def remove_stopwords(text):
    stop_words = stopwords.words("english")
    tokens = word_tokenize(text)
    clean_text = [word for word in tokens if word not in stop_words]

    clean_text = ' '.join(clean_text)
    clean_text = clean_text.lower()

    return clean_text

Maximum line length

  • No more than 79 characters per line

When you write Python code, you should not squeeze more than 79 characters onto a single line. Treat this limit as a guiding principle for keeping your statements short.

  • You can split a statement across multiple lines to turn it into shorter lines of code
# Split across multiple lines
num_list = [y for y in range(100)
            if y % 2 == 0
            if y % 5 == 0]
print(num_list)
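If you want to check the 79-character limit yourself, a few lines of Python are enough. This is only a sketch. The helper find_long_lines and the sample text are hypothetical, and in practice a linter would normally do this job.

```python
def find_long_lines(source, limit=79):
    """Return (line_number, length) pairs for lines longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


# Two lines of sample source text; the second one is deliberately too long.
sample = "short = 1\n" + "long_value = " + "1 + " * 20 + "1\n"

print(find_long_lines(sample))  # [(2, 94)]
```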

Importing packages

Part of the reason many data scientists love Python is the wealth of libraries that make working with data so much easier. So it is safe to assume that you will end up importing a bunch of libraries and modules for any data science task.

  • Imports should always be at the top of a Python script

  • Separate libraries should be imported on separate lines

import numpy as np
import pandas as pd

df = pd.read_csv(r'/sample.csv')
  • Imports should be grouped in the following order:

    • Standard library imports
    • Related third-party imports
    • Local application / library-specific imports
  • Put a blank line after each group of imports

from glob import glob

import matplotlib
import numpy as np
import pandas as pd
import spacy

import mypackage
  • You can import multiple names from the same module on a single line
from math import ceil, floor

Getting familiar with proper Python comments

Understanding an uncommented piece of code can be a strenuous task. Even the original author can have a hard time remembering, after a while, what exactly a line of code does.

Therefore, it is best to comment your code as you write it, so that readers understand exactly what you are trying to achieve.

General tips

  • Always begin a comment with a capital letter

  • Comments should be complete sentences

  • Update the comments whenever you update the code

  • Avoid comments that state the obvious

Block comments

  • Describe the piece of code that follows them

  • Have the same indentation as that piece of code

  • Start with a # followed by a single space

# Remove non-alphanumeric characters from the user input string.
import re

raw_text = input('Enter string:')
text = re.sub(r'\W+', ' ', raw_text)

Inline comments

  • These comments are on the same line as the code statement

  • They should be separated from the code statement by at least two spaces

  • They begin with the usual # followed by a single space

  • Do not use them to state the obvious

  • Use them sparingly, as they can be distracting

info_dict = {}  # Dictionary to store the extracted information

Documentation strings

  • Used to describe public modules, classes, functions, and methods

  • Also known as "docstrings"

  • They stand out from other comments because they are enclosed in triple quotes

  • If the docstring fits on a single line, put the closing """ on the same line

  • If the docstring spans multiple lines, put the closing """ on a line by itself

def square_num(x):
    """Returns the square of a number."""
    return x**2

def power(x, y):
    """Multiline comments.
    Returns x ** y
    """
    return x**y
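Unlike ordinary comments, docstrings are kept at runtime, which is what lets help() and documentation tools display them. A minimal sketch, using the standard inspect module and a slightly reworded power function:

```python
import inspect


def power(x, y):
    """Raise x to the power y.

    Returns x ** y.
    """
    return x ** y


# The raw docstring is stored on the function object itself.
first_line = power.__doc__.splitlines()[0]
print(first_line)  # Raise x to the power y.

# inspect.getdoc() returns the docstring with its indentation cleaned up.
print(inspect.getdoc(power))
```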

Whitespace in Python code

When writing beautiful code, whitespace is often dismissed as a trivial detail. But using whitespace correctly can greatly improve readability. It keeps statements and expressions from looking crowded, which helps readers move through the code easily.

Key points

  • Avoid putting whitespace immediately inside brackets
# Correct
df['clean_text'] = df['text'].apply(preprocess)
  • Never put whitespace before a comma, semicolon, or colon
# Correct
name_split = lambda x: x.split()
  • Do not put whitespace between a name and the opening bracket that follows it
# Correct
print('This is the right way')
# Correct
for i in range(5):
    name_dict[i] = input_list[i]
  • When using multiple operators, put whitespace only around the lowest-priority operators
# Correct
ans = x**2 + b*x + c
  • In slices, colons act as binary operators

They should be treated as the operators with the lowest priority, with an equal amount of whitespace on either side of each colon

# Correct
df_valid = df_train[lower_bound+5 : upper_bound-5]
  • Avoid trailing whitespace

  • Do not put spaces around the = sign for default parameter values

def exp(base, power=2):
    return base**power
  • Always surround the following binary operators with a single space on either side:
    • Assignment operators (=, +=, -=, etc.)
    • Comparisons (==, !=, <, >, <=, >=, in, not in, is, is not)
    • Booleans (and, or, not)
# Correct
brooklyn = ['Amy', 'Terry', 'Gina', 'Jake']
count = 0
for name in brooklyn:
    if name == 'Jake':
        print('Cool')
        count += 1
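It is worth remembering that the whitespace inside an expression is purely for human readers. As a quick sketch, the standard ast module can confirm that a cramped and a well-spaced version of the same statement parse to the same program. The two example strings are made up for illustration.

```python
import ast

cramped = "ans=x**2+b*x+c"
spaced = "ans = x**2 + b*x + c"

# ast.dump() leaves out line and column positions by default,
# so only the structure of the code is compared.
same_program = ast.dump(ast.parse(cramped)) == ast.dump(ast.parse(spaced))

print(same_program)  # True
```

Since the interpreter does not care, the spacing rules above are entirely about making the code easier on the eyes.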

General Python advice for programming

There are often several ways to write a piece of code. When they accomplish the same task, it is better to use the recommended way and stay consistent. I cover some of them in this section.

  • Always use "is" or "is not" when comparing with singletons such as None. Do not use the equality operators
# Wrong
if name != None:
    print("Not null")
# Correct
if name is not None:
    print("Not null")
  • Do not compare Boolean values to True or False with a comparison operator. It may feel intuitive, but it is unnecessary. Just write the Boolean expression
# Correct
if valid:
    print("Correct")
# Wrong
if valid == True:
    print("Wrong")
  • Instead of binding a lambda function to an identifier, use a regular def function. Assigning a lambda to an identifier defeats its purpose, and a named function is easier to spot in tracebacks
# Prefer this
def func(x):
    return x**2

# Over this
func = lambda x: x**2
  • When catching exceptions, name the exception you are catching. Do not use a bare except. A bare except also catches KeyboardInterrupt, so it can swallow your attempt to interrupt execution and mask other problems
try:
    x = 1/0
except ZeroDivisionError:
    print('Cannot divide by zero')
  • Be consistent with your return statements. Either all return statements in a function should return an expression, or none of them should. Also, where a return statement returns no value, write return None explicitly instead of a bare return
# Wrong
def sample(x):
    if x > 0:
        return x+1
    elif x == 0:
        return
    else:
        return x-1

# Correct
def sample(x):
    if x > 0:
        return x+1
    elif x == 0:
        return None
    else:
        return x-1

  • To check for a prefix or suffix in a string, use ".startswith()" and ".endswith()" instead of string slicing. They are cleaner and less error-prone

# Correct
if name.endswith('and'):
    print('Great!')
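Two of the tips in this section can be checked directly. The sketch below first shows how an equality test against None can be fooled, and then how a slice-based suffix check can go wrong. The AlwaysEqual class is hypothetical and deliberately pathological.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()
print(value == None)  # True, although value is clearly not None
print(value is None)  # False, the identity check cannot be fooled

name = "Analytics Vidhya"
print(name[-6:] == "Vidhya")    # True, but only because 6 was counted correctly
print(name[-5:] == "Vidhya")    # False, an off-by-one in the slice
print(name.endswith("Vidhya"))  # True, no length to count
print(name.startswith(("Data", "Analytics")))  # True, a tuple of prefixes works too
```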

Autoformatting your Python code

When you are writing small programs, formatting is not a problem. But imagine having to follow every formatting rule in a complex program that runs to thousands of lines! That is definitely a difficult task. And most of the time, you won't even remember all of the rules.

How do we fix this? Well, we can use an autoformatter to do the job for us!

An autoformatter is a program that identifies formatting errors and fixes them in place. Black is one such autoformatter: it automatically reformats your Python code to conform to the PEP 8 style, taking the burden off you.

Black: https://pypi.org/project/black/

You can easily install it with pip by typing the following command in the terminal:

pip install black

But let's see how helpful Black is in a real-world scenario. Let's use it to format a script, style_script.py, that contains several of the formatting errors discussed above.

All we have to do is head over to the terminal and type the following command:

black style_script.py

When it is done, Black prints a short message telling you whether the file was reformatted.

When you open the script again, the changes are reflected in the file.

Black formats the code correctly and helps whenever you accidentally break a formatting rule.

Black can also be integrated with editors such as Atom, Sublime Text, Visual Studio Code, and even Jupyter Notebook! It is certainly an extension you won't want to miss.

Besides Black, there are other autoformatters, such as autopep8 and yapf, that you can try as well!

End notes

We have covered quite a few key points of the Python style guide. If you follow them consistently throughout your code, you will end up with cleaner, more readable code.

Also, following a common standard pays off when you work on a project as a team, because it makes your code easier for collaborators to understand. Start applying these style tips to your Python code!

Link to the original text: https://www.analyticsvidhya.com/blog/2020/07/python-style-guide/
