Five powerful techniques for visualizing data using Matplotlib

Time:2021-7-13

By Rizky Maulana Nurhidayat
Compiled by VK
Source: Towards Data Science

Data visualization presents data in a more direct, easier-to-understand form, such as a histogram, scatter plot, line chart, or pie chart. Many people still use Matplotlib as their back-end module for plotting. In this story, I’ll share five powerful tips for creating good charts with Matplotlib.

1. Use LaTeX fonts

By default, we can use the good fonts provided by Matplotlib. However, some symbols do not look good enough when rendered with them, for example the symbol phi (φ), as shown in Figure 1.

As you can see in the y-label, it is still recognizable as phi (φ), but for some people it is not good enough for a plot label. To make it more beautiful, you can use LaTeX fonts. How? Here’s the answer.

plt.rcParams['text.usetex'] = True
plt.rcParams['font.size'] = 18

You can add the above code at the beginning of your Python code. Line 1 enables LaTeX for text rendering in the plot. You also need to set a font size larger than the default; if you don’t, I think the labels will be too small. I chose 18. The result of applying the above code is shown in Figure 2.

You need to write a dollar sign at the beginning and end of the symbol, like this ($… $)

plt.xlabel('x')
plt.ylabel(r'$\phi$ (phi)') # raw string, so that \p is not read as an escape sequence

If you get errors, or the libraries required for LaTeX fonts are not installed, you can install them by running the following command in a Jupyter notebook (on a system that uses the apt package manager).

!apt install texlive-fonts-recommended texlive-fonts-extra cm-super dvipng

If you want to install through the terminal, you can enter

apt install texlive-fonts-recommended texlive-fonts-extra cm-super dvipng
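If installing TeX is not an option, Matplotlib's built-in mathtext renderer also understands the same dollar-sign syntax. The following is a minimal sketch, not part of the original article: it leaves text.usetex off and selects the Computer Modern mathtext font set to get a LaTeX-like look. The sample data and the file name mathtext_label.png are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Keep the external LaTeX toolchain switched off; mathtext handles $...$ itself.
plt.rcParams['text.usetex'] = False
plt.rcParams['mathtext.fontset'] = 'cm'  # Computer Modern glyphs for math
plt.rcParams['font.size'] = 18

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])  # illustrative data only
ax.set_xlabel('x')
# Raw string so that Python does not treat \p as an escape sequence.
ax.set_ylabel(r'$\phi$ (phi)')

fig.savefig('mathtext_label.png', dpi=100, bbox_inches='tight', pad_inches=.1)
```

Mathtext covers a subset of LaTeX, so very complex expressions may still need text.usetex set to True.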

Of course, you can use different font families, such as serif or sans-serif (as in Figure 2). To change the font family, use the following code.

plt.rcParams['font.family'] = "serif"

If you add the above line to your code, it will give you a plot like the one shown in Figure 3.

Can you spot the difference between Figure 3 and Figure 2? If you look carefully, the difference is at the ends of the letter strokes. Figure 3 uses a serif font, while Figure 2 uses a sans-serif one. In short, a serif is the small tail at the end of a stroke, and sans means without. If you want to learn more about font families or typefaces, I suggest this link.

https://en.wikipedia.org/wiki/Typeface

You can also use the jupyterthemes library to set font families. I’ve written a tutorial on using it; just follow the link below. jupyterthemes can also change your Jupyter theme, for example to a dark mode theme: https://medium.com/@rizman18/how-can-i-customize-jupyter-notebook-into-dark-mode-7985ce780f38

You may also want to insert more complex text, such as the equation in the title of Figure 4.

If you want to create Figure 4, you can use this complete code.

#Import library
import numpy as np
import matplotlib.pyplot as plt

#Adjusting the Matplotlib parameter
plt.rcParams.update(plt.rcParamsDefault)
plt.rcParams['text.usetex'] = True
plt.rcParams['font.size'] = 18
plt.rcParams['font.family'] = "serif"

#Create simulation data
r = 15
theta = 5
rw = 12
gamma = 0.1

err = np.arange(0., r, .1)
z = np.where(err < rw, 0, gamma * (err-rw)**2 * np.sin(np.deg2rad(theta)))
    
#Visualization data
plt.scatter(err, z, s = 10)
plt.title(r'$\Sigma(x) = \gamma x^2 \sin(\theta)$', pad = 20)
plt.xlabel('x')
plt.ylabel(r'$\phi$')

#Save chart
plt.savefig('latex.png', dpi = 300, pad_inches = .1, bbox_inches = 'tight')

2. Create a zoom effect

In this tip, I’ll give you the code to generate the plot shown in Figure 5.

First, you need to understand the difference between plt.axes() and plt.figure() (see the link below). plt.figure() covers all objects in a single container, including axes, graphics, text, and labels. plt.axes() covers only one specific plotting region. I think Figure 6 can give you a simple understanding.

The black box is plt.figure(), and the red and blue boxes are plt.axes(). In Figure 6, there are two axes, red and blue. You can view this link for a basic reference: https://medium.com/datadriveninvestor/python-data-visualization-with-matplotlib-for-absolute-beginner-python-part-ii-65818b4d96ce

Once you understand that, you can see how Figure 5 is built. Simply put, there are two axes in Figure 5. The first is the large plot, which shows a zoomed-in view (x from 400 to 500 and y from 350 to 400 in the code below), and the second is a small plot showing the full, zoomed-out data. Here’s the code to create Figure 5.

#Create main container
fig = plt.figure()

#Set random seed
np.random.seed(100)

#Create simulation data
x = np.random.normal(400, 50, 10_000)
y = np.random.normal(300, 50, 10_000)
c = np.random.rand(10_000)

#Create enlarged view
ax = plt.scatter(x, y, s = 5, c = c)
plt.xlim(400, 500)
plt.ylim(350, 400)
plt.xlabel('x', labelpad = 15)
plt.ylabel('y', labelpad = 15)

#Create the small zoomed-out view
ax_new = fig.add_axes([0.6, 0.6, 0.2, 0.2]) # position and relative size of the small axes, as fractions of the figure
plt.scatter(x, y, s = 1, c = c)

#Save the drawing and keep the margins
plt.savefig('zoom.png', dpi = 300, bbox_inches = 'tight', pad_inches = .1)

If you need an explanation of the code, you can visit this link: https://medium.com/datadriveninvestor/data-visualization-with-matplotlib-for-absolute-beginner-part-i-655275855ec8

I also provide another version of the zoom effect that you can create with Matplotlib, as shown in Figure 7.

To create Figure 7, you need to create three axes using add_subplot or another Matplotlib syntax. To keep things simple, I use add_subplot here. To create them, use the following code.
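As an alternative to positioning a second axes by hand with fig.add_axes, Matplotlib also offers Axes.inset_axes together with Axes.indicate_inset_zoom. The sketch below is not the original author's code, and it arranges things the other way round from Figure 5: the main axes shows the full data and the inset shows the zoomed region. The inset bounds and the file name zoom_inset.png are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Simulated data with the same shape as in the article
rng = np.random.default_rng(100)
x = rng.normal(400, 50, 10_000)
y = rng.normal(300, 50, 10_000)
c = rng.random(10_000)

fig, ax = plt.subplots()
ax.scatter(x, y, s=1, c=c)  # full data set on the main axes
ax.set_xlabel('x', labelpad=15)
ax.set_ylabel('y', labelpad=15)

# Inset axes: [x0, y0, width, height] as fractions of the parent axes
axins = ax.inset_axes([0.62, 0.62, 0.35, 0.35])
axins.scatter(x, y, s=5, c=c)
axins.set_xlim(400, 500)  # zoomed region, same limits as the article
axins.set_ylim(350, 400)

# Mark the zoomed region on the main axes and connect it to the inset
ax.indicate_inset_zoom(axins, edgecolor='black')

fig.savefig('zoom_inset.png', dpi=150, bbox_inches='tight', pad_inches=.1)
```

The benefit of this approach is that the marker rectangle follows the inset's limits, so changing set_xlim or set_ylim is enough to move the zoom.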

fig = plt.figure(figsize=(6, 5))
plt.subplots_adjust(bottom = 0., left = 0, top = 1., right = 1)

#Create the first axis, in the upper left corner, for the green plot
sub1 = fig.add_subplot(2, 2, 1) # two rows, two columns, first cell

#Create the second axis, in the upper right corner, for the orange plot
sub2 = fig.add_subplot(2, 2, 2) # two rows, two columns, second cell

#Create the third axis, a combination of the third and fourth cells
sub3 = fig.add_subplot(2, 2, (3, 4)) # two rows, two columns, merging the third and fourth cells

The code will generate a figure like the one shown in Figure 8. It creates a grid of two rows and two columns. Axes sub1 (2, 2, 1) is the first cell of the grid (first row, first column). Cells are numbered from left to right, starting at the top left. Axes sub2 (2, 2, 2) is placed in the first row, second column. Axes sub3 (2, 2, (3, 4)) merges the first and second columns of the second row.

Of course, we need to define some simulated data to visualize in the plots. Here, I define a simple combination of a linear and a sinusoidal function, as shown in the following code.
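To check the grid numbering described above, one option is to print where each axes ends up in the figure. This is a small sketch for inspection only; the loop and the printed format are illustrative and not part of the original article.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(6, 5))
sub1 = fig.add_subplot(2, 2, 1)       # first row, first column
sub2 = fig.add_subplot(2, 2, 2)       # first row, second column
sub3 = fig.add_subplot(2, 2, (3, 4))  # second row, both columns merged

for name, axes in [('sub1', sub1), ('sub2', sub2), ('sub3', sub3)]:
    box = axes.get_position()  # bounding box in figure-fraction coordinates
    print(f'{name}: x0={box.x0:.2f}, y0={box.y0:.2f}, '
          f'width={box.width:.2f}, height={box.height:.2f}')
```

The printed boxes should show sub3 starting at the same left edge as sub1 while being roughly as wide as sub1 and sub2 together.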

#Using lambda to define functions
stock = lambda A, amp, angle, phase: A * angle + amp * np.sin(angle + phase)

#Defining parameters
theta = np.linspace(0., 2 * np.pi, 250) # x-axis
np.random.seed(100)
noise = 0.2 * np.random.random(250)
y = stock(.1, .2, theta, 1.2) + noise # y-axis

If you plot this data on the axes created earlier, you will get the figure shown in Figure 9.

The next step is to limit the x- and y-ranges of the first and second axes (sub1 and sub2), create shaded regions in sub3 marking those two ranges, and create ConnectionPatch objects that produce the zoom effect. This can be done with the following complete code (note that I didn’t use loops, for simplicity).

#Import ConnectionPatch, used below to draw the connecting lines
from matplotlib.patches import ConnectionPatch

#Using lambda to define functions
stock = lambda A, amp, angle, phase: A * angle + amp * np.sin(angle + phase)

#Defining parameters
theta = np.linspace(0., 2 * np.pi, 250) # x-axis
np.random.seed(100)
noise = 0.2 * np.random.random(250)
y = stock(.1, .2, theta, 1.2) + noise # y-axis

#Create a 6X5 main container
fig = plt.figure(figsize=(6, 5))
plt.subplots_adjust(bottom = 0., left = 0, top = 1., right = 1)

#Create the first axis, in the upper left corner, for the green plot
sub1 = fig.add_subplot(2, 2, 1) # two rows, two columns, first cell
sub1.plot(theta, y, color = 'green')
sub1.set_xlim(1, 2)
sub1.set_ylim(0.2, .5)
sub1.set_ylabel('y', labelpad = 15)

#Create the second axis, in the upper right corner, for the orange plot
sub2 = fig.add_subplot(2, 2, 2) # two rows, two columns, second cell
sub2.plot(theta, y, color = 'orange')
sub2.set_xlim(5, 6)
sub2.set_ylim(.4, 1)

#Create the third axis, a combination of the third and fourth cells
sub3 = fig.add_subplot(2, 2, (3, 4)) # two rows, two columns, merging the third and fourth cells
sub3.plot(theta, y, color = 'darkorchid', alpha = .7)
sub3.set_xlim(0, 6.5)
sub3.set_ylim(0, 1)
sub3.set_xlabel(r'$\theta$ (rad)', labelpad = 15)
sub3.set_ylabel('y', labelpad = 15)

#Create shaded regions in the third axis
sub3.fill_between((1, 2), 0, 1, facecolor = 'green', alpha = 0.2) # shaded region for the first axis
sub3.fill_between((5, 6), 0, 1, facecolor = 'orange', alpha = 0.2) # shaded region for the second axis

#Create a connectionpatch for the first axis on the left
con1 = ConnectionPatch(xyA=(1, .2), coordsA=sub1.transData, 
                       xyB=(1, .3), coordsB=sub3.transData, color = 'green')
#Add to left
fig.add_artist(con1)

#Create a connectionpatch for the first axis on the right
con2 = ConnectionPatch(xyA=(2, .2), coordsA=sub1.transData, 
                       xyB=(2, .3), coordsB=sub3.transData, color = 'green')
#Add to right
fig.add_artist(con2)

#Create a connectionpatch for the second axis on the left
con3 = ConnectionPatch(xyA=(5, .4), coordsA=sub2.transData, 
                       xyB=(5, .5), coordsB=sub3.transData, color = 'orange')
#Add to left
fig.add_artist(con3)

#Create a connectionpatch for the second axis on the right
con4 = ConnectionPatch(xyA=(6, .4), coordsA=sub2.transData, 
                       xyB=(6, .9), coordsB=sub3.transData, color = 'orange')
#Add to right
fig.add_artist(con4)

#Save the drawing and keep the margins
plt.savefig('zoom_effect_2.png', dpi = 300, bbox_inches = 'tight', pad_inches = .1)

The code will give you an excellent zoom effect, as shown in Figure 7.
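The article's code deliberately avoids loops. For comparison, here is a sketch of how the same idea could be written with one. The panels list is a hypothetical helper introduced for this example, and the connectors are simplified to end at the top edge of sub3 (y = 1) instead of the hand-tuned end points used above, so the result will not match Figure 7 exactly.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import ConnectionPatch

stock = lambda A, amp, angle, phase: A * angle + amp * np.sin(angle + phase)
theta = np.linspace(0., 2 * np.pi, 250)
rng = np.random.default_rng(100)
y = stock(.1, .2, theta, 1.2) + 0.2 * rng.random(250)

fig = plt.figure(figsize=(6, 5))

# Overview axes spanning the whole second row
sub3 = fig.add_subplot(2, 2, (3, 4))
sub3.plot(theta, y, color='darkorchid', alpha=.7)
sub3.set_xlim(0, 6.5)
sub3.set_ylim(0, 1)

# Hypothetical helper: (subplot index, color, x-range, y-range) per zoom panel
panels = [
    (1, 'green', (1, 2), (0.2, 0.5)),
    (2, 'orange', (5, 6), (0.4, 1.0)),
]

connectors = []
for index, color, (x0, x1), (y0, y1) in panels:
    sub = fig.add_subplot(2, 2, index)
    sub.plot(theta, y, color=color)
    sub.set_xlim(x0, x1)
    sub.set_ylim(y0, y1)
    # Shade the matching range in the overview axes
    sub3.fill_between((x0, x1), 0, 1, facecolor=color, alpha=0.2)
    # One connector per edge of the zoomed range
    for x_edge in (x0, x1):
        con = ConnectionPatch(xyA=(x_edge, y0), coordsA=sub.transData,
                              xyB=(x_edge, 1), coordsB=sub3.transData,
                              color=color)
        fig.add_artist(con)
        connectors.append(con)

fig.savefig('zoom_effect_loop.png', dpi=150, bbox_inches='tight', pad_inches=.1)
```

Adding a third zoom panel then only means adding one more entry to the list, provided the subplot grid has room for it.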

3. Create a legend outside the plot

Do you have many legend entries to show? If so, the legend needs to be placed outside the main axes.

To place the legend outside the main axes, use this code to adjust its position.

plt.legend(bbox_to_anchor = (1.05, 1.04)) # location of the legend

The values 1.05 and 1.04 are x and y coordinates relative to the main axes, in axes-fraction units where 1 is the right or top edge. You can change them. Now, apply the above code to our code:

#Using lambda to create wave function
wave = lambda amp, angle, phase: amp * np.sin(angle + phase)

#Setting parameter values
theta = np.linspace(0., 2 * np.pi, 100)
amp = np.linspace(0, .5, 5)
phase = np.linspace(0, .5, 5)

#Create the main container and its title
plt.figure()
plt.title(r'Wave Function $y = \gamma \sin(\theta + \phi_0) $', pad = 15)

#Create a plot for each amplitude and phase
for i in range(len(amp)):
    lgd1 = str(amp[i])
    lgd2 = str(phase[i])
    plt.plot(theta, wave(amp[i], theta, phase[i]), label = (r'$\gamma = $'+lgd1+r', $\phi = $' +lgd2))
    
plt.xlabel(r'$\theta$ (rad)', labelpad = 15)
plt.ylabel('y', labelpad = 15)

#Adjust legend
plt.legend(bbox_to_anchor=(1.05, 1.04))

#Save the drawing and keep the margins
plt.savefig('outbox_legend.png', dpi = 300, bbox_inches = 'tight', pad_inches = .1)

After running the code, it gives you a diagram, as shown in Figure 11.

If you want to make the legend box more beautiful, you can use the following code to add a shadow effect. It will display a graph, as shown in Figure 12.

plt.legend(bbox_to_anchor=(1.05, 1.04), shadow=True)
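When bbox_to_anchor is given as a point, the loc argument decides which corner of the legend box is pinned to that point. The sketch below, which is an addition and not the author's code, pins the upper left corner of the legend just outside the right edge of the axes. The amplitudes and the file name legend_loc.png are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

theta = np.linspace(0., 2 * np.pi, 100)

fig, ax = plt.subplots()
for amp in (0.1, 0.3, 0.5):  # illustrative amplitudes
    ax.plot(theta, amp * np.sin(theta), label=f'amp = {amp}')

# (1.05, 1) is in axes-fraction units; loc names the legend corner placed there
legend = ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left', shadow=True)

# bbox_inches='tight' enlarges the saved area so the legend is not cut off
fig.savefig('legend_loc.png', dpi=150, bbox_inches='tight', pad_inches=.1)
```

Setting loc explicitly makes the placement predictable; without it, Matplotlib chooses the corner according to the legend.loc setting.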

4. Create a continuous error plot

Over the past decade, the style of data visualization has shifted toward clean plotting themes. We can see this change in recent papers in international journals and on web pages. One of the most popular approaches is to visualize data with a continuous error band instead of error bars. You can see it in Figure 13.

Figure 13 is generated using fill_between. In the fill_between syntax, you need to define the upper and lower limits, as shown in Figure 14.

To apply it, use the following code.

plt.fill_between(x, upper_limit, lower_limit)

The order of the upper- and lower-limit arguments is interchangeable. This is the complete code (it reuses the stock function defined in tip 2).

N = 9
x = np.linspace(0, 6*np.pi, N)

mean_stock = (stock(.1, .2, x, 1.2))
np.random.seed(100)
upper_stock = mean_stock + np.random.randint(N) * 0.02
lower_stock = mean_stock - np.random.randint(N) * 0.015

plt.plot(x, mean_stock, color = 'darkorchid', label = r'$y = \gamma \sin(\theta + \phi_0)$')

plt.fill_between(x, upper_stock, lower_stock, alpha = .1, color = 'darkorchid')
plt.grid(alpha = .2)

plt.xlabel(r'$\theta$ (rad)', labelpad = 15)
plt.ylabel('y', labelpad = 15)
plt.legend()
plt.savefig('fill_between.png', dpi = 300, bbox_inches = 'tight', pad_inches = .1)
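In the code above, the band has a constant width because each offset is a single random number. In practice, the limits often come from repeated measurements. The following sketch assumes a hypothetical set of 20 noisy repetitions and uses the mean plus or minus one standard deviation as the band; the noise level and the file name error_band.png are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(100)
x = np.linspace(0, 6 * np.pi, 50)

# Hypothetical data: 20 noisy repetitions of the same underlying curve
signal = 0.1 * x + 0.2 * np.sin(x + 1.2)
samples = signal + rng.normal(0, 0.1, size=(20, x.size))

mean = samples.mean(axis=0)  # point-wise mean over the repetitions
std = samples.std(axis=0)    # point-wise standard deviation

fig, ax = plt.subplots()
ax.plot(x, mean, color='darkorchid', label='mean')
ax.fill_between(x, mean - std, mean + std, alpha=.2, color='darkorchid',
                label=r'$\pm 1$ std')
ax.grid(alpha=.2)
ax.set_xlabel('x', labelpad=15)
ax.set_ylabel('y', labelpad=15)
ax.legend()

fig.savefig('error_band.png', dpi=150, bbox_inches='tight', pad_inches=.1)
```

Because the standard deviation is computed per point, the band can widen or narrow along the x-axis.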

5. Adjust the margins

If you look at each block of code above, plt.savefig() is called with two arguments that may look complicated: bbox_inches and pad_inches. When you are writing for a journal or an article, they let you control the margins of the saved figure. If they are not included, the margins of the figure will be larger after saving. Figure 15 shows plots saved with and without bbox_inches and pad_inches.

I don’t think you can easily see the difference between the two plots in Figure 15, so I’ll display them with a different background color, as shown in Figure 16.

Again, this technique will help you when you insert your chart into a paper or an article. You don’t need to crop it to save space.
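One way to see the effect without opening the image files is to compare the pixel dimensions of the saved output. This sketch is an addition to the article: saved_shape is a hypothetical helper that saves the figure to memory and reads it back, and the gray background plays the same role as the colored background in Figure 16.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.image as mpimg
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
fig.patch.set_facecolor('lightgray')  # colored background makes margins visible
ax.plot([0, 1, 2], [0, 1, 4])         # illustrative data only


def saved_shape(**savefig_kwargs):
    """Hypothetical helper: save the figure to memory, return (rows, cols)."""
    buffer = io.BytesIO()
    fig.savefig(buffer, format='png', dpi=100, **savefig_kwargs)
    buffer.seek(0)
    return mpimg.imread(buffer, format='png').shape[:2]


default_shape = saved_shape()
tight_shape = saved_shape(bbox_inches='tight', pad_inches=.1)

print('default (rows, cols):', default_shape)
print('tight   (rows, cols):', tight_shape)
```

With the default layout, the tight version should come out smaller in both directions, because the empty border around the axes is cropped down to the 0.1 inch padding.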

Conclusion

Matplotlib is a multi-platform library that runs on many operating systems. It’s one of the older libraries for visualizing data, but it’s still powerful because its developers keep updating it to follow trends in data visualization. Some of the techniques mentioned above are examples of those updates.

Link to the original text:https://towardsdatascience.com/5-powerful-tricks-to-visualize-your-data-with-matplotlib-16bc33747e05
