Translation: Practical Python Programming 02_04 Sequences

Time：2021-5-4

Contents | previous section (2.3 formatting) | next section (2.5 collections module)

2.4 Sequences

Sequence Datatypes

Python has three sequence datatypes.

• String: such as `'Hello'`. A string is a sequence of characters.
• List: such as `[1, 4, 5]`.
• Tuple: such as `('GOOG', 100, 490.1)`.

All sequences are ordered, indexed by integers, and have a length.

``````a = 'Hello'               # String
b = [1, 4, 5]             # List
c = ('GOOG', 100, 490.1)  # Tuple

# Indexed order
a[0]                      # 'H'
b[-1]                     # 5
c[1]                      # 100

# Length of sequence
len(a)                    # 5
len(b)                    # 3
len(c)                    # 3``````

Sequences can be replicated with the `*` operator: `s * n`.

``````>>> a = 'Hello'
>>> a * 3
'HelloHelloHello'
>>> b = [1, 2, 3]
>>> b * 2
[1, 2, 3, 1, 2, 3]
>>>``````

Sequences of the same type can be concatenated with the `+` operator: `s + t`.

``````>>> a = (1, 2, 3)
>>> b = (4, 5)
>>> a + b
(1, 2, 3, 4, 5)
>>>
>>> c = [1, 5]
>>> a + c
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate tuple (not "list") to tuple``````
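If you do need to combine a tuple and a list, one option is to convert one of them first. The following is a small sketch of that idea; the names `as_tuple` and `as_list` are only illustrative.

```python
a = (1, 2, 3)
c = [1, 5]

# Convert the list to a tuple so both operands have the same type
as_tuple = a + tuple(c)

# Or convert the tuple to a list instead
as_list = list(a) + c

print(as_tuple)   # (1, 2, 3, 1, 5)
print(as_list)    # [1, 2, 3, 1, 5]
```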

Slicing

Slicing means taking a subsequence from a sequence. The syntax is `s[start:end]`, where `start` and `end` are the indexes of the subsequence you want.

``````a = [0,1,2,3,4,5,6,7,8]

a[2:5]    # [2,3,4]
a[-5:]    # [4,5,6,7,8]
a[:3]     # [0,1,2]``````
• Indexes `start` and `end` must be integers.
• Slices do not include the end value. It is like a half-open interval in math.
• If indexes are omitted, they default to the beginning or end of the sequence.
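The half-open rule has some handy consequences, which can be checked with a short sketch like the one below (the variable names `middle` and `copy` are illustrative).

```python
a = [0, 1, 2, 3, 4, 5, 6, 7, 8]

# For in-range indexes, a[start:end] contains end - start items
middle = a[2:5]
print(middle)              # [2, 3, 4]
print(len(middle))         # 3

# Splitting at the same index loses nothing and duplicates nothing
print(a[:3] + a[3:] == a)  # True

# Omitting both indexes copies the whole list
copy = a[:]
print(copy == a)           # True
print(copy is a)           # False
```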

Slice re-assignment

On lists, slices can be reassigned and deleted.

``````# Reassignment
a = [0,1,2,3,4,5,6,7,8]
a[2:4] = [10,11,12]       # [0,1,10,11,12,4,5,6,7,8]``````

Note: The reassigned slice does not need to have the same length as the slice it replaces.

``````# Deletion
a = [0,1,2,3,4,5,6,7,8]
del a[2:4]                # [0,1,4,5,6,7,8]``````
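As a rough illustration of how slice assignment changes a list's length, and of the fact that it only works on lists, consider the sketch below. The names `grown`, `shrunk`, and `tuple_ok` are made up for this example.

```python
a = [0, 1, 2, 3, 4, 5, 6, 7, 8]

a[2:4] = [10, 11, 12]      # two items replaced by three; the list grows
grown = list(a)
print(grown)               # [0, 1, 10, 11, 12, 4, 5, 6, 7, 8]

a[2:5] = []                # assigning an empty list removes the slice
shrunk = list(a)
print(shrunk)              # [0, 1, 4, 5, 6, 7, 8]

# Tuples are immutable, so the same operation raises TypeError
t = (0, 1, 2, 3)
try:
    t[1:3] = (10, 11)
    tuple_ok = True
except TypeError:
    tuple_ok = False
print(tuple_ok)            # False
```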

Sequence Reductions

There are some common functions to reduce a sequence to a single value.

``````>>> s = [1, 2, 3, 4]
>>> sum(s)
10
>>> min(s)
1
>>> max(s)
4
>>> t = ['Hello', 'World']
>>> max(t)
'World'
>>>``````

Iteration over a sequence

You can use the for loop to iterate over the elements in the sequence.

``````>>> s = [1, 4, 9, 16]
>>> for i in s:
...     print(i)
...
1
4
9
16
>>>``````

In each iteration of the loop, you get a new item to work with. The new value is placed into the iteration variable. In this example, the iteration variable is `x`:

``````for x in s:         # `x` is an iteration variable
    ...statements``````

On each iteration, the previous value of the iteration variable (if any) is overwritten. After the loop finishes, the variable retains its last value.
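A quick way to see that the iteration variable survives the loop is a sketch like this one:

```python
s = [1, 4, 9, 16]

for x in s:
    pass          # do nothing; we only care about x afterwards

# x still exists after the loop and holds the last item
print(x)          # 16
```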

Break statement

You can use the `break` statement to exit a loop early.

``````for name in namelist:
    if name == 'Jake':
        break
    ...
    ...
statements``````

When the `break` statement executes, it exits the loop and moves on to the next `statements`. The `break` statement only applies to the innermost loop. If this loop is inside another loop, `break` does not interrupt the outer loop.
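The "innermost loop only" rule can be illustrated with a small nested loop. This is just a sketch; the list `pairs` is an illustrative name.

```python
pairs = []
for i in range(3):
    for j in range(3):
        if j == 1:
            break            # leaves only the inner loop
        pairs.append((i, j))

# The outer loop still ran for i = 0, 1, 2
print(pairs)   # [(0, 0), (1, 0), (2, 0)]
```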

Continue statement

To skip one element and move to the next one, use the `continue` statement.

``````for line in lines:
    if line == '\n':    # Skip blank lines
        continue
    # More statements
    ...``````

This is useful when the current item is not of interest or needs to be ignored during processing.
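Here is a self-contained variant of the blank-line example, assuming a small made-up list of lines in place of a real file:

```python
# Hypothetical input; in practice these lines would come from a file
lines = ['first\n', '\n', 'second\n', '\n', 'third\n']

kept = []
for line in lines:
    if line == '\n':        # Skip blank lines
        continue
    kept.append(line.strip())

print(kept)   # ['first', 'second', 'third']
```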

Looping over integers

If you need to count, use the `range()` function.

``````for i in range(100):
    # i = 0,1,...,99
    ...``````

The syntax of the `range()` function is `range([start,] end [,step])`.

``````for i in range(100):
    # i = 0,1,...,99
    ...
for j in range(10,20):
    # j = 10,11,..., 19
    ...
for k in range(10,50,2):
    # k = 10,12,...,48
    # Notice how it counts in steps of 2, not 1.
    ...``````
• The ending value is never included. It mirrors the behavior of slices.
• `start` is optional. The default is `0`.
• `step` is optional. The default is `1`.
• `range()` computes values as needed. It does not actually store a large range of numbers.
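Because values are computed on demand, even a very large range is cheap to create and query. The following sketch illustrates this; the name `big` is illustrative.

```python
big = range(1_000_000_000)      # created instantly; no billion-item list

print(len(big))                 # 1000000000
print(big[10])                  # 10
print(big[-1])                  # 999999999
print(500 in big)               # True

# Converting a small range to a list shows the values it would produce
print(list(range(10, 50, 10)))  # [10, 20, 30, 40]
print(list(range(5, 0, -1)))    # [5, 4, 3, 2, 1]
```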

enumerate() function

The `enumerate` function adds an extra counter value to iteration.

``````names = ['Elwood', 'Jake', 'Curtis']
for i, name in enumerate(names):
    # Loops with i = 0, name = 'Elwood'
    #            i = 1, name = 'Jake'
    #            i = 2, name = 'Curtis'
    ...``````

The general form is `enumerate(sequence [, start = 0])`. `start` is optional. A good example of using `enumerate()` is tracking line numbers while reading a file:

``````with open(filename) as f:
    for lineno, line in enumerate(f, start=1):
        ...``````

`enumerate` can be seen as shorthand for the following:

``````i = 0
for x in s:
    statements
    i += 1``````

Using `enumerate` means less typing and runs slightly faster.
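To check that the two forms really produce the same pairs, you could run a sketch like this (the names `manual` and `with_enumerate` are illustrative):

```python
s = ['a', 'b', 'c']

# Manual counter
manual = []
i = 0
for x in s:
    manual.append((i, x))
    i += 1

# Same thing with enumerate()
with_enumerate = list(enumerate(s))

print(manual == with_enumerate)      # True
print(list(enumerate(s, start=1)))   # [(1, 'a'), (2, 'b'), (3, 'c')]
```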

for and tuples

You can iterate with multiple iteration variables.

``````points = [
    (1, 4),(10, 40),(23, 14),(5, 6),(7, 8)
]
for x, y in points:
    # Loops with x = 1, y = 4
    #            x = 10, y = 40
    #            x = 23, y = 14
    #            ...
    ...``````

When using multiple variables, each tuple is unpacked into a set of iteration variables. The number of variables must match the number of items in each tuple.
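If the counts do not match, Python raises a `ValueError` at the offending item. The sketch below uses a deliberately malformed list to show this; `unpack_failed` and `seen` are illustrative names.

```python
# The third tuple has three items, but the loop expects two
points = [(1, 4), (10, 40), (23, 14, 99)]

seen = []
unpack_failed = False
try:
    for x, y in points:
        seen.append((x, y))
except ValueError as e:
    unpack_failed = True
    print('Unpacking failed:', e)

print(seen)            # [(1, 4), (10, 40)]
print(unpack_failed)   # True
```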

zip() function

The `zip` function takes multiple sequences and makes an iterator that combines them.

``````columns = ['name', 'shares', 'price']
values = ['GOOG', 100, 490.1 ]
pairs = zip(columns, values)
# ('name','GOOG'), ('shares',100), ('price',490.1)``````

To get results, you have to iterate. You can unpack tuples with multiple variables as shown earlier.

``````for column, value in pairs:
    ...``````

A common use of `zip` is to create key/value pairs for constructing dictionaries.

``d = dict(zip(columns, values))``
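One detail worth keeping in mind is that the object returned by `zip` is a one-shot iterator: once it has been consumed, it yields nothing more. A short sketch (with illustrative names `first_pass` and `second_pass`):

```python
columns = ['name', 'shares', 'price']
values = ['GOOG', 100, 490.1]

pairs = zip(columns, values)
first_pass = list(pairs)     # consumes the iterator
second_pass = list(pairs)    # nothing left

print(first_pass)    # [('name', 'GOOG'), ('shares', 100), ('price', 490.1)]
print(second_pass)   # []

# Call zip() again whenever you need the pairs a second time
d = dict(zip(columns, values))
print(d['shares'])   # 100
```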

Exercises

Exercise 2.13: Counting

Try some basic counting examples:

``````>>> for n in range(10):            # Count 0 ... 9
...     print(n, end=' ')
...
0 1 2 3 4 5 6 7 8 9
>>> for n in range(10,0,-1):       # Count 10 ... 1
...     print(n, end=' ')
...
10 9 8 7 6 5 4 3 2 1
>>> for n in range(0,10,2):        # Count 0, 2, ... 8
...     print(n, end=' ')
...
0 2 4 6 8
>>>``````

Exercise 2.14: more sequence operations

Experiment with some sequence reduction operations interactively.

``````>>> data = [4, 9, 1, 25, 16, 100, 49]
>>> min(data)
1
>>> max(data)
100
>>> sum(data)
204
>>>``````

Try looping over the data.

``````>>> for x in data:
...     print(x)
...
4
9
...
>>> for n, x in enumerate(data):
...     print(n, x)
...
0 4
1 9
2 1
...
>>>``````

Sometimes beginners combine the `for` statement, `len()`, and `range()` in an awkward code fragment that looks like it came from an old C program.

``````>>> for n in range(len(data)):
...     print(data[n])
...
4
9
1
...
>>>``````

Don't do that. Not only is this code hard on the eyes, it is also less efficient and runs slower. If you want to iterate over data, just use a normal `for` loop. If you happen to need the index for some reason, use `enumerate()`.

Exercise 2.15: A practical enumerate() example

Recall that the file `Data/missing.csv` contains data for a stock portfolio, but has some rows with missing values. Using `enumerate()`, modify your `pcost.py` program so that it prints a line number with the warning message when it encounters bad input.

``````>>> cost = portfolio_cost('Data/missing.csv')
Row 4: Couldn't convert: ['MSFT', '', '51.23']
Row 7: Couldn't convert: ['IBM', '', '70.44']
>>>``````

To do this, you'll need to change a few parts of your code.

``````...
for rowno, row in enumerate(rows, start=1):
    try:
        ...
    except ValueError:
        print(f"Row {rowno}: Couldn't convert: {row}")``````

Exercise 2.16: Using the zip() function

In the file `Data/portfolio.csv`, the first line contains column headers. In all of the previous code, we have been discarding them.

``````>>> import csv
>>> f = open('Data/portfolio.csv')
>>> rows = csv.reader(f)
>>> headers = next(rows)
>>> headers
['name', 'shares', 'price']``````

But what if you could use the headers for something useful? This is where the `zip()` function enters the picture. First, try this to pair the file headers with a row of data:

``````>>> row = next(rows)
>>> row
['AA', '100', '32.20']
>>> list(zip(headers, row))
[ ('name', 'AA'), ('shares', '100'), ('price', '32.20') ]
>>>``````

Notice how `zip()` paired the column headers with the column values. We've used `list()` here to turn the result into a list so that you can see it. Normally, `zip()` creates an iterator that must be consumed by a `for` loop.

This pairing is an intermediate step in building a dictionary. Now try:

``````>>> record = dict(zip(headers, row))
>>> record
{'name': 'AA', 'shares': '100', 'price': '32.20'}
>>>``````

This particular transformation is one of the most useful tricks to know when processing a lot of data files. For example, suppose you wanted to make the `pcost.py` program work with various input files, regardless of the column numbers where the name, shares, and price appear.

Modify the `portfolio_cost()` function in `pcost.py` so that it looks like this:

``````# pcost.py

def portfolio_cost(filename):
    ...
    for rowno, row in enumerate(rows, start=1):
        # Pair each header with its value so columns can be looked up by name
        record = dict(zip(headers, row))
        try:
            nshares = int(record['shares'])
            price = float(record['price'])
            total_cost += nshares * price
        # This catches errors in int() and float() conversions above
        except ValueError:
            ...``````

Now, try your `portfolio_cost()` function on a completely different data file, `Data/portfoliodate.csv`, which looks like this:

``````name,date,time,shares,price
"AA","6/11/2007","9:50am",100,32.20
"IBM","5/13/2007","4:20pm",50,91.10
"CAT","9/23/2006","1:30pm",150,83.44
"MSFT","5/17/2007","10:30am",200,51.23
"GE","2/1/2006","10:45am",95,40.37
"MSFT","10/31/2006","12:05pm",50,65.10
"IBM","7/9/2006","3:15pm",100,70.44``````
``````>>> portfolio_cost('Data/portfoliodate.csv')
44671.15
>>>``````

If you did it right, you'll find that your program still works even though the data file has a completely different column format than before. That's cool!

The change made here is subtle, but significant. Instead of being hardcoded to read a single fixed file format, the new version of `portfolio_cost()` can read any CSV file and pick the values of interest out of it. The code works as long as the file has the required columns.
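To see the idea in isolation without needing the data files, here is a minimal, self-contained sketch. It is not the course's `pcost.py`: the function `portfolio_cost_from_file()` is a hypothetical variant that takes an already-open file object, and the two in-memory "files" are shortened, made-up stand-ins for `Data/portfolio.csv` and `Data/portfoliodate.csv`.

```python
import csv
import io

def portfolio_cost_from_file(f):
    # Hypothetical helper: like portfolio_cost(), but reads from an open file
    rows = csv.reader(f)
    headers = next(rows)
    total_cost = 0.0
    for rowno, row in enumerate(rows, start=1):
        # Look up columns by name, not by position
        record = dict(zip(headers, row))
        try:
            nshares = int(record['shares'])
            price = float(record['price'])
            total_cost += nshares * price
        except ValueError:
            print(f"Row {rowno}: Couldn't convert: {row}")
    return total_cost

# Two layouts holding the same holdings (sample data for illustration)
plain = io.StringIO(
    'name,shares,price\n'
    '"AA",100,32.20\n'
    '"IBM",50,91.10\n'
)
dated = io.StringIO(
    'name,date,time,shares,price\n'
    '"AA","6/11/2007","9:50am",100,32.20\n'
    '"IBM","5/13/2007","4:20pm",50,91.10\n'
)

cost_plain = portfolio_cost_from_file(plain)
cost_dated = portfolio_cost_from_file(dated)
print(cost_plain == cost_dated)   # True
```

Because the values are fetched by header name, the extra `date` and `time` columns do not affect the result.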

Modify the `report.py` program you wrote earlier so that it uses the same technique to pick out column headers.

Try running `report.py` with `Data/portfoliodate.csv` as input and check that it produces the same answer as before.

Exercise 2.17: Inverting a dictionary

A dictionary maps keys to values. For example, a dictionary of stock prices.

``````>>> prices = {
...     'GOOG' : 490.1,
...     'AA' : 23.45,
...     'IBM' : 91.1,
...     'MSFT' : 34.23
... }
>>>``````

If you use the `items()` method of a dictionary, you get `(key, value)` pairs:

``````>>> prices.items()
dict_items([('GOOG', 490.1), ('AA', 23.45), ('IBM', 91.1), ('MSFT', 34.23)])
>>>``````

But what if you wanted a list of `(value, key)` pairs instead?

Hint: use `zip()`.

``````>>> pricelist = list(zip(prices.values(),prices.keys()))
>>> pricelist
[(490.1, 'GOOG'), (23.45, 'AA'), (91.1, 'IBM'), (34.23, 'MSFT')]
>>>``````

Why would you do this? For one, it lets you perform certain kinds of data processing on the dictionary data.

``````>>> min(pricelist)
(23.45, 'AA')
>>> max(pricelist)
(490.1, 'GOOG')
>>> sorted(pricelist)
[(23.45, 'AA'), (34.23, 'MSFT'), (91.1, 'IBM'), (490.1, 'GOOG')]
>>>``````

This also illustrates an important feature of tuples. When used in comparisons, tuples are compared element by element, starting with the first item, similar to how strings are compared character by character.
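The comparison rule can be checked directly. In this sketch, the second comparison uses a made-up pair with equal prices to show the tie-breaking step; `cheapest_price` and `cheapest_name` are illustrative names.

```python
# First items differ, so they decide the result
print((23.45, 'AA') < (34.23, 'MSFT'))   # True

# First items tie, so the second items are compared ('IBM' < 'MSFT')
print((91.1, 'IBM') < (91.1, 'MSFT'))    # True

prices = {'GOOG': 490.1, 'AA': 23.45, 'IBM': 91.1, 'MSFT': 34.23}

# min() on (value, key) tuples therefore finds the lowest price
cheapest_price, cheapest_name = min(zip(prices.values(), prices.keys()))
print(cheapest_name, cheapest_price)     # AA 23.45
```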

`zip()` is often used in situations like this where you need to pair up data from different places. For example, pairing up column names with column values in order to make a dictionary of named values.

Note that `zip()` is not limited to pairs. For example, you can use it with any number of input lists:

``````>>> a = [1, 2, 3, 4]
>>> b = ['w', 'x', 'y', 'z']
>>> c = [0.2, 0.4, 0.6, 0.8]
>>> list(zip(a, b, c))
[(1, 'w', 0.2), (2, 'x', 0.4), (3, 'y', 0.6), (4, 'z', 0.8)]
>>>``````

Also, be aware that `zip()` stops once the shortest input sequence is exhausted.

``````>>> a = [1, 2, 3, 4, 5, 6]
>>> b = ['x', 'y', 'z']
>>> list(zip(a,b))
[(1, 'x'), (2, 'y'), (3, 'z')]
>>>``````
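If dropping the extra items is not what you want, the standard library's `itertools.zip_longest()` is one alternative that pads the shorter input instead. This goes slightly beyond the exercise and is shown only as a sketch; the names `truncated` and `padded` are illustrative.

```python
from itertools import zip_longest

a = [1, 2, 3, 4, 5, 6]
b = ['x', 'y', 'z']

# zip() stops at the shortest input
truncated = list(zip(a, b))
print(truncated)   # [(1, 'x'), (2, 'y'), (3, 'z')]

# zip_longest() keeps going and fills the gaps with fillvalue
padded = list(zip_longest(a, b, fillvalue=None))
print(padded)
# [(1, 'x'), (2, 'y'), (3, 'z'), (4, None), (5, None), (6, None)]
```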
