Translation: Practical Python Programming 02_04_Sequences

Time:2021-5-4

Contents | previous section (2.3 formatting) | next section (2.5 collections module)

2.4 Sequences

Sequence data types

Python has three sequence data types.

  • String: such as 'Hello'. A string is a sequence of characters.
  • List: such as [1, 4, 5].
  • Tuple: such as ('GOOG', 100, 490.1).

All sequences are ordered, indexed by integers, and have a length.

a = 'Hello'               # String
b = [1, 4, 5]             # List
c = ('GOOG', 100, 490.1)  # Tuple

# Indexed order
a[0]                      # 'H'
b[-1]                     # 5
c[1]                      # 100

# Length of sequence
len(a)                    # 5
len(b)                    # 3
len(c)                    # 3

A sequence can be replicated with the * operator: s * n.

>>> a = 'Hello'
>>> a * 3
'HelloHelloHello'
>>> b = [1, 2, 3]
>>> b * 2
[1, 2, 3, 1, 2, 3]
>>>

Sequences of the same type can be concatenated with the + operator: s + t.

>>> a = (1, 2, 3)
>>> b = (4, 5)
>>> a + b
(1, 2, 3, 4, 5)
>>>
>>> c = [1, 5]
>>> a + c
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate tuple (not "list") to tuple
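
The TypeError above can be avoided by converting one operand so that both sequences have the same type. The following is a minimal sketch of that idea; the names as_tuple and as_list are only illustrative.

```python
a = (1, 2, 3)
c = [1, 5]

# Convert the list to a tuple so both operands are tuples
as_tuple = a + tuple(c)

# Or convert the tuple to a list so both operands are lists
as_list = list(a) + c

print(as_tuple)    # (1, 2, 3, 1, 5)
print(as_list)     # [1, 2, 3, 1, 5]
```

Which conversion to pick depends on whether the result should be mutable (list) or not (tuple).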

Slicing

Slicing means extracting a subsequence from a sequence. The syntax is s[start:end], where start and end are the indexes of the subsequence you want.

a = [0,1,2,3,4,5,6,7,8]

a[2:5]    # [2,3,4]
a[-5:]    # [4,5,6,7,8]
a[:3]     # [0,1,2]

  • The indexes start and end must be integers.
  • A slice does not include the end value. It is like a half-open interval in mathematics.
  • If an index is omitted, it defaults to the beginning or the end of the sequence.
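
The rules above apply to every sequence type, not only lists. The sketch below tries them on a string and a tuple; the sample values are made up for illustration.

```python
s = 'Hello World'
t = ('GOOG', 100, 490.1)

# The end index is excluded, so s[start:end] has end - start items
first = s[0:5]
assert len(first) == 5 - 0

# Omitted indexes default to the beginning or the end of the sequence
head = s[:5]
tail = s[6:]

# Because the interval is half-open, splitting at any index
# loses nothing and duplicates nothing
assert s[:5] + s[5:] == s

# Slicing a tuple gives a tuple
pair = t[1:]

print(first, head, tail, pair)   # Hello Hello World (100, 490.1)
```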

Slicing and reassignment

On lists, slices can be reassigned and deleted.

# Reassignment
a = [0,1,2,3,4,5,6,7,8]
a[2:4] = [10,11,12]       # [0,1,10,11,12,4,5,6,7,8]

Note: reassigned slices do not need to have the same length.
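
To make the note concrete, here is a small sketch that tracks the length of the list while slices of different sizes are assigned. It only uses the list from the example above.

```python
a = [0, 1, 2, 3, 4, 5, 6, 7, 8]

a[2:4] = [10, 11, 12]     # replace 2 items with 3: the list grows by one
print(a, len(a))          # [0, 1, 10, 11, 12, 4, 5, 6, 7, 8] 10

a[2:5] = [99]             # replace 3 items with 1: the list shrinks by two
print(a, len(a))          # [0, 1, 99, 4, 5, 6, 7, 8] 8

a[1:1] = [-1, -2]         # an empty slice inserts without replacing anything
print(a)                  # [0, -1, -2, 1, 99, 4, 5, 6, 7, 8]
```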

# Deletion
a = [0,1,2,3,4,5,6,7,8]
del a[2:4]                # [0,1,4,5,6,7,8]

Sequence reductions

There are some common functions that reduce a sequence to a single value.

>>> s = [1, 2, 3, 4]
>>> sum(s)
10
>>> min(s)
1
>>> max(s)
4
>>> t = ['Hello', 'World']
>>> max(t)
'World'
>>>

Iterating over a sequence

You can use a for loop to iterate over the elements of a sequence.

>>> s = [1, 4, 9, 16]
>>> for i in s:
...     print(i)
...
1
4
9
16
>>>

In each iteration of the loop, a new item is obtained for processing. The new value is placed into the iteration variable. In this example, the iteration variable is x:

for x in s:         # `x` is an iteration variable
    ...statements

In each iteration, the previous value of the iteration variable (if any) is overwritten. After the loop finishes, the iteration variable retains its last value.
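
A short sketch of that behavior, using a placeholder initial value for x chosen only for illustration:

```python
s = [1, 4, 9, 16]
x = 'previous value'    # any earlier value of x is overwritten by the loop

for x in s:
    pass                # each iteration assigns the next item to x

print(x)                # 16, the last item of s
```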

The break statement

You can use the break statement to exit a loop early.

for name in namelist:
    if name == 'Jake':
        break
    ...
    ...
statements

When a break statement is executed, it exits the loop and moves on to the next statement. The break statement applies only to the innermost loop. If this loop is inside another loop, break does not interrupt the outer loop.
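
The innermost-loop rule can be checked with two nested loops. In this sketch the groups list and the names in it are invented sample data.

```python
groups = [['Elwood', 'Jake', 'Curtis'], ['Ray', 'Jake', 'Bob']]
visited = []

for namelist in groups:          # outer loop
    for name in namelist:        # inner loop
        if name == 'Jake':
            break                # leaves only the inner loop
        visited.append(name)
    # Execution resumes here; the outer loop moves on to the next group

print(visited)                   # ['Elwood', 'Ray']
```

Both groups are visited, which shows that break did not stop the outer loop.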

The continue statement

To skip one element and move on to the next one, use the continue statement.

for line in lines:
    if line == '\n':    # Skip blank lines
        continue
    # More statements
    ...

continue is useful when the current item is not of interest or needs to be ignored during processing.
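
A runnable version of the blank-line filter might look like the following sketch. The lines list stands in for lines read from a file and is made up for illustration.

```python
lines = ['first\n', '\n', 'second\n', '\n', 'third\n']   # made-up sample data
kept = []

for line in lines:
    if line == '\n':      # Skip blank lines
        continue
    kept.append(line.strip())

print(kept)               # ['first', 'second', 'third']
```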

Looping over integers

If you need to count, use the range() function.

for i in range(100):
    # i = 0,1,...,99
    ...

The syntax of the range() function is range([start,] end [,step]).

for i in range(100):
    # i = 0,1,...,99
    ...
for j in range(10,20):
    # j = 10,11,..., 19
    ...
for k in range(10,50,2):
    # k = 10,12,...,48
    # Notice how it counts in steps of 2, not 1.
    ...

  • The ending value is not included. This is similar to slicing.
  • start is optional. The default value is 0.
  • step is optional. The default value is 1.
  • range() computes values as needed. It does not actually store a large range of numbers.
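
The points above can be explored interactively. The sketch below assumes Python 3, where range() returns a lazy range object that supports len(), indexing, and membership tests.

```python
r = range(10, 50, 2)

print(len(r))        # 20
print(r[0], r[-1])   # 10 48 (the end value 50 is not included)
print(20 in r)       # True
print(21 in r)       # False (odd numbers are skipped by the step)

# A very large range is cheap to create; no list of numbers is built
big = range(10**12)
print(len(big))      # 1000000000000

# Convert to a list only when all the values are needed at once
print(list(range(5)))    # [0, 1, 2, 3, 4]
```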

The enumerate() function

The enumerate() function adds an extra counter value to the iteration.

names = ['Elwood', 'Jake', 'Curtis']
for i, name in enumerate(names):
    # Loops with i = 0, name = 'Elwood'
    # i = 1, name = 'Jake'
    # i = 2, name = 'Curtis'
    ...

The general form is enumerate(sequence [, start = 0]). start is optional. A good example of its use is tracking line numbers while reading a file:

with open(filename) as f:
    for lineno, line in enumerate(f, start=1):
        ...

enumerate() can be regarded as shorthand for the following:

i = 0
for x in s:
    statements
    i += 1

Using enumerate() means less typing, and it runs slightly faster.
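
The equivalence between the manual counter and enumerate() can be checked directly. This sketch collects the (counter, item) pairs from both approaches; the list names manual and counted are only illustrative.

```python
names = ['Elwood', 'Jake', 'Curtis']

# Manual counter
manual = []
i = 0
for name in names:
    manual.append((i, name))
    i += 1

# Same result with enumerate()
counted = []
for i, name in enumerate(names):
    counted.append((i, name))

assert manual == counted
print(counted)     # [(0, 'Elwood'), (1, 'Jake'), (2, 'Curtis')]

# start=1 shifts the counter, which suits line numbers
numbered = list(enumerate(names, start=1))
print(numbered)    # [(1, 'Elwood'), (2, 'Jake'), (3, 'Curtis')]
```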

for and tuples

You can iterate with multiple iteration variables:

points = [
  (1, 4),(10, 40),(23, 14),(5, 6),(7, 8)
]
for x, y in points:
    # Loops with x = 1, y = 4
    #            x = 10, y = 40
    #            x = 23, y = 14
    #            ...
    ...

When multiple variables are used, each tuple is unpacked into the set of iteration variables. The number of variables must match the number of items in each tuple.
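
The matching requirement is easy to see in practice. In this sketch, bad_points is a made-up list containing one tuple with three items, which cannot be unpacked into two variables.

```python
points = [(1, 4), (10, 40), (23, 14)]

total_x = 0
total_y = 0
for x, y in points:        # each 2-tuple unpacks into x and y
    total_x += x
    total_y += y
print(total_x, total_y)    # 34 58

# A tuple with the wrong number of items cannot be unpacked
bad_points = [(1, 4), (10, 40, 400)]
error = None
try:
    for x, y in bad_points:
        pass
except ValueError as e:
    error = str(e)
    print('ValueError:', error)
```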

The zip() function

The zip() function takes multiple sequences and makes an iterator that combines them.

columns = ['name', 'shares', 'price']
values = ['GOOG', 100, 490.1 ]
pairs = zip(columns, values)
# ('name','GOOG'), ('shares',100), ('price',490.1)

To get results, you have to iterate. You can unpack tuples with multiple variables as shown earlier.

for column, value in pairs:
    ...

A common use of zip() is to create key/value pairs for constructing dictionaries.

d = dict(zip(columns, values))
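
Because zip() returns an iterator, its pairs can be consumed only once. The sketch below shows this with the same columns and values; first_pass and second_pass are illustrative names.

```python
columns = ['name', 'shares', 'price']
values = ['GOOG', 100, 490.1]

pairs = zip(columns, values)
first_pass = list(pairs)     # consumes the iterator
second_pass = list(pairs)    # nothing is left the second time
print(first_pass)            # [('name', 'GOOG'), ('shares', 100), ('price', 490.1)]
print(second_pass)           # []

d = dict(zip(columns, values))   # a fresh zip() for the dictionary
print(d['shares'])               # 100
```

If the pairs are needed more than once, storing them in a list or a dictionary, as above, is one simple option.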

Exercises

Exercise 2.13: Counting

Try some basic counting examples:

>>> for n in range(10):            # Count 0 ... 9
        print(n, end=' ')

0 1 2 3 4 5 6 7 8 9
>>> for n in range(10,0,-1):       # Count 10 ... 1
        print(n, end=' ')

10 9 8 7 6 5 4 3 2 1
>>> for n in range(0,10,2):        # Count 0, 2, ... 8
        print(n, end=' ')

0 2 4 6 8
>>>

Exercise 2.14: More sequence operations

Experiment with some sequence reduction operations interactively.

>>> data = [4, 9, 1, 25, 16, 100, 49]
>>> min(data)
1
>>> max(data)
100
>>> sum(data)
204
>>>

Try traversing the data.

>>> for x in data:
        print(x)

4
9
...
>>> for n, x in enumerate(data):
        print(n, x)

0 4
1 9
2 1
...
>>>

Sometimes beginners combine the for statement, len(), and range() in awkward code fragments that look as if they came from an old C program.

>>> for n in range(len(data)):
        print(data[n])

4
9
1
...
>>>

Don't do that. Not only is such code hard on the eyes, it is also memory inefficient and slower. If you just want to iterate over data, use a normal for loop. If you happen to need the index for some reason, use the enumerate() function.

Exercise 2.15: A practical enumerate() example

Recall that the file Data/missing.csv contains data for a stock portfolio, but has some rows with missing values. Using enumerate(), modify your pcost.py program so that it prints a line number with the warning message when it encounters bad input.

>>> cost = portfolio_cost('Data/missing.csv')
Row 4: Couldn't convert: ['MSFT', '', '51.23']
Row 7: Couldn't convert: ['IBM', '', '70.44']
>>>

To do this, you need to modify part of the code.

...
for rowno, row in enumerate(rows, start=1):
    try:
        ...
    except ValueError:
        print(f"Row {rowno}: Couldn't convert: {row}")

Exercise 2.16: Using the zip() function

In the file Data/portfolio.csv, the first line contains column headers. In all of the previous code, we have been discarding them.

>>> import csv
>>> f = open('Data/portfolio.csv')
>>> rows = csv.reader(f)
>>> headers = next(rows)
>>> headers
['name', 'shares', 'price']
>>>

But what if you could use the headers for something useful? This is where the zip() function comes in. First, try pairing the file headers with a row of data.

>>> row = next(rows)
>>> row
['AA', '100', '32.20']
>>> list(zip(headers, row))
[('name', 'AA'), ('shares', '100'), ('price', '32.20')]
>>>

Notice how zip() paired the column headers with the column values. We used list() here to turn the result into a list so that you can see it. Normally, zip() creates an iterator that must be consumed by a for loop.

This pairing is an intermediate step in building a dictionary. Now try:

>>> record = dict(zip(headers, row))
>>> record
{'name': 'AA', 'shares': '100', 'price': '32.20'}
>>>

This transformation is one of the most useful tricks to know when processing a lot of data files. For example, suppose you want the pcost.py program to work with various input files, regardless of which column numbers hold the name, shares, and price.

Modify the portfolio_cost() function in pcost.py so that it looks like this:

# pcost.py

def portfolio_cost(filename):
    ...
        for rowno, row in enumerate(rows, start=1):
            record = dict(zip(headers, row))
            try:
                nshares = int(record['shares'])
                price = float(record['price'])
                total_cost += nshares * price
            # This catches errors in int() and float() conversions above
            except ValueError:
                print(f"Row {rowno}: Couldn't convert: {row}")
        ...

Now, try the portfolio_cost() function on a completely different data file, Data/portfoliodate.csv, which looks like this:

name,date,time,shares,price
"AA","6/11/2007","9:50am",100,32.20
"IBM","5/13/2007","4:20pm",50,91.10
"CAT","9/23/2006","1:30pm",150,83.44
"MSFT","5/17/2007","10:30am",200,51.23
"GE","2/1/2006","10:45am",95,40.37
"MSFT","10/31/2006","12:05pm",50,65.10
"IBM","7/9/2006","3:15pm",100,70.44
>>> portfolio_cost('Data/portfoliodate.csv')
44671.15
>>>

If you did it right, you'll find that your program still works, even though the data file has a completely different column format than before. That's cool!

The change made here is subtle, but significant. Instead of being hard-coded to read a single fixed file format, the new version of portfolio_cost() can read any CSV file and pick the values of interest out of it. As long as the file has the required columns, the code will work.
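
The idea can be tried without any data files. The following is a self-contained sketch, not the course's pcost.py: the function name portfolio_cost_from_lines and the two in-memory CSV texts are assumptions made for illustration, and io.StringIO stands in for an open file.

```python
import csv
import io

def portfolio_cost_from_lines(lines):
    '''Sum shares * price over CSV lines, locating columns by header name.'''
    rows = csv.reader(lines)
    headers = next(rows)
    total_cost = 0.0
    for rowno, row in enumerate(rows, start=1):
        record = dict(zip(headers, row))
        try:
            total_cost += int(record['shares']) * float(record['price'])
        except ValueError:
            print(f"Row {rowno}: Couldn't convert: {row}")
    return total_cost

# Two made-up layouts holding the same data in different column positions
layout_a = ('name,shares,price\n'
            '"AA",100,32.20\n'
            '"IBM",50,91.10\n')
layout_b = ('name,date,time,shares,price\n'
            '"AA","6/11/2007","9:50am",100,32.20\n'
            '"IBM","5/13/2007","4:20pm",50,91.10\n')

cost_a = portfolio_cost_from_lines(io.StringIO(layout_a))
cost_b = portfolio_cost_from_lines(io.StringIO(layout_b))
print(cost_a == cost_b)    # True: the column positions do not matter
```

Looking up record['shares'] and record['price'] by name is what makes the two layouts interchangeable.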

Modify the report.py program you wrote so that it uses the same technique to pick out column headers.

Try running the report.py program on the Data/portfoliodate.csv file and check that it produces the same answer as before.

Exercise 2.17: Inverting a dictionary

A dictionary maps keys to values. For example, a dictionary of stock prices:

>>> prices = {
        'GOOG' : 490.1,
        'AA' : 23.45,
        'IBM' : 91.1,
        'MSFT' : 34.23
    }
>>>

If you use the dictionary's items() method, you get (key, value) pairs:

>>> prices.items()
dict_items([('GOOG', 490.1), ('AA', 23.45), ('IBM', 91.1), ('MSFT', 34.23)])
>>>

However, what if you wanted a list of (value, key) pairs instead?

Hint: use zip().

>>> pricelist = list(zip(prices.values(),prices.keys()))
>>> pricelist
[(490.1, 'GOOG'), (23.45, 'AA'), (91.1, 'IBM'), (34.23, 'MSFT')]
>>>

Why would you do this? First, it allows you to perform certain kinds of data processing on the dictionary data.

>>> min(pricelist)
(23.45, 'AA')
>>> max(pricelist)
(490.1, 'GOOG')
>>> sorted(pricelist)
[(23.45, 'AA'), (34.23, 'MSFT'), (91.1, 'IBM'), (490.1, 'GOOG')]
>>>

Secondly, it also shows an important feature of tuples. When tuples are used in comparison, they are compared element by element starting from the first item, similar to the comparison of characters in a string one by one.
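
The element-by-element rule can be verified with a few direct comparisons. This sketch reuses the pricelist values from above.

```python
# The first items differ, so they decide the comparison
print((23.45, 'AA') < (490.1, 'GOOG'))     # True

# The first items are equal, so the second items decide
print((91.1, 'IBM') < (91.1, 'MSFT'))      # True

# Strings compare the same way, character by character
print('IBM' < 'MSFT')                      # True

# This is why min() on (value, key) tuples finds the lowest price
pricelist = [(490.1, 'GOOG'), (23.45, 'AA'), (91.1, 'IBM'), (34.23, 'MSFT')]
lowest_price, lowest_name = min(pricelist)
print(lowest_name, lowest_price)           # AA 23.45
```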

zip() is often used in situations like this, where you need to pair up data from different places. For example, pairing column names with column values in order to make a dictionary of named values.

Note that zip() is not limited to pairs. For example, you can use it with any number of input lists:

>>> a = [1, 2, 3, 4]
>>> b = ['w', 'x', 'y', 'z']
>>> c = [0.2, 0.4, 0.6, 0.8]
>>> list(zip(a, b, c))
[(1, 'w', 0.2), (2, 'x', 0.4), (3, 'y', 0.6), (4, 'z', 0.8)]
>>>

Also, be aware that zip() stops once the shortest input sequence is exhausted.

>>> a = [1, 2, 3, 4, 5, 6]
>>> b = ['x', 'y', 'z']
>>> list(zip(a,b))
[(1, 'x'), (2, 'y'), (3, 'z')]
>>>


Note: please refer to https://github.com/codists/practical-python-zh