Original address:https://miguendes.me/everythi…
By Miguel Brito
Translator: Dean Wu
This article discusses the key uses of Python's namedtuple. We will introduce the concept of namedtuple step by step, from the simple to the advanced. You'll learn why and how to use namedtuples so that your code becomes simpler. After working through this guide, you will love using them.
Learning objectives
At the end of this tutorial, you should be able to:
- Understand why and when to use namedtuple
- Convert regular tuples and dictionaries to a namedtuple
- Convert a namedtuple to a dictionary or regular tuple
- Sort a list of namedtuples
- Understand the difference between namedtuple and data classes
- Create a namedtuple with optional fields
- Serialize a namedtuple to JSON
- Add a documentation string (docstring)
Why use namedtuple?
namedtuple is a very interesting (and underrated) data structure. We can easily find Python code that relies heavily on regular tuples and dictionaries to store data. I'm not saying that this is bad; it's just that they are often abused. Let me explain.
Suppose you have a function that converts strings to colors. Colors must be represented as four-component RGBA values.
def convert_string_to_color(desc: str, alpha: float = 0.0):
    if desc == "green":
        return 50, 205, 50, alpha
    elif desc == "blue":
        return 0, 0, 255, alpha
    else:
        return 0, 0, 0, alpha
Then we can use it like this:
r, g, b, a = convert_string_to_color(desc="blue", alpha=1.0)
That works, but we have a few problems here. The first is that nothing enforces the order of the returned values at the call site. In other words, nothing prevents another developer from calling convert_string_to_color like this:
g, b, r, a = convert_string_to_color(desc="blue", alpha=1.0)
In addition, we may not know that the function returns four values. We may call the function as follows:
r, g, b = convert_string_to_color(desc="blue", alpha=1.0)
Because the function returns four values but only three names are given, the call fails with a ValueError.
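Both pitfalls can be reproduced in a few lines. The following is a minimal sketch that reuses the tuple-returning function from above; the helper names swapped_blue, misplaced_blue, and error_message are introduced here only for illustration.

```python
def convert_string_to_color(desc: str, alpha: float = 0.0):
    if desc == "green":
        return 50, 205, 50, alpha
    elif desc == "blue":
        return 0, 0, 255, alpha
    else:
        return 0, 0, 0, alpha


# Wrong order: Python cannot tell that the names are swapped.
g, b, r, a = convert_string_to_color(desc="blue", alpha=1.0)
swapped_blue = b    # holds the green channel, not the blue one
misplaced_blue = r  # the blue channel silently ended up here

# Too few names: unpacking four values into three names fails.
try:
    r, g, b = convert_string_to_color(desc="blue", alpha=1.0)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(swapped_blue, misplaced_blue, error_message)
```

The first mistake runs without any error, which is what makes it dangerous; only the second one is caught by Python itself.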
Such is the case. But, you might ask, why not use a dictionary?
Python's dictionaries are a very general data structure. They are a simple way to store multiple values. However, dictionaries are not without shortcomings. Because of their flexibility, dictionaries are easy to abuse. Let's look at the example after switching to dictionaries.
def convert_string_to_color(desc: str, alpha: float = 0.0):
    if desc == "green":
        return {"r": 50, "g": 205, "b": 50, "alpha": alpha}
    elif desc == "blue":
        return {"r": 0, "g": 0, "b": 255, "alpha": alpha}
    else:
        return {"r": 0, "g": 0, "b": 0, "alpha": alpha}
Well, we can now use it like this, expecting only one value to be returned:
color = convert_string_to_color(desc="blue", alpha=1.0)
There is no need to remember the order, but this has at least two disadvantages. The first is that we have to keep track of the key names. If we change {"r": 0, "g": 0, "b": 0, "alpha": alpha} to {"red": 0, "green": 0, "blue": 0, "a": alpha}, accessing a field raises a KeyError, because the keys r, g, b, and alpha no longer exist.
The second problem with dictionaries is that they are not hashable. This means that we can't store them in a set or use them as keys in other dictionaries. Suppose we want to track how many colors a particular image has. If we count them with collections.Counter, we get TypeError: unhashable type: 'dict'.
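A minimal sketch of that failure follows. The pixels list is made-up sample data, and the workaround at the end (counting a sorted tuple of items) is just one possible approach, shown only to highlight how awkward dictionaries are here.

```python
from collections import Counter

# Hypothetical sample data: two identical "blue" pixels stored as dicts.
pixels = [
    {"r": 0, "g": 0, "b": 255, "alpha": 1.0},
    {"r": 0, "g": 0, "b": 255, "alpha": 1.0},
]

try:
    Counter(pixels)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# One possible workaround: count a hashable snapshot of each dict instead.
counts = Counter(tuple(sorted(p.items())) for p in pixels)

print(error_message)
print(counts)
```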
Moreover, dictionaries are mutable, so anyone can add any number of new keys at any time. Believe me, this leads to nasty bugs that are hard to spot.
Okay, good. So what now? What can we use instead?
namedtuple! Yes, that's it!
Let's convert our function to use namedtuple:
from collections import namedtuple
...
Color = namedtuple("Color", "r g b alpha")
...
def convert_string_to_color(desc: str, alpha: float = 0.0):
    if desc == "green":
        return Color(r=50, g=205, b=50, alpha=alpha)
    elif desc == "blue":
        # Same values as the tuple and dict versions above
        return Color(r=0, g=0, b=255, alpha=alpha)
    else:
        return Color(r=0, g=0, b=0, alpha=alpha)
As with the dict version, we can assign the result to a single variable and use its fields as needed. There is no need to remember the order. Moreover, if you use an IDE such as PyCharm or VS Code, you get autocompletion for the field names.
color = convert_string_to_color(desc="blue", alpha=1.0)
...
has_alpha = color.alpha > 0.0
...
is_black = color.r == 0 and color.g == 0 and color.b == 0
Most important of all, namedtuple is immutable. If another developer on the team thinks it's a good idea to add a new field at run time, the program will raise an error.
>>> blue = Color(r=0, g=0, b=255, alpha=1.0)
>>> blue.e = 0
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-13-8c7f9b29c633> in <module>
----> 1 blue.e = 0
AttributeError: 'Color' object has no attribute 'e'
Not only that, we can now use Counter to track how many colors a collection has.
>>> from collections import Counter
>>> Counter([blue, blue])
Counter({Color(r=0, g=0, b=255, alpha=1.0): 2})
How to convert a regular tuple or dictionary to a namedtuple
Now that we know why to use namedtuple, it's time to learn how to convert regular tuples and dictionaries into namedtuples. Suppose, for some reason, that you have a dictionary instance that contains RGBA color values. If you want to convert it to the Color namedtuple, you can do the following:
>>> c = {"r": 50, "g": 205, "b": 50, "alpha": 0}
>>> Color(**c)
Color(r=50, g=205, b=50, alpha=0)
We can use the ** syntax to unpack the dict into the namedtuple.
What if I want to create the namedtuple type itself from a dict?
No problem. Here's how to do it:
>>> c = {"r": 50, "g": 205, "b": 50, "alpha": 0}
>>> Color = namedtuple("Color", c)
>>> Color(**c)
Color(r=50, g=205, b=50, alpha=0)
By passing the dict instance to the namedtuple factory function, the field names are created for you from its keys. Then Color unpacks the dictionary c, as in the example above, to create a new instance.
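The heading also mentions regular tuples, so here is a minimal sketch for that case. It shows two options: positional unpacking with *, and the _make classmethod that namedtuple classes provide. The rgba tuple is sample data chosen for illustration.

```python
from collections import namedtuple

Color = namedtuple("Color", "r g b alpha")

rgba = (50, 205, 50, 0.5)  # a regular tuple in r, g, b, alpha order

from_star = Color(*rgba)       # positional unpacking
from_make = Color._make(rgba)  # classmethod that accepts any iterable

print(from_star)
print(from_make)
```

Both forms depend on the tuple having exactly four items in the right order, so the ordering problem from the first section still applies at the point of conversion.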
How to convert a namedtuple to a dictionary or regular tuple
We just learned how to convert a dict to a namedtuple. What about the other way around? How can we convert a namedtuple into a dictionary instance?
It turns out that namedtuple has a method called ._asdict(). Therefore, converting it is as simple as calling a method.
>>> blue = Color(r=0, g=0, b=255, alpha=1.0)
>>> blue._asdict()
{'r': 0, 'g': 0, 'b': 255, 'alpha': 1.0}
You may wonder why the method name starts with _. This is one of the places where namedtuple departs from Python's usual conventions. Usually, a leading _ marks a private method or attribute. However, namedtuple adds it to its public methods to avoid name conflicts with field names. Besides _asdict, there are also _replace, _fields, and _field_defaults. You can find all of them in the official documentation.
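To make the other underscore-prefixed members concrete, here is a small sketch. It assumes Python 3.7 or later, because it uses the defaults argument so that _field_defaults has something to show; the default alpha of 1.0 is chosen only for illustration.

```python
from collections import namedtuple

# `defaults` applies to the rightmost fields, here only `alpha`.
Color = namedtuple("Color", "r g b alpha", defaults=[1.0])

blue = Color(r=0, g=0, b=255)

# _replace returns a new instance; the original stays unchanged.
translucent_blue = blue._replace(alpha=0.5)

field_names = Color._fields             # tuple of field names
field_defaults = Color._field_defaults  # mapping of field name to default

print(translucent_blue)
print(field_names)
print(field_defaults)
```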
To convert a namedtuple to a regular tuple, just pass it to the tuple constructor.
>>> tuple(Color(r=50, g=205, b=50, alpha=0.1))
(50, 205, 50, 0.1)
How to sort a list of namedtuples
Another common use case is to store multiple namedtuples in a list and sort them by some criterion. For example, suppose we have a list of colors that we need to sort by alpha intensity.
Fortunately, Python lets you do this in a very Pythonic way. We can use operator.attrgetter. According to the documentation, attrgetter "returns a callable object that fetches attr from its operand". In short, we can use it to pick the field that the sorted function uses as its sort key. For example:
from operator import attrgetter
...
colors = [
    Color(r=50, g=205, b=50, alpha=0.1),
    Color(r=50, g=205, b=50, alpha=0.5),
    Color(r=50, g=0, b=0, alpha=0.3)
]
...
>>> sorted(colors, key=attrgetter("alpha"))
[Color(r=50, g=205, b=50, alpha=0.1),
 Color(r=50, g=0, b=0, alpha=0.3),
 Color(r=50, g=205, b=50, alpha=0.5)]
Now, the list of colors is in ascending order of alpha intensity!
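Two variations on the same idea may be useful: sorting in descending order, and sorting by more than one field. The sketch below reuses the colors list from the example above; the variable names by_alpha_desc and by_green_then_alpha are introduced here for illustration.

```python
from collections import namedtuple
from operator import attrgetter

Color = namedtuple("Color", "r g b alpha")

colors = [
    Color(r=50, g=205, b=50, alpha=0.1),
    Color(r=50, g=205, b=50, alpha=0.5),
    Color(r=50, g=0, b=0, alpha=0.3),
]

# Highest alpha first.
by_alpha_desc = sorted(colors, key=attrgetter("alpha"), reverse=True)

# With several names, attrgetter returns a tuple of values, so this
# sorts by green first and by alpha among colors with equal green.
by_green_then_alpha = sorted(colors, key=attrgetter("g", "alpha"))

print([c.alpha for c in by_alpha_desc])
print([(c.g, c.alpha) for c in by_green_then_alpha])
```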
How to serialize namedtuples to JSON
Sometimes you may need to store a namedtuple as JSON. Python dictionaries can be converted to JSON with the json module, so we can use the _asdict method to convert the tuple into a dictionary and then serialize it just like any other dictionary. For example:
>>> blue = Color(r=0, g=0, b=255, alpha=1.0)
>>> import json
>>> json.dumps(blue._asdict())
'{"r": 0, "g": 0, "b": 255, "alpha": 1.0}'
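It may help to see why _asdict matters here and how to get the namedtuple back. In this minimal sketch, passing the namedtuple straight to json.dumps produces a JSON array without field names (it is still a tuple), while the _asdict form can be restored with the ** unpacking shown earlier. The variable names are illustrative.

```python
import json
from collections import namedtuple

Color = namedtuple("Color", "r g b alpha")
blue = Color(r=0, g=0, b=255, alpha=1.0)

as_object = json.dumps(blue._asdict())  # JSON object, field names kept
as_array = json.dumps(blue)             # JSON array, field names lost

# Round trip: parse the JSON object and unpack it into a new Color.
restored = Color(**json.loads(as_object))

print(as_object)
print(as_array)
print(restored == blue)
```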
How to add a docstring to a namedtuple
In Python, we can document methods, classes, and modules with plain strings. The string is then available through a special attribute named __doc__. That said, how do we add a docstring to our Color namedtuple?
We can do this in two ways. The first (more cumbersome) one is to extend the tuple with a wrapper class and define the docstring in that wrapper. For example, consider the following code snippet:
_Color = namedtuple("Color", "r g b alpha")
class Color(_Color):
    """A namedtuple that represents a color.
    It has 4 fields:
    r - red
    g - green
    b - blue
    alpha - the alpha channel
    """
>>> print(Color.__doc__)
A namedtuple that represents a color.
It has 4 fields:
r - red
g - green
b - blue
alpha - the alpha channel
>>> help(Color)
Help on class Color in module __main__:
class Color(Color)
| Color(r, g, b, alpha)
|
| A namedtuple that represents a color.
| It has 4 fields:
| r - red
| g - green
| b - blue
| alpha - the alpha channel
|
| Method resolution order:
| Color
| Color
| builtins.tuple
| builtins.object
|
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
As shown above, by inheriting from the _Color tuple, we add a __doc__ attribute.
The second way is to set the __doc__ attribute directly. This method does not require extending the tuple.
>>> Color.__doc__ = """A namedtuple that represents a color.
It has 4 fields:
r - red
g - green
b - blue
alpha - the alpha channel
"""
Note that these methods only work on Python 3+.
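The direct-assignment approach also works for individual fields, which the Python documentation describes for namedtuple. The sketch below is illustrative only; the docstring texts are made up for this example.

```python
from collections import namedtuple

Color = namedtuple("Color", "r g b alpha")

# Class-level docstring, set directly as in the second method above.
Color.__doc__ = "A namedtuple that represents an RGBA color."

# Per-field docstrings (wording here is illustrative).
Color.r.__doc__ = "Red channel"
Color.alpha.__doc__ = "Alpha channel (opacity)"

print(Color.__doc__)
print(Color.r.__doc__)
```

With this in place, help(Color) lists the custom text next to each field instead of the generic "Alias for field number N".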
What is the difference between namedtuples and data classes?
Features
Before Python 3.7, you could create a simple data container using any of the following methods:
- namedtuple
- A regular class
- A third-party library, such as attrs
If you use a regular class, you have to implement several methods yourself. For example, a regular class needs an __init__ method to set attributes during instantiation. If you want the class to be hashable, you have to implement __hash__ yourself. To compare objects, you also need an __eq__ method. Finally, to simplify debugging, you need a __repr__ method.
Let’s use the regular class to implement our color use case.
class Color:
    """A regular class that represents a color."""
    def __init__(self, r, g, b, alpha=0.0):
        self.r = r
        self.g = g
        self.b = b
        self.alpha = alpha
    def __hash__(self):
        return hash((self.r, self.g, self.b, self.alpha))
    def __repr__(self):
        return "{0}({1}, {2}, {3}, {4})".format(
            self.__class__.__name__, self.r, self.g, self.b, self.alpha
        )
    def __eq__(self, other):
        if not isinstance(other, Color):
            return False
        return (
            self.r == other.r
            and self.g == other.g
            and self.b == other.b
            and self.alpha == other.alpha
        )
As you can see, there are many methods to implement, when all you need is a container that holds the data for you without distracting details. Also, a key difference, and one reason people prefer classes, is that regular classes are mutable.
In fact, PEP 557, which introduced data classes, describes them as "mutable namedtuples with defaults" (https://docs.python.org/zh-cn…).
Now, let's see how to implement this with a data class.
from dataclasses import dataclass
...
@dataclass
class Color:
    """A regular class that represents a color."""
    r: float
    g: float
    b: float
    alpha: float
Wow! It's that simple. Because the __init__ method is generated for you, you just need to declare the attributes after the docstring. In addition, you must annotate them with type hints.
In addition to being mutable, data classes support optional fields out of the box. Suppose our Color class doesn't always need an alpha field. Then we can make it optional.
from dataclasses import dataclass
from typing import Optional
...
@dataclass
class Color:
    """A regular class that represents a color."""
    r: float
    g: float
    b: float
    # A default value is what makes the field optional at instantiation
    alpha: Optional[float] = None
We can instantiate it like this:
>>> blue = Color(r=0, g=0, b=255)
Because they are mutable, we can change any field we need:
>>> blue = Color(r=0, g=0, b=255)
>>> blue.r = 1
>>> # You can even set additional attributes
>>> blue.e = 10
By contrast, namedtuple has no optional fields by default. Adding them takes a bit of extra code and a small trick.
Tip: to have a __hash__ method generated for a data class, set unsafe_hash to True:
@dataclass(unsafe_hash=True)
class Color:
    ...
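Note that unsafe_hash=True leaves instances mutable, which is where the "unsafe" comes from. An alternative, sketched here under the assumption that immutability is acceptable for your use case, is frozen=True: together with the default eq=True, the decorator then generates __hash__ and blocks attribute assignment. The default alpha of 1.0 is chosen only for illustration.

```python
from collections import Counter
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Color:
    """An immutable, hashable data class that represents a color."""
    r: float
    g: float
    b: float
    alpha: float = 1.0


blue = Color(r=0, g=0, b=255)

try:
    blue.r = 1  # assignment is rejected on frozen instances
    mutated = True
except FrozenInstanceError:
    mutated = False

# Hashable, so it works with Counter just like the namedtuple did.
counts = Counter([blue, Color(r=0, g=0, b=255)])

print(mutated)
print(counts[blue])
```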
Another difference is that unpacking is a first-class feature of namedtuples. If you want the same behavior from a data class, you must implement it yourself.
from dataclasses import dataclass, astuple
...
@dataclass
class Color:
    """A regular class that represents a color."""
    r: float
    g: float
    b: float
    alpha: float
    def __iter__(self):
        # astuple is imported by name above, so no module prefix is needed
        yield from astuple(self)
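For comparison, here is a short sketch of the difference at the call site. It also shows an alternative that needs no __iter__ method: calling astuple explicitly wherever unpacking is needed. The class names ColorTuple and ColorClass are used here only to keep the two variants apart.

```python
from collections import namedtuple
from dataclasses import astuple, dataclass

ColorTuple = namedtuple("ColorTuple", "r g b alpha")


@dataclass
class ColorClass:
    r: float
    g: float
    b: float
    alpha: float


# A namedtuple unpacks out of the box.
r, g, b, alpha = ColorTuple(r=0, g=0, b=255, alpha=1.0)

# A plain data class does not define __iter__, so convert it explicitly.
r2, g2, b2, alpha2 = astuple(ColorClass(r=0, g=0, b=255, alpha=1.0))

print((r, g, b, alpha) == (r2, g2, b2, alpha2))
```

Keep in mind that astuple also converts nested data classes recursively, which may or may not be what you want for larger structures.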
Performance comparison
Comparing features alone is not enough; namedtuples and data classes also differ in performance. Data classes are implemented in pure Python on top of a dict, which keeps field access fast. Namedtuples, on the other hand, are a thin extension of the regular tuple, so their implementation relies on fast C code and has a smaller memory footprint.
To check this, consider the following experiment, run on Python 3.8.5.
In [6]: import sys
In [7]: ColorTuple = namedtuple("Color", "r g b alpha")
In [8]: @dataclass
   ...: class ColorClass:
   ...:     """A regular class that represents a color."""
   ...:     r: float
   ...:     g: float
   ...:     b: float
   ...:     alpha: float
   ...:
In [9]: color_tup = ColorTuple(r=50, g=205, b=50, alpha=1.0)
In [10]: color_cls = ColorClass(r=50, g=205, b=50, alpha=1.0)
In [11]: %timeit color_tup.r
36.8 ns ± 0.109 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [12]: %timeit color_cls.r
38.4 ns ± 0.112 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [15]: sys.getsizeof(color_tup)
Out[15]: 72
In [16]: sys.getsizeof(color_cls) + sys.getsizeof(vars(color_cls))
Out[16]: 152
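As shown above, in this run the two access times are very close (36.8 ns for the namedtuple versus 38.4 ns for the data class), while the data class instance takes roughly twice as much memory as the namedtuple (152 bytes versus 72 bytes).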
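If you want to repeat the measurement outside IPython, the following sketch uses the standard timeit and sys modules instead of the %timeit magic. Absolute numbers depend on the machine and the Python version, so no expected timings are given here; the variable names are illustrative.

```python
import sys
import timeit
from collections import namedtuple
from dataclasses import dataclass

ColorTuple = namedtuple("ColorTuple", "r g b alpha")


@dataclass
class ColorClass:
    r: float
    g: float
    b: float
    alpha: float


color_tup = ColorTuple(r=50, g=205, b=50, alpha=1.0)
color_cls = ColorClass(r=50, g=205, b=50, alpha=1.0)

# Total seconds for one million attribute reads of each kind.
tup_seconds = timeit.timeit("c.r", globals={"c": color_tup}, number=1_000_000)
cls_seconds = timeit.timeit("c.r", globals={"c": color_cls}, number=1_000_000)

# Same memory accounting as the experiment above: the data class
# instance plus the dict that holds its attributes.
tup_bytes = sys.getsizeof(color_tup)
cls_bytes = sys.getsizeof(color_cls) + sys.getsizeof(vars(color_cls))

print(f"namedtuple: {tup_seconds:.3f} s per million reads, {tup_bytes} bytes")
print(f"data class: {cls_seconds:.3f} s per million reads, {cls_bytes} bytes")
```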
How to add type hints to namedtuple
Data classes use type hints by default. We can also add them to namedtuples. We can annotate the Color tuple by importing typing.NamedTuple and inheriting from it.
from typing import NamedTuple
...
class Color(NamedTuple):
    """A namedtuple that represents a color."""
    r: float
    g: float
    b: float
    alpha: float
Another detail that may go unnoticed is that this approach also lets us add a docstring. If we type help(Color), we can see it.
Help on class Color in module __main__:
class Color(builtins.tuple)
| Color(r: float, g: float, b: float, alpha: float)
|
| A namedtuple that represents a color.
|
| Method resolution order:
| Color
| builtins.tuple
| builtins.object
|
| Methods defined here:
|
| __getnewargs__(self)
| Return self as a plain tuple. Used by copy and pickle.
|
| __repr__(self)
| Return a nicely formatted representation string
|
| _asdict(self)
| Return a new dict which maps field names to their values.
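A class built with typing.NamedTuple behaves like any other namedtuple at run time. The sketch below illustrates two points that are easy to miss: instances are still tuples with the usual helper methods, and the type hints are not enforced when the code runs (a static type checker is needed for that). The variable names are illustrative.

```python
from typing import NamedTuple


class Color(NamedTuple):
    """A namedtuple that represents a color."""
    r: float
    g: float
    b: float
    alpha: float


blue = Color(r=0, g=0, b=255, alpha=1.0)

is_tuple = isinstance(blue, tuple)  # still a real tuple
as_dict = blue._asdict()            # the usual helpers are available

# Annotations are not checked at run time, so this does not raise.
unchecked = Color(r="zero", g=0, b=0, alpha=1.0)

print(is_tuple, as_dict, unchecked.r)
```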
How to add optional default values to namedtuple
In the previous section, we learned that data classes can have optional values. I also mentioned that imitating this behavior with namedtuple takes a small trick. It turns out that we can use inheritance, as shown in the following example.
from collections import namedtuple
class Color(namedtuple("Color", "r g b alpha")):
    __slots__ = ()
    def __new__(cls, r, g, b, alpha=None):
        return super().__new__(cls, r, g, b, alpha)
>>> c = Color(r=0, g=0, b=0)
>>> c
Color(r=0, g=0, b=0, alpha=None)
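On Python 3.7 and later there are two shorter alternatives to the inheritance trick, sketched below: the defaults argument of the namedtuple factory, and default values on a typing.NamedTuple subclass. The class name TypedColor is introduced here only to keep the two variants apart.

```python
from collections import namedtuple
from typing import NamedTuple, Optional

# Option 1: `defaults` applies to the rightmost fields, here `alpha`.
Color = namedtuple("Color", "r g b alpha", defaults=[None])


# Option 2: default values declared on a typing.NamedTuple subclass.
class TypedColor(NamedTuple):
    r: float
    g: float
    b: float
    alpha: Optional[float] = None


c1 = Color(r=0, g=0, b=0)
c2 = TypedColor(r=0, g=0, b=0)

print(c1)
print(c2)
```

In both variants, fields with defaults must come after the fields without them, just as with function arguments.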
Conclusion
Namedtuples are a very powerful data structure. They make our code cleaner and more reliable. Although the competition from the new data classes is fierce, they still have plenty of valid use cases. In this tutorial, we learned several ways to use namedtuples, and I hope you will put them to use.