Python Standard Library - 13. Built-in types: binary sequence types (bytes, bytearray)

Time:2019-12-2

Previous article: Python Standard Library — 12. Built in type: text sequence type (STR)
Next article:

Binary sequence types: bytes, bytearray, memoryview

The core built-in types for manipulating binary data are bytes and bytearray. They are supported by memoryview, which uses the buffer protocol to access the memory of other binary objects without needing to make a copy.
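As a minimal sketch of the no-copy behaviour described above (the variable names are illustrative only), a memoryview over a bytearray can be sliced and written through, and the change shows up in the original object:

```python
# Illustrative names; any object supporting the buffer protocol could be used.
data = bytearray(b"hello world")
view = memoryview(data)

# Slicing a memoryview gives another view onto the same memory, not a copy.
first_word = view[0:5]

# Writing through the view modifies the underlying bytearray.
first_word[0] = ord("J")

result = bytes(data)
print(result)  # b'Jello world'
```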

The array module supports efficient storage of basic data types, such as 32-bit integers and IEEE754 double precision floating-point values.

Bytes object

The bytes object is an immutable sequence of single bytes. Because many major binary protocols are based on the ASCII text encoding, bytes objects offer several methods that are only valid when working with ASCII-compatible data, and they are closely related to string objects in a variety of other ways.

class bytes([source[, encoding[, errors]]])

First, the syntax for bytes literals is largely the same as that for string literals, except that a b prefix is added:

    Single quotes: b'still allows embedded "double" quotes'

    Double quotes: b"still allows embedded 'single' quotes"

    Triple quoted: b'''3 single quotes''', b"""3 double quotes"""

Only ASCII characters are permitted in bytes literals (regardless of the declared source code encoding). Any binary value over 127 must be entered into a bytes literal using the appropriate escape sequence.

As with string literals, bytes literals may also use an r prefix to disable processing of escape sequences. See String and Bytes literals for more about the various forms of bytes literal, including the supported escape sequences.

While bytes literals and representations are based on ASCII text, bytes objects actually behave like immutable sequences of integers, with each value in the sequence restricted such that 0 <= x < 256 (attempts to violate this restriction raise ValueError). This is done deliberately to emphasise that while many binary formats include ASCII-based elements and can be usefully manipulated with some text-oriented algorithms, this is not generally the case for arbitrary binary data (blindly applying text processing algorithms to binary data formats that are not ASCII compatible will usually lead to data corruption).

In addition to the literal forms, bytes objects can be created in a number of other ways:
  • A zero-filled bytes object of a specified length: bytes(10)
  • From an iterable of integers: bytes(range(20))
  • Copying existing binary data via the buffer protocol: bytes(obj)
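The construction forms listed above can be sketched as follows. The variable names are illustrative, and the string-plus-encoding form comes from the class signature rather than from the list:

```python
zeros = bytes(3)                    # b'\x00\x00\x00'
from_ints = bytes([72, 105])        # b'Hi'
copied = bytes(bytearray(b"Hi"))    # copy made via the buffer protocol
encoded = bytes("\u00e9", "utf-8")  # text needs an explicit encoding: b'\xc3\xa9'

# Each item must satisfy 0 <= x < 256, otherwise ValueError is raised.
try:
    bytes([256])
except ValueError:
    out_of_range = True
else:
    out_of_range = False
```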

See also the bytes built-in.

Since two hexadecimal digits correspond precisely to a single byte, hexadecimal numbers are a commonly used format for describing binary data. Accordingly, the bytes type has an additional class method to read data in that format:

classmethod fromhex(string)

This bytes class method returns a bytes object, decoding the given string object. The string must contain two hexadecimal digits per byte, with ASCII whitespace being ignored.

    >>> bytes.fromhex('2Ef0 F1f2  ')
    b'.\xf0\xf1\xf2'

Changed in version 3.7: bytes.fromhex() now skips all ASCII whitespace in the string, not just spaces.

There is a reverse conversion function that converts the bytes object to the corresponding hexadecimal representation.

hex()

Returns a string object that contains two hexadecimal digits for each byte in the instance.

    >>> b'\xf0\xf1\xf2'.hex()
    'f0f1f2'

New in version 3.5.
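A short sketch of the round trip between fromhex() and hex(), using illustrative variable names. It also shows that a string which does not contain two hexadecimal digits per byte is rejected with ValueError:

```python
raw = bytes.fromhex("dead beef")   # whitespace between digit pairs is skipped
text = raw.hex()                   # 'deadbeef'
round_trip = bytes.fromhex(text)

# An odd number of hex digits cannot be mapped onto whole bytes.
try:
    bytes.fromhex("abc")
except ValueError:
    rejected = True
else:
    rejected = False
```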

Since bytes objects are sequences of integers (akin to a tuple), for a bytes object b, b[0] will be an integer, while b[0:1] will be a bytes object of length 1. (This contrasts with text strings, where both indexing and slicing produce a string of length 1.)

The representation of bytes objects uses the literal format (b'...') since it is often more useful than, for example, bytes([46, 46, 46]). You can always convert a bytes object into a list of integers using list(b).
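The difference between indexing and slicing can be seen in a small example (names are illustrative):

```python
b = b"abc"
first_item = b[0]      # an int: 97
first_slice = b[0:1]   # a bytes object of length 1: b'a'

s = "abc"
str_item = s[0]        # for text, indexing gives a str: 'a'
str_slice = s[0:1]     # and so does slicing: 'a'

as_list = list(b)      # [97, 98, 99]
```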

Note

For Python 2.x users: in the Python 2.x series, a variety of implicit conversions between 8-bit strings (the closest thing 2.x offers to a built-in binary data type) and Unicode strings were permitted. This was a backwards compatibility workaround to account for the fact that Python originally only supported 8-bit text, and Unicode text was a later addition. In Python 3.x, those implicit conversions are gone: conversions between 8-bit binary data and Unicode text must be explicit, and bytes and string objects will always compare unequal.

Bytearray object

A bytearray object is a mutable counterpart to a bytes object.

class bytearray([source[, encoding[, errors]]])

There is no dedicated literal syntax for bytearray objects; instead, they are always created by calling the constructor:
  • Creating an empty instance: bytearray()
  • Creating a zero-filled instance with a given length: bytearray(10)
  • From an iterable of integers: bytearray(range(20))
  • Copying existing binary data via the buffer protocol: bytearray(b'Hi!')

As bytearray objects are mutable, they support the mutable sequence operations in addition to the common bytes and bytearray operations described in Bytes and bytearray operations.
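A brief sketch of a few mutable sequence operations on a bytearray (the buffer name is illustrative). Each statement changes the same object in place:

```python
buf = bytearray(b"hello")

buf[0] = ord("J")      # item assignment takes an integer
buf.append(0x21)       # append a single byte value (b'!')
buf.extend(b"??")      # extend from any iterable of integers
del buf[-2:]           # delete a slice
buf[1:3] = b"EL"       # slice assignment takes a bytes-like object

print(buf)  # bytearray(b'JELlo!')
```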

See also the bytearray built-in.

Since two hexadecimal digits correspond precisely to a single byte, hexadecimal numbers are a commonly used format for describing binary data. Accordingly, the bytearray type has an additional class method to read data in that format:

classmethod fromhex(string)

This bytearray class method returns a bytearray object, decoding the given string object. The string must contain two hexadecimal digits per byte, with ASCII whitespace being ignored.

    >>> bytearray.fromhex('2Ef0 F1f2  ')
    bytearray(b'.\xf0\xf1\xf2')

Changed in version 3.7: bytearray.fromhex() now skips all ASCII whitespace in the string, not just spaces.

There is a reverse conversion function that converts a bytearray object to its corresponding hexadecimal representation.

hex()

Returns a string object that contains two hexadecimal digits for each byte in the instance.

    >>> bytearray(b'\xf0\xf1\xf2').hex()
    'f0f1f2'

New in version 3.5.

Since bytearray objects are sequences of integers (akin to a list), for a bytearray object b, b[0] will be an integer, while b[0:1] will be a bytearray object of length 1. (This contrasts with text strings, where both indexing and slicing produce a string of length 1.)

The representation of bytearray objects uses the bytes literal format (bytearray(b'...')) since it is often more useful than, for example, bytearray([46, 46, 46]). You can always convert a bytearray object into a list of integers using list(b).

Bytes and bytearray operations

Both bytes and bytearray objects support the common sequence operations. They interoperate not just with operands of the same type, but with any bytes-like object. Due to this flexibility, they can be freely mixed in operations without causing errors. However, the return type of the result may depend on the order of the operands.
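For instance, with concatenation the type of the left operand determines the type of the result. A minimal sketch with illustrative names:

```python
left_bytes = b"ab" + bytearray(b"cd")      # left operand is bytes
left_bytearray = bytearray(b"ab") + b"cd"  # left operand is bytearray

print(type(left_bytes).__name__)      # bytes
print(type(left_bytearray).__name__)  # bytearray

# The contents still compare equal across the two types.
same_content = left_bytes == left_bytearray
```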

Note

The methods on bytes and bytearray objects do not accept strings as their arguments, just as the methods on strings do not accept bytes as their arguments. For example, you have to write:

a = "abc"
b = a.replace("a", "f")

And:

a = b"abc"
b = a.replace(b"a", b"f")

Some bytes and bytearray operations assume the use of ASCII-compatible binary formats, and hence should be avoided when working with arbitrary binary data. These restrictions are covered below.

Note

Using these ASCII-based operations to manipulate binary data that is not stored in an ASCII-based format may lead to data corruption.
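To illustrate the kind of corruption meant here, the sketch below packs a 16-bit integer whose bytes happen to fall in the ASCII lowercase range and then applies upper() to it. The choice of struct and of the value 0x6162 is purely illustrative:

```python
import struct

# 0x6162 packs to the two bytes b'ba' in little-endian order.
packed = struct.pack("<H", 0x6162)

# upper() treats those bytes as ASCII letters and rewrites them.
mangled = packed.upper()

original_value = struct.unpack("<H", packed)[0]
mangled_value = struct.unpack("<H", mangled)[0]

print(hex(original_value))  # 0x6162
print(hex(mangled_value))   # 0x4142
```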

The following methods on bytes and bytearray objects can be used with arbitrary binary data.

bytes.count(sub[, start[, end]])

bytearray.count(sub[, start[, end]])

Returns the number of non-overlapping occurrences of the subsequence sub in the range [start, end]. The optional arguments start and end are interpreted as in slice notation.

The subsequence to search for may be any bytes-like object or an integer in the range 0 to 255.

Changed in version 3.3: integers in the range of 0 to 255 are also accepted as subsequences.
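A small example of count(), showing non-overlapping matching, the integer form, and the start and end arguments (the sample data is arbitrary):

```python
data = b"abababa"

# Matches are found at index 0 and index 4; the overlap at index 2 is skipped.
non_overlapping = data.count(b"aba")     # 2

# An integer in the range 0 to 255 counts a single byte value.
count_a = data.count(ord("a"))           # 4

# start and end behave like a slice: data[1:4] is b'bab'.
in_slice = data.count(b"a", 1, 4)        # 1
```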

bytes.decode(encoding="utf-8", errors="strict")

bytearray.decode(encoding="utf-8", errors="strict")

Returns a string decoded from the given bytes. The default encoding is 'utf-8'. errors may be given to set a different error handling scheme. The default for errors is 'strict', meaning that encoding errors raise a UnicodeError. Other possible values are 'ignore', 'replace', and any other name registered via codecs.register_error(); see the Error Handlers section. For a list of possible encodings, see the Standard Encodings section.

Note

Passing the encoding argument to str allows decoding any bytes-like object directly, without needing to make a temporary bytes or bytearray object.

Changed in version 3.1: added support for keyword arguments.
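The error handlers and the str shortcut can be sketched as follows. The sample byte strings are made up for illustration; b"caf\xe9" is Latin-1 data and is deliberately not valid UTF-8:

```python
raw = b"caf\xc3\xa9"              # valid UTF-8
text = raw.decode()               # default encoding is UTF-8

bad = b"caf\xe9"                  # not valid UTF-8
try:
    bad.decode("utf-8")           # errors="strict" is the default
except UnicodeDecodeError:        # a subclass of UnicodeError
    strict_failed = True
else:
    strict_failed = False

replaced = bad.decode("utf-8", errors="replace")   # 'caf\ufffd'
ignored = bad.decode("utf-8", errors="ignore")     # 'caf'

# str() with an encoding decodes any bytes-like object directly.
via_str = str(memoryview(raw), "utf-8")
```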

bytes.endswith(suffix[, start[, end]])

bytearray.endswith(suffix[, start[, end]])

Returns True if the binary data ends with the specified suffix, otherwise returns False. suffix can also be a tuple of suffixes to look for. With optional start, the test begins at that position. With optional end, the comparison stops at that position.

The suffix(es) to search for may be any bytes-like object.

bytes.find(sub[, start[, end]])

bytearray.find(sub[, start[, end]])

Returns the lowest index in the data where the subsequence sub is found, such that sub is contained in the slice s[start:end]. The optional arguments start and end are interpreted as in slice notation. Returns -1 if sub is not found.

The subsequence to search for may be any bytes-like object or an integer in the range 0 to 255.

Note

The find() method should be used only if you need to know the position of sub. To check whether sub is a substring, use the in operator:
    >>> b'Py' in b'Python'
    True
Changed in version 3.3: integers in the range of 0 to 255 are also accepted as subsequences.

bytes.index(sub[, start[, end]])

bytearray.index(sub[, start[, end]])

Like find(), but raises ValueError when the subsequence is not found.

The subsequence to search for may be any bytes-like object or an integer in the range 0 to 255.

Changed in version 3.3: integers in the range of 0 to 255 are also accepted as subsequences.
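The difference between find() and index() when nothing is found can be shown with a short example (the sample data is arbitrary):

```python
data = b"Python"

pos = data.find(b"th")          # 2
missing = data.find(b"xyz")     # -1 signals "not found"
by_int = data.find(ord("o"))    # 4, searching for a single byte value

# index() reports a missing subsequence by raising ValueError instead.
try:
    data.index(b"xyz")
except ValueError:
    index_raised = True
else:
    index_raised = False
```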

bytes.join(iterable)

bytearray.join(iterable)

Returns a bytes or bytearray object which is the concatenation of the binary data sequences in iterable. A TypeError is raised if there are any values in iterable that are not bytes-like objects, including str objects. The separator between elements is the contents of the bytes or bytearray object providing this method.
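A minimal sketch of join() with mixed bytes-like items, and of the TypeError raised for a str item. The list contents are illustrative:

```python
parts = [b"a", bytearray(b"b"), memoryview(b"c")]

# The object the method is called on supplies the separator and result type.
joined = b", ".join(parts)                 # b'a, b, c'
joined_ba = bytearray(b"-").join(parts)    # bytearray(b'a-b-c')

# A str item is not a bytes-like object, so join() refuses it.
try:
    b",".join([b"a", "b"])
except TypeError:
    str_rejected = True
else:
    str_rejected = False
```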

static bytes.maketrans(from, to)

static bytearray.maketrans(from, to)

This static method returns a translation table usable for bytes.translate() that maps each character in from to the character at the same position in to; from and to must both be bytes-like objects and have the same length.

New in version 3.1.
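As a brief sketch, a table built with maketrans() can be passed to translate(), optionally together with bytes to delete (sample values are arbitrary):

```python
table = bytes.maketrans(b"abc", b"xyz")

translated = b"aabbcc".translate(table)         # b'xxyyzz'

# Bytes listed in the second argument are removed before mapping.
stripped = b"a-b-c".translate(table, b"-")      # b'xyz'

table_length = len(table)                       # always 256
```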

bytes.partition(sep)

bytearray.partition(sep)

Splits the sequence at the first occurrence of sep, and returns a 3-tuple containing the part before the separator, the separator itself or its bytearray copy, and the part after the separator. If the separator is not found, returns a 3-tuple containing a copy of the original sequence, followed by two empty bytes or bytearray objects.

The separator to search for may be any bytes-like object.

bytes.replace(old, new[, count])

bytearray.replace(old, new[, count])

Returns a copy of the sequence with all occurrences of the subsequence old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.

The subsequence to search for and its replacement may be any bytes-like object.

Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.
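The note above can be checked directly. In this sketch (illustrative names), the original bytearray is left untouched and even a replace() that matches nothing hands back a distinct object:

```python
original = bytearray(b"aaa")

# Only the first two occurrences are replaced because count=2.
result = original.replace(b"a", b"b", 2)      # bytearray(b'bba')

# No match at all still yields a new bytearray with equal contents.
unchanged = original.replace(b"x", b"y")
is_new_object = unchanged is not original
```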

bytes.rfind(sub[, start[, end]])

bytearray.rfind(sub[, start[, end]])

Returns the highest index in the sequence where the subsequence sub is found, such that sub is contained within s[start:end]. The optional arguments start and end are interpreted as in slice notation. Returns -1 on failure.

The subsequence to search for may be any bytes-like object or an integer in the range 0 to 255.

Changed in version 3.3: integers in the range of 0 to 255 are also accepted as subsequences.

bytes.rindex(sub[, start[, end]])

bytearray.rindex(sub[, start[, end]])

Like rfind(), but raises ValueError when the subsequence sub is not found.

The subsequence to search for may be any bytes-like object or an integer in the range 0 to 255.

Changed in version 3.3: integers in the range of 0 to 255 are also accepted as subsequences.

bytes.rpartition(sep)

bytearray.rpartition(sep)

Splits the sequence at the last occurrence of sep, and returns a 3-tuple containing the part before the separator, the separator itself or its bytearray copy, and the part after the separator. If the separator is not found, returns a 3-tuple containing two empty bytes or bytearray objects, followed by a copy of the original sequence.

The separator to search for may be any bytes-like object.
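The following sketch contrasts partition() and rpartition() on the same input, including the case where the separator is absent (sample data is arbitrary):

```python
line = b"key=value=more"

first = line.partition(b"=")     # split at the first '='
last = line.rpartition(b"=")     # split at the last '='

# When the separator is missing, the original data ends up at opposite ends.
no_sep_first = b"novalue".partition(b"=")
no_sep_last = b"novalue".rpartition(b"=")

print(first)   # (b'key', b'=', b'value=more')
print(last)    # (b'key=value', b'=', b'more')
```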

bytes.startswith(prefix[, start[, end]])

bytearray.startswith(prefix[, start[, end]])

Returns True if the binary data starts with the specified prefix, otherwise returns False. prefix can also be a tuple of prefixes to look for. With optional start, the test begins at that position. With optional end, the comparison stops at that position.

The prefix(es) to search for may be any bytes-like object.

bytes.translate(table, delete=b'')

bytearray.translate(table, delete=b'')

Returns a copy of the bytes or bytearray object where all bytes occurring in the optional argument delete are removed, and the remaining bytes have been mapped through the given translation table, which must be a bytes object of length 256.

You can use the bytes.maketrans() method to create a translation table.

Set the table argument to None for translations that only delete characters:
    >>> b'read this short text'.translate(None, b'aeiou')
    b'rd ths shrt txt'
Changed in version 3.6: delete is now supported as a keyword argument.

The following methods on bytes and bytearray objects have default behaviours that assume the use of ASCII-compatible binary formats, but can still be used with arbitrary binary data by passing appropriate arguments. Note that all of the bytearray methods in this section do not operate in place, and instead produce new objects.

bytes.center(width[, fillbyte])

bytearray.center(width[, fillbyte])

Returns a copy of the object centered in a sequence of length width. Padding is done using the specified fillbyte (the default is an ASCII space). For bytes objects, the original sequence is returned if width is less than or equal to len(s).

Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.

bytes.ljust(width[, fillbyte])

bytearray.ljust(width[, fillbyte])

Returns a copy of the object left justified in a sequence of length width. Padding is done using the specified fillbyte (the default is an ASCII space). For bytes objects, the original sequence is returned if width is less than or equal to len(s).

Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.

bytes.lstrip([chars])

bytearray.lstrip([chars])

Returns a copy of the sequence with the specified leading bytes removed. The chars argument is a binary sequence specifying the set of byte values to be removed; the name refers to the fact that this method is usually used with ASCII characters. If omitted or None, the chars argument defaults to removing ASCII whitespace. The chars argument is not a prefix; rather, all combinations of its values are stripped:
    >>> b'   spacious   '.lstrip()
    b'spacious   '
    >>> b'www.example.com'.lstrip(b'cmowz.')
    b'example.com'
The binary sequence of byte values to remove may be any bytes-like object.

Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.

bytes.rjust(width[, fillbyte])

bytearray.rjust(width[, fillbyte])

Returns a copy of the object right justified in a sequence of length width. Padding is done using the specified fillbyte (the default is an ASCII space). For bytes objects, the original sequence is returned if width is less than or equal to len(s).

Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.
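The padding methods center(), ljust() and rjust() default to an ASCII space, but passing an explicit fillbyte makes them usable with non-text data. A short sketch with arbitrary values:

```python
centered = b"ab".center(6, b"*")       # b'**ab**'
left = b"ab".ljust(5, b"\x00")         # b'ab\x00\x00\x00'
right = b"ab".rjust(5, b".")           # b'...ab'

# A width no larger than the data leaves the contents as they are.
too_narrow = b"abcdef".center(3)       # b'abcdef'
```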

bytes.rsplit(sep=None, maxsplit=-1)

bytearray.rsplit(sep=None, maxsplit=-1)

Splits the binary sequence into subsequences of the same type, using sep as the delimiter. If maxsplit is given, at most maxsplit splits are done, the rightmost ones. If sep is not specified or None, any subsequence consisting solely of ASCII whitespace is a separator. Except for splitting from the right, rsplit() behaves like split(), which is described in detail below.

bytes.rstrip([chars])

bytearray.rstrip([chars])

Returns a copy of the sequence with the specified trailing bytes removed. The chars argument is a binary sequence specifying the set of byte values to be removed; the name refers to the fact that this method is usually used with ASCII characters. If omitted or None, the chars argument defaults to removing ASCII whitespace. The chars argument is not a suffix; rather, all combinations of its values are stripped:
    >>> b'   spacious   '.rstrip()
    b'   spacious'
    >>> b'mississippi'.rstrip(b'ipz')
    b'mississ'
The binary sequence of byte values to remove may be any bytes-like object.

Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.

bytes.split(sep=None, maxsplit=-1)

bytearray.split(sep=None, maxsplit=-1)

Splits the binary sequence into subsequences of the same type, using sep as the delimiter. If maxsplit is given and non-negative, at most maxsplit splits are done (thus, the list will have at most maxsplit + 1 elements). If maxsplit is not specified or is -1, then there is no limit on the number of splits (all possible splits are made).

If sep is given, consecutive delimiters are not grouped together and are deemed to delimit empty subsequences (for example, b'1,,2'.split(b',') returns [b'1', b'', b'2']). The sep argument may consist of a multibyte sequence (for example, b'1<>2<>3'.split(b'<>') returns [b'1', b'2', b'3']). Splitting an empty sequence with a specified separator returns [b''] or [bytearray(b'')], depending on the type of object being split. The sep argument may be any bytes-like object.

For example:
    >>> b'1,2,3'.split(b',')
    [b'1', b'2', b'3']
    >>> b'1,2,3'.split(b',', maxsplit=1)
    [b'1', b'2,3']
    >>> b'1,2,,3,'.split(b',')
    [b'1', b'2', b'', b'3', b'']
If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive ASCII whitespace are regarded as a single separator, and the result will contain no empty subsequences at the start or end if the sequence has leading or trailing whitespace. Consequently, splitting an empty sequence or a sequence consisting solely of ASCII whitespace without a specified separator returns [].

For example:
    >>> b'1 2 3'.split()
    [b'1', b'2', b'3']
    >>> b'1 2 3'.split(maxsplit=1)
    [b'1', b'2 3']
    >>> b'   1   2   3   '.split()
    [b'1', b'2', b'3']

bytes.strip([chars])

bytearray.strip([chars])

Returns a copy of the sequence with the specified leading and trailing bytes removed. The chars argument is a binary sequence specifying the set of byte values to be removed; the name refers to the fact that this method is usually used with ASCII characters. If omitted or None, the chars argument defaults to removing ASCII whitespace. The chars argument is not a prefix or suffix; rather, all combinations of its values are stripped:
    >>> b'   spacious   '.strip()
    b'spacious'
    >>> b'www.example.com'.strip(b'cmowz.')
    b'example'
The binary sequence of byte values to remove may be any bytes-like object.

Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.

The following methods on bytes and bytearray objects assume the use of ASCII-compatible binary formats and should not be applied to arbitrary binary data. Note that all of the bytearray methods in this section do not operate in place, and instead produce new objects.

bytes.capitalize()

bytearray.capitalize()

Returns a copy of the sequence with each byte interpreted as an ASCII character, and the first byte capitalized and the rest lowercased. Non-ASCII byte values are passed through unchanged.

Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.

bytes.expandtabs(tabsize=8)

bytearray.expandtabs(tabsize=8)

Returns a copy of the sequence where all ASCII tab characters are replaced by one or more ASCII spaces, depending on the current column and the given tab size. Tab positions occur every tabsize bytes (the default is 8, giving tab positions at columns 0, 8, 16 and so on). To expand the sequence, the current column is set to zero and the sequence is examined byte by byte. If the byte is an ASCII tab character (b'\t'), one or more space characters are inserted in the result until the current column is equal to the next tab position. (The tab character itself is not copied.) If the current byte is an ASCII newline (b'\n') or carriage return (b'\r'), it is copied and the current column is reset to zero. Any other byte value is copied unchanged and the current column is incremented by one, regardless of how the byte value is represented when printed:
    >>> b'01\t012\t0123\t01234'.expandtabs()
    b'01      012     0123    01234'
    >>> b'01\t012\t0123\t01234'.expandtabs(4)
    b'01  012 0123    01234'
Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.

bytes.isalnum()

bytearray.isalnum()

Returns True if all bytes in the sequence are alphabetical ASCII characters or ASCII decimal digits and the sequence is not empty, False otherwise. Alphabetic ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'. ASCII decimal digits are those byte values in the sequence b'0123456789'.

For example:
    >>> b'ABCabc1'.isalnum()
    True
    >>> b'ABC abc1'.isalnum()
    False

bytes.isalpha()

bytearray.isalpha()

Returns True if all bytes in the sequence are alphabetic ASCII characters and the sequence is not empty, False otherwise. Alphabetic ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'.

For example:
    >>> b'ABCabc'.isalpha()
    True
    >>> b'ABCabc1'.isalpha()
    False

bytes.isascii()

bytearray.isascii()

Returns True if the sequence is empty or all bytes in the sequence are ASCII, False otherwise. ASCII bytes are in the range 0-0x7F.

New in version 3.7.
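A quick illustration of isascii(), including the empty-sequence case (requires Python 3.7 or later; the sample values are arbitrary):

```python
empty_ok = b"".isascii()            # True: an empty sequence counts as ASCII
all_ascii = b"abc\x7f".isascii()    # True: 0x7F is the highest ASCII byte
not_ascii = b"abc\x80".isascii()    # False: 0x80 is outside the ASCII range
```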

bytes.isdigit()

bytearray.isdigit()

Returns True if all bytes in the sequence are ASCII decimal digits and the sequence is not empty, False otherwise. ASCII decimal digits are those byte values in the sequence b'0123456789'.

For example:
    >>> b'1234'.isdigit()
    True
    >>> b'1.23'.isdigit()
    False

bytes.islower()

bytearray.islower()

Returns True if there is at least one lowercase ASCII character in the sequence and no uppercase ASCII characters, False otherwise.

For example:
    >>> b'hello world'.islower()
    True
    >>> b'Hello world'.islower()
    False
Lowercase ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyz'. Uppercase ASCII characters are those byte values in the sequence b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.

bytes.isspace()

bytearray.isspace()

Returns True if all bytes in the sequence are ASCII whitespace and the sequence is not empty, False otherwise. ASCII whitespace characters are those byte values in the sequence b' \t\n\r\x0b\f' (space, tab, newline, carriage return, vertical tab, form feed).

bytes.istitle()

bytearray.istitle()

Returns True if the sequence is ASCII titlecase and the sequence is not empty, False otherwise. See bytes.title() for more details on the definition of "titlecase".

For example:
    >>> b'Hello World'.istitle()
    True
    >>> b'Hello world'.istitle()
    False

bytes.isupper()

bytearray.isupper()

Returns True if there is at least one uppercase alphabetic ASCII character in the sequence and no lowercase ASCII characters, False otherwise.

For example:
    >>> b'HELLO WORLD'.isupper()
    True
    >>> b'Hello world'.isupper()
    False
Lowercase ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyz'. Uppercase ASCII characters are those byte values in the sequence b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.

bytes.lower()

bytearray.lower()

Returns a copy of the sequence with all the uppercase ASCII characters converted to their corresponding lowercase counterparts.

For example:
    >>> b'Hello World'.lower()
    b'hello world'
Lowercase ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyz'. Uppercase ASCII characters are those byte values in the sequence b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.

Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.

bytes.splitlines(keepends=False)

bytearray.splitlines(keepends=False)

Returns a list of the lines in the binary sequence, breaking at ASCII line boundaries. This method uses the universal newlines approach to splitting lines. Line breaks are not included in the resulting list unless keepends is given and true.

For example:
    >>> b'ab c\n\nde fg\rkl\r\n'.splitlines()
    [b'ab c', b'', b'de fg', b'kl']
    >>> b'ab c\n\nde fg\rkl\r\n'.splitlines(keepends=True)
    [b'ab c\n', b'\n', b'de fg\r', b'kl\r\n']
Unlike split() when a delimiter sep is given, this method returns an empty list for the empty string, and a terminal line break does not result in an extra line:
    >>> b"".split(b'\n'), b"Two lines\n".split(b'\n')
    ([b''], [b'Two lines', b''])
    >>> b"".splitlines(), b"One line\n".splitlines()
    ([], [b'One line'])

bytes.swapcase()

bytearray.swapcase()

Returns a copy of the sequence with all the lowercase ASCII characters converted to their corresponding uppercase counterparts and vice versa.

For example:
    >>> b'Hello World'.swapcase()
    b'hELLO wORLD'
Lowercase ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyz'. Uppercase ASCII characters are those byte values in the sequence b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.

Unlike str.swapcase(), it is always the case that bin.swapcase().swapcase() == bin for the binary versions. Case conversions are symmetrical in ASCII, even though that is not generally true for arbitrary Unicode code points.
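The symmetry claim can be illustrated by comparing a bytes round trip with a str round trip. The German sharp s used here is just one example of a character whose case mapping is not reversible:

```python
data = b"Hello World"
bytes_round_trip = data.swapcase().swapcase()   # back to b'Hello World'

text = "\u00df"                        # LATIN SMALL LETTER SHARP S
once = text.swapcase()                 # uppercases to 'SS'
str_round_trip = once.swapcase()       # 'ss', not the original character
```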

Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.

bytes.title()

bytearray.title()

Returns a titlecased version of the binary sequence where words start with an uppercase ASCII character and the remaining characters are lowercase. Uncased byte values are left unmodified.

For example:
    >>> b'Hello world'.title()
    b'Hello World'
Lowercase ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyz'. Uppercase ASCII characters are those byte values in the sequence b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'. All other byte values are uncased.

The algorithm uses a simple language-independent definition of a word as a group of consecutive letters. The definition works in many contexts, but it means that apostrophes in contractions and possessives form word boundaries, which may not be the desired result:
    >>> b"they're bill's friends from the UK".title()
    b"They'Re Bill'S Friends From The Uk"
A workaround for apostrophes can be constructed using regular expressions:
    >>> import re
    >>> def titlecase(s):
    ...     return re.sub(rb"[A-Za-z]+('[A-Za-z]+)?",
    ...                   lambda mo: mo.group(0)[0:1].upper() +
    ...                              mo.group(0)[1:].lower(),
    ...                   s)
    ...
    >>> titlecase(b"they're bill's friends.")
    b"They're Bill's Friends."
Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.

bytes.upper()

bytearray.upper()

Returns a copy of the sequence with all the lowercase ASCII characters converted to their corresponding uppercase counterparts.

For example:
    >>> b'Hello World'.upper()
    b'HELLO WORLD'
Lowercase ASCII characters are those byte values in the sequence b'abcdefghijklmnopqrstuvwxyz'. Uppercase ASCII characters are those byte values in the sequence b'ABCDEFGHIJKLMNOPQRSTUVWXYZ'.

Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.

bytes.zfill(width)

bytearray.zfill(width)

Returns a copy of the sequence left filled with ASCII b'0' digits to make a sequence of length width. A leading sign prefix (b'+' or b'-') is handled by inserting the padding after the sign character rather than before it. For bytes objects, the original sequence is returned if width is less than or equal to len(seq).

For example:
    >>> b"42".zfill(5)
    b'00042'
    >>> b"-42".zfill(5)
    b'-0042'
Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.

printf-style bytes formatting

Note

The formatting operations described here exhibit a variety of quirks that lead to a number of common errors (such as failing to display tuples and dictionaries correctly). If the value being printed may be a tuple or dictionary, wrap it in a tuple.

Bytes objects (bytes/bytearray) have one unique built-in operation: the % operator (modulo). This is also known as the bytes formatting or interpolation operator. Given format % values (where format is a bytes object), % conversion specifications in format are replaced with zero or more elements of values. The effect is similar to using sprintf() in the C language.

If format requires a single argument, values may be a single non-tuple object. Otherwise, values must be a tuple with exactly the number of items specified by the format bytes object, or a single mapping object (for example, a dictionary).
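The tuple pitfall mentioned in the note can be reproduced in a few lines. In this sketch (illustrative names), a bare tuple is taken as the argument list for the format, so it has to be wrapped in a one-element tuple to be formatted as a single value:

```python
pair = (1, 2)

# The format has one conversion, but the bare tuple supplies two arguments.
try:
    b"%a" % pair
except TypeError:
    bare_tuple_failed = True
else:
    bare_tuple_failed = False

# Wrapping the value in a one-element tuple formats it as a single object.
wrapped = b"%a" % (pair,)       # b'(1, 2)'
```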

A conversion specifier contains two or more characters and has the following components, which must occur in this order:

  1. The '%' character, which marks the start of the specifier.
  2. Mapping key (optional), consisting of a parenthesised sequence of characters (for example, (somename)).
  3. Conversion flags (optional), which affect the result of some conversion types.
  4. Minimum field width (optional). If specified as an '*' (asterisk), the actual width is read from the next element of the tuple in values, and the object to convert comes after the minimum field width and optional precision.
  5. Precision (optional), given as a '.' (dot) followed by the precision. If specified as '*' (an asterisk), the actual precision is read from the next element of the tuple in values, and the value to convert comes after the precision.
  6. Length modifier (optional).
  7. Conversion type.
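The components listed above can be combined as in the following sketch. The values are arbitrary, and the flags used ('0', '-', '+', '#') are the standard printf-style ones:

```python
zero_padded = b"%05d" % 42              # '0' flag with width 5: b'00042'
left_adjusted = b"%-5d|" % 42           # '-' flag with width 5: b'42   |'
signed = b"%+d" % 42                    # '+' flag forces a sign: b'+42'
alt_hex = b"%#x" % 255                  # '#' alternate form: b'0xff'
alt_oct = b"%#o" % 8                    # '#' alternate form: b'0o10'

# '*' reads the width or the precision from the values tuple.
star_width = b"%*d" % (5, 42)           # b'   42'
star_precision = b"%.*f" % (2, 3.14159) # b'3.14'

# For the 'b' conversion, the precision truncates the output.
truncated = b"%.3b" % b"abcdef"         # b'abc'
```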

When the right argument is a dictionary (or other mapping type), the formats in the bytes object must include a parenthesised mapping key into that dictionary, inserted immediately after the '%' character. The mapping key selects the value to be formatted from the mapping. For example:

>>> print(b'%(language)s has %(number)03d quote types.' %
...       {b'language': b"Python", b"number": 2})
b'Python has 002 quote types.'

In this case no * specifiers may occur in a format (since they require a sequential parameter list).

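The conversion flag characters are:

    Flag   Meaning
    '#'    The value conversion will use the "alternate form" (where defined below).
    '0'    The conversion will be zero padded for numeric values.
    '-'    The converted value is left adjusted (overrides the '0' conversion if both are given).
    ' '    (a space) A blank is left before a positive number (or empty string) produced by a signed conversion.
    '+'    A sign character ('+' or '-') precedes the conversion (overrides a "space" flag).

A length modifier (h, l, or L) may be present, but is ignored as it is not necessary for Python, so for example %ld is identical to %d.

The conversion types are:

    Conversion   Meaning                                                            Notes
    'd'          Signed integer decimal.
    'i'          Signed integer decimal.
    'o'          Signed octal value.                                                (1)
    'u'          Obsolete type; it is identical to 'd'.                             (8)
    'x'          Signed hexadecimal (lowercase).                                    (2)
    'X'          Signed hexadecimal (uppercase).                                    (2)
    'e'          Floating point exponential format (lowercase).                     (3)
    'E'          Floating point exponential format (uppercase).                     (3)
    'f'          Floating point decimal format.                                     (3)
    'F'          Floating point decimal format.                                     (3)
    'g'          Floating point format. Uses lowercase exponential format if the
                 exponent is less than -4 or not less than the precision,
                 decimal format otherwise.                                          (4)
    'G'          Floating point format. Uses uppercase exponential format if the
                 exponent is less than -4 or not less than the precision,
                 decimal format otherwise.                                          (4)
    'c'          Single byte (accepts integer or single byte objects).
    'b'          Bytes (any object that follows the buffer protocol or has
                 __bytes__()).                                                      (5)
    's'          Alias for 'b'; should only be used for Python 2/3 code bases.      (6)
    'a'          Bytes (converts any Python object using
                 repr(obj).encode('ascii', 'backslashreplace')).                    (5)
    'r'          Alias for 'a'; should only be used for Python 2/3 code bases.      (7)
    '%'          No argument is converted; results in a '%' character in the
                 result.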

Notes:

  1. The alternate form causes a leading octal specifier ('0o') to be inserted before the first digit.
  2. The alternate form causes a leading '0x' or '0X' (depending on whether the 'x' or 'X' format was used) to be inserted before the first digit.
  3. The alternate form causes the result to always contain a decimal point, even if no digits follow it.

    The precision determines the number of digits after the decimal point and defaults to 6.

  4. The alternate form causes the result to always contain a decimal point, and trailing zeroes are not removed as they would otherwise be.

    The precision determines the number of significant digits before and after the decimal point and defaults to 6.

  5. If precision is N, the output is truncated to N characters.
  6. b'%s' is deprecated, but will not be removed during the 3.x series.
  7. b'%r' is deprecated, but will not be removed during the 3.x series.
  8. See PEP 237.

Note

The bytearray version of this method does not operate in place: it always produces a new object, even if no changes were made.

See also

PEP 461: Adding % formatting to bytes and bytearray

New in version 3.5.
