Python 3 standard library: io (text, binary, and raw stream I/O tools)

Time:2020-11-26

1. io: text, binary, and raw stream I/O tools

The io module implements the classes behind the interpreter's built-in open() for file-based input and output operations. The classes are decomposed so that they can be recombined for other purposes, for example to support writing Unicode data to a network socket.

1.1 In-memory streams

StringIO provides a convenient way to work with text in memory using the file API (read(), write(), and so on). In some cases, using StringIO to build large strings can perform better than other string concatenation techniques. In-memory stream buffers are also useful for testing, where writing to a real file on disk may slow down the test suite.

Here are some standard examples of using StringIO buffers.
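As a minimal sketch of the testing use case, the snippet below passes a StringIO buffer to a function that expects a writable file-like object. The function write_report() and its data are hypothetical and exist only for illustration.

```python
import io


def write_report(rows, stream):
    """Hypothetical function under test: writes one 'name: value' line per row."""
    for name, value in rows:
        stream.write(f'{name}: {value}\n')


# Use an in-memory buffer instead of a real file on disk.
buffer = io.StringIO()
write_report([('apples', 3), ('pears', 5)], buffer)

# Everything written so far is available as a single string.
captured = buffer.getvalue()
print(captured, end='')
```

Because nothing touches the file system, a test written this way needs no temporary files or cleanup.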

import io

# Writing to a buffer
output = io.StringIO()
output.write('This goes into the buffer. ')
print('And so does this.', file=output)

# Retrieve the value written
print(output.getvalue())

output.close()  # discard buffer memory

# Initialize a read buffer
input = io.StringIO('Initial value for read buffer')

# Read from the buffer
print(input.read())

This example uses read(), but the readline() and readlines() methods are also available. The StringIO class also provides a seek() method for jumping around in the buffer while reading, which is useful for rewinding when a look-ahead parsing algorithm is used.

To work with raw bytes instead of Unicode text, use BytesIO.
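The rewinding idea can be sketched as follows. The sample text and variable names are assumptions made for illustration: the position is saved with tell(), one line is read ahead, and seek() returns to the saved position so the same line can be read again.

```python
import io

stream = io.StringIO('first line\nsecond line\n')

# Remember where we are before looking ahead.
position = stream.tell()
peeked = stream.readline()

# Rewind to the saved position; the next read returns the same line.
stream.seek(position)
again = stream.readline()

# readlines() returns whatever is left as a list of lines.
rest = stream.readlines()

print(repr(peeked), repr(again), rest)
```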

import io

# Writing to a buffer
output = io.BytesIO()
output.write('This goes into the buffer. '.encode('utf-8'))
output.write('ÁÇÊ'.encode('utf-8'))

# Retrieve the value written
print(output.getvalue())

output.close()  # discard buffer memory

# Initialize a read buffer
input = io.BytesIO(b'Initial value for read buffer')

# Read from the buffer
print(input.read())

The values written to a BytesIO instance must be bytes, not str.

1.2 Wrapping byte streams for text data

Raw byte streams, such as sockets, can be wrapped with a layer that handles string encoding and decoding, making it easier to work with text data. The TextIOWrapper class supports both reading and writing. The write_through argument disables buffering in the wrapper, so all data written to it is passed immediately to the underlying buffer.
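A short sketch of what happens when this rule is broken: writing a str to a BytesIO buffer raises TypeError, while the encoded bytes are accepted. The variable names are illustrative only.

```python
import io

buffer = io.BytesIO()

# Writing a str is rejected because BytesIO only accepts bytes-like objects.
try:
    buffer.write('text')
except TypeError as err:
    error_message = str(err)
else:
    error_message = None

# Encoding the text first produces bytes, which the buffer accepts.
buffer.write('text'.encode('utf-8'))

print(error_message)
print(buffer.getvalue())
```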

import io

# Writing to a buffer
output = io.BytesIO()
wrapper = io.TextIOWrapper(
    output,
    encoding='utf-8',
    write_through=True,
)
wrapper.write('This goes into the buffer. ')
wrapper.write('ÁÇÊ')

# Retrieve the value written
print(output.getvalue())

output.close()  # discard buffer memory

# Initialize a read buffer
input = io.BytesIO(
    b'Initial value for read buffer with unicode characters ' +
    'ÁÇÊ'.encode('utf-8')
)
wrapper = io.TextIOWrapper(input, encoding='utf-8')

# Read from the buffer
print(wrapper.read())

This example uses a BytesIO instance as the stream. The examples for bz2, http.server, and subprocess show how to use TextIOWrapper with other types of file-like objects.
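To illustrate the effect of write_through, the sketch below compares a default wrapper with one created using write_through=True. The variable names are illustrative. On CPython a small write to the default wrapper typically stays pending inside the wrapper until flush() is called, although that detail may vary between implementations.

```python
import io

# Default wrapper: written text may be held in the wrapper until flushed.
buffered_target = io.BytesIO()
buffered = io.TextIOWrapper(buffered_target, encoding='utf-8')
buffered.write('pending')
before_flush = buffered_target.getvalue()
buffered.flush()
after_flush = buffered_target.getvalue()

# write_through=True: each write is handed to the underlying buffer at once.
direct_target = io.BytesIO()
direct = io.TextIOWrapper(
    direct_target,
    encoding='utf-8',
    write_through=True,
)
direct.write('immediate')
without_flush = direct_target.getvalue()

print(before_flush, after_flush, without_flush)
```

With write_through enabled, no explicit flush() is needed before the data is visible in the underlying BytesIO.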