File operations with Python

Time:2021-9-14
What is a file

A file is a named location in the system's storage that holds information for later access. It provides persistent storage in non-volatile memory, such as a hard disk. To read or write a file, we first need to open it; after the operation, we need to close it to release the system resources tied to the file. A file operation therefore consists of the following steps:

  • Open file
  • Read or write
  • Close file

Open file

Python uses the built-in open() function to open a file. It returns a file object, also known as a handle.

f = open("test.txt")  # a file in the current directory
f = open("C:/Python33/readme.txt")  # full path

When opening a file, we can also specify the mode. To read the file, use f = open("test.txt", 'r'); this is the default. To write to it, use f = open("test.txt", 'w'). To append, use f = open("test.txt", 'a'), where a stands for append. The difference between the two is that write mode clears any existing content before writing, whereas append mode keeps the existing content and writes after it. We also need to decide whether to open the file in text mode or binary mode.
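The difference between write mode and append mode can be checked with a small sketch. The helper name demo_write_vs_append and the scratch file in a temporary directory are hypothetical, chosen only for illustration:

```python
import os
import tempfile


def demo_write_vs_append():
    """Return the file content after a 'w' open and after an 'a' open."""
    # Hypothetical scratch file inside a temporary directory
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "test.txt")

        with open(path, "w") as f:   # creates the file
            f.write("first\n")

        with open(path, "w") as f:   # 'w' clears the existing content
            f.write("second\n")
        with open(path, "r") as f:
            after_write = f.read()

        with open(path, "a") as f:   # 'a' keeps the content and writes after it
            f.write("third\n")
        with open(path, "r") as f:
            after_append = f.read()

    return after_write, after_append


after_write, after_append = demo_write_vs_append()
print(repr(after_write))    # 'second\n'
print(repr(after_append))   # 'second\nthird\n'
```

The first line written is gone after the second 'w' open, while the 'a' open leaves the existing line in place.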

The difference between text mode and binary mode
In text mode, when reading, the operating system's line ending ('\n' on UNIX, '\r\n' on Windows) is converted to Python's default line break '\n', and when writing, '\n' is converted back to the operating system's line ending. No conversion happens in binary mode. This conversion is harmless for text files, but it corrupts binary data such as image or exe files. Files of that kind should therefore be opened in binary mode for both reading and writing.
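As a rough illustration of this conversion, the sketch below forces Windows-style line endings through the newline parameter of open(), so that it behaves the same on every platform. The helper name demo_newline_translation and the file name are hypothetical:

```python
import os
import tempfile


def demo_newline_translation():
    """Return the raw bytes and the text-mode view of the same file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "newlines.txt")

        # newline="\r\n" makes text mode write "\r\n" for every "\n",
        # which is what happens by default on Windows.
        with open(path, "w", newline="\r\n") as f:
            f.write("a\nb\n")

        with open(path, "rb") as f:   # binary mode: bytes exactly as stored
            raw = f.read()

        with open(path, "r") as f:    # text mode: "\r\n" is read back as "\n"
            text = f.read()

    return raw, text


raw, text = demo_newline_translation()
print(raw)          # b'a\r\nb\r\n'
print(repr(text))   # 'a\nb\n'
```

Binary mode shows the two extra '\r' bytes that text mode hides, which is why the same translation would damage an image file.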

Common modes

r Text mode, reading
rb Binary mode, reading
w Text mode, writing
wb Binary mode, writing
a Text mode, append
ab Binary mode, append
+ Readable and writable
f = open("test.txt", 'r')   # read mode
f = open("test.txt", 'w')   # write mode
f = open("img.bmp", 'r+')   # readable and writable; the file must already exist
f = open("img.bmp", 'w+')   # readable and writable; creates or clears the file
f = open("img.bmp", 'rb')   # binary read

How to close a file

When we are done with a file, it is best to close it explicitly. Python's garbage collector does eventually clean up unused objects, but it is not good practice to rely on it for closing files.
The simplest way is:

f = open("app.log", 'r')
do_something()
f.close()

However, this approach is not safe: if an exception occurs in the operations between open and close, the program may exit before the statement that closes the file is executed.
Therefore, you can use a try...finally statement to handle this:

# Open outside the try block: if open() itself fails, there is no file to close
f = open('app.log', 'r')
try:
    do_something()
finally:
    f.close()

The instruction to close the file is executed regardless of whether an exception occurs.
But the approach recommended by the official Python documentation is:

with open('app.log', 'r') as f:
   do_something()

With this form, we do not need to call the close() method: the with statement closes the file when its block ends, whether or not an exception occurred. The with statement relies on what is called a context manager. We can set the underlying mechanism aside for now; it is enough to know that the file is closed automatically when the with statement is used. This is the usage recommended by the official documentation, and the statement is also simple to write.
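To see that the file really is closed after an exception, one can inspect the closed attribute of the file object. This is a minimal sketch; the helper name demo_with_closes_on_error and the simulated ValueError are assumptions made for the example:

```python
import os
import tempfile


def demo_with_closes_on_error():
    """Return f.closed as seen inside the with block and after an exception."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.log")   # hypothetical log file
        with open(path, "w") as f:
            f.write("entry\n")

        try:
            with open(path, "r") as f:
                closed_inside = f.closed           # file is still open here
                raise ValueError("simulated failure")
        except ValueError:
            pass

        closed_after = f.closed                    # closed despite the exception

    return closed_inside, closed_after


print(demo_with_closes_on_error())   # (False, True)
```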

File operations
Write file

Two methods are covered here:

1. write() method
This method takes a single string as its argument, for example:

lines = ['line1', 'line2']
with open('filename.txt', 'w') as f:
    s = ''
    for data in lines:
        s += data
        s += '\n'
    f.write(s)

A better way is to use the join function (note that this version writes no newline after the last line):

lines = ['line1', 'line2']
with open('filename.txt', 'w') as f:
    f.write('\n'.join(lines))

2. writelines() method
The argument is an iterable of strings. writelines() does not add line breaks itself, so each string needs its own '\n', for example:

lines = ['line1', 'line2']
with open('filename.txt', 'w') as f:
    new_lines = []
    for data in lines:
        new_lines.append(data+'\n')
    f.writelines(new_lines)
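The loop above appends '\n' to each string because writelines() writes the strings exactly as given. A minimal sketch of the difference follows; the helper name demo_writelines and the temporary file location are hypothetical:

```python
import os
import tempfile


def demo_writelines():
    """Return the file content without and with explicit newlines."""
    lines = ['line1', 'line2']
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "filename.txt")

        with open(path, "w") as f:
            f.writelines(lines)                 # strings are written back to back
        with open(path) as f:
            joined = f.read()

        with open(path, "w") as f:
            f.writelines([data + "\n" for data in lines])
        with open(path) as f:
            separated = f.read()

    return joined, separated


joined, separated = demo_writelines()
print(repr(joined))      # 'line1line2'
print(repr(separated))   # 'line1\nline2\n'
```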

For more concise code, you can use a generator expression:

lines = ['line1', 'line2']
with open('filename.txt', 'w') as f:
    f.writelines("%s\n" % line for line in lines)

Read file

Here are four approaches. The examples below assume the file has already been opened as f:
1. read() method

result = f.read()

This returns the file content as a str (in text mode). The method also accepts a numeric argument specifying how much to read, counted in characters in text mode and in bytes in binary mode. If it is omitted or negative, the entire remaining content of the file is returned.
2. readline() method

result = f.readline()

This returns a single line as a string. Each further call returns the next line, and an empty string signals the end of the file.
3. readlines() method

result = f.readlines()

This returns a list of all lines. Because the whole file is loaded into memory at once, it is not recommended for large files.
4. Iterating over the file object directly

for line in f:
    print(line, end='')  # the line keeps its own trailing '\n'
    do_something()

This approach saves memory, is fast, and keeps the code simple. If a list of all lines is needed, either of the following works:

result = f.readlines()
------------------------
result = list(f)

Both forms return the same result.
Overall, the fourth approach is the recommended one.
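The four approaches can be compared side by side on a small file. This is only a sketch: the helper name demo_read_methods and the sample content are made up for the example.

```python
import os
import tempfile


def demo_read_methods():
    """Exercise read(n), readline(), readlines() and iteration on one file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")   # hypothetical sample file
        with open(path, "w") as f:
            f.write("line1\nline2\nline3\n")

        with open(path) as f:
            first_four = f.read(4)        # at most 4 characters
            rest_of_line = f.readline()   # the rest of the current line
            remaining = f.readlines()     # all remaining lines as a list
            at_eof = f.read()             # nothing left to read

        with open(path) as f:
            # Iterating yields one line at a time, newline included
            stripped = [line.rstrip("\n") for line in f]

    return first_four, rest_of_line, remaining, at_eof, stripped


first_four, rest_of_line, remaining, at_eof, stripped = demo_read_methods()
print(repr(first_four))     # 'line'
print(repr(rest_of_line))   # '1\n'
print(remaining)            # ['line2\n', 'line3\n']
print(repr(at_eof))         # ''
print(stripped)             # ['line1', 'line2', 'line3']
```

Every call continues from the current position in the file, which is why readlines() here returns only the last two lines.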

How to handle large files

The main problem with large files is memory consumption: we cannot read the entire content into memory at once. The recommended practice is:

with open("log.txt") as f:
    for line in f:
        do_something_with(line)

Reading line by line keeps memory usage low, since only the current line (plus a small buffer) is held at a time, and it is also fast. With the with statement, the file object is closed at the end whether or not an exception occurred inside the block. This is therefore the preferred approach for processing large files.
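Line-by-line iteration assumes the file has line breaks. For large binary files, one possible variation, not covered above, is to call read() with a size argument in a loop. The generator name iter_chunks and the default chunk size below are assumptions, not a fixed recipe:

```python
import os
import tempfile


def iter_chunks(path, chunk_size=64 * 1024):
    """Yield successive blocks of at most chunk_size bytes from a file."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:      # an empty bytes object means end of file
                break
            yield chunk


def demo_chunks():
    """Write 10 bytes to a scratch file and return the size of each chunk."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")   # hypothetical binary file
        with open(path, "wb") as f:
            f.write(b"x" * 10)
        return [len(chunk) for chunk in iter_chunks(path, chunk_size=4)]


print(demo_chunks())   # [4, 4, 2]
```

Only one chunk is held in memory at a time, so the memory cost is set by chunk_size rather than by the size of the file.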