When inserting a large amount of data into MySQL today, I suddenly think of pymysql and executemany method. Which method is faster? How fast?
The old rule, test code first:
class IN_sql(): def __init__(self): Print ('initializing... ') self.connDB = pymysql.connect(host=DB_HOST, user=DB_USER, passwd=DB_PASSWD, db=DB_DB, charset='utf8mb4') self.connDB.ping() self.cu=self.connDB.cursor() def workOn(self): L= with open("./1W.txt","r",encoding="utf-8")as f: while True: res=f.readline().replace("\r","").replace("\n","") if not res: break L.append(res) #Count the time taken to insert start = time.time() sql = "insert ignore into article(an) values(%s);" #Execute execute code for l in L: self.cu.execute(sql, l) #Executemany execute code #self.cu.executemany(sql, tuple(L)) self.connDB.commit() Print ('total time (seconds): '+ str (round)（ time.time () - start, 2))) self.cu.close() self.connDB.close() if __name__ == '__main__': insert=IN_sql() insert.workOn()
|Method order time (s)||1W||10W||100W|
You can see the obvious difference. If you need to insert the database in batch, it is better to use the executemany method, which is not an order of magnitude with execute!!!
==Recently, a friend asked, “there are many differences between execute any and native SQL statements?”?==
Here I have carried on the test again:
def workOn(self): L= with open("./100W.txt","r",encoding="utf-8")as f: while True: res=f.readline().replace("\r","").replace("\n","") if not res: break L.append(res) #Count the time taken to insert start = time.time() print("start time:" + time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(start))) sql = "insert into article(an) values " for l in L: sql += "('%s'), " % l sql = sql.rstrip(', ') + ';' self.cu.execute(sql) self.connDB.commit() end=time.time() print("end time:"+time.strftime("%Y-%m-%d %H:%M:%S",time.localtime(end))) Print ('total time (seconds): '+ str (round (end - start, 3)))
The final result: 20.176s
==Why is executing native SQL slower than executing any? In theory, native SQL should be the fastest. = =
According to the source code query, executemany actually combines all parameters into a SQL statement for execution (= = insert into article (an) values(), (), (), (), (), (), () =). The optimization is mainly in string splicing.
Finally, the test results show that: the string splicing of my own code took about 12s, writing SQL statements took about 8s, while the execution any string splicing took only about 2S.
Therefore, when you encounter a large number of data writing in the future, try to use the executemany method!