MySQL challenge: establishing 100000 connections

Time:2021-1-24

The purpose of this paper is to explore a method of establishing 10W connections on a MySQL server. What we want to establish is a connection that can execute the query, not 10W idle connections.

You may ask, does my MySQL server really need 10W connection? I have seen many different deployment schemes, such as using connection pool, putting 1000 connections in each application’s connection pool, and deploying 100 such application servers. There are also some very bad practices that use the technology of “slow query, reconnecting and retrying”. This can cause a snowball effect, which can lead to thousands of connections to be made in a few seconds.

So I decided to set a “small goal” to see if it could be achieved.

Preparation stage

Let’s look at the hardware first. The server is made up of packet.net (a cloud service provider) with the following configuration:

instance size: c2.medium.x86
Physical Cores @ 2.2 GHz
(1 X AMD EPYC 7401P)
Memory: 64 GB of ECC RAM
Storage : INTEL® SSD DC S4500, 480GB

We need five such servers, one for MySQL server and the other four for client. MySQL server uses percona server’s MySQL 8.0.13-4 with thread pool plug-in, which needs to support thousands of connections.

Initialize server configuration

Network settings:

- { name: 'net.core.somaxconn', value: 32768 } 
- { name: 'net.core.rmem_max', value: 134217728 } 
- { name: 'net.core.wmem_max', value: 134217728 } 
- { name: 'net.ipv4.tcp_rmem', value: '4096 87380 134217728' } 
- { name: 'net.ipv4.tcp_wmem', value: '4096 87380 134217728' } 
- { name: 'net.core.netdev_max_backlog', value: 300000 } 
- { name: 'net.ipv4.tcp_moderate_rcvbuf', value: 1 } 
- { name: 'net.ipv4.tcp_no_metrics_save', value: 1 } 
- { name: 'net.ipv4.tcp_congestion_control', value: 'htcp' }
- { name: 'net.ipv4.tcp_mtu_probing', value: 1 }
- { name: 'net.ipv4.tcp_timestamps', value: 0 }
- { name: 'net.ipv4.tcp_sack', value: 0 }
- { name: 'net.ipv4.tcp_syncookies', value: 1 }
- { name: 'net.ipv4.tcp_max_syn_backlog', value: 4096 }
- { name: 'net.ipv4.tcp_mem', value: '50576   64768 98152' }
- { name: 'net.ipv4.ip_local_port_range', value: '4000 65000' }
- { name: 'net.ipv4.netdev_max_backlog', value: 2500 }
- { name: 'net.ipv4.tcp_tw_reuse', value: 1 }
- { name: 'net.ipv4.tcp_fin_timeout', value: 5 }

System limit settings:

[Service]
LimitNOFILE=1000000
LimitNPROC=500000

Corresponding MySQL configuration( my.cnf File:

back_log=3500
max_connections=110000

The client uses sysbench version 0.5 instead of 1.0. X. We will explain the specific reasons later.

Execute command: sysbench — Test = sysbench / tests / db/ select.lua –mysql-host=139.178.82.47 –mysql-user=sbtest–mysql-password=sbtest –oltp-tables-count=10 –report-interval=1 –num-threads=10000 –max-time=300 –max-requests=0 –oltp-table-size=10000000 –rand-type=uniform –rand-init=on run

Step one, 10000 connections

This step is very simple, we don’t need to make too many adjustments to achieve it. In this step, only one machine is needed as the client, but the client may have the following errors:

FATAL: error 2004: Can't create TCP/IP socket (24)

This is due to the limitation of the number of open files, which limits the number of TCP / IP sockets and can be adjusted on the client

ulimit -n100000

Now let’s look at the performance:

[  26s] threads: 10000, tps: 0.00, reads: 33367.48, writes: 0.00, response time: 3681.42ms (95%), errors: 0.00, reconnects:  0.00
[  27s] threads: 10000, tps: 0.00, reads: 33289.74, writes: 0.00, response time: 3690.25ms (95%), errors: 0.00, reconnects:  0.00

Second, 25000 connections

In this step, an error will occur on the MySQL server:

Can't create a new thread (errno 11); if you are not out of available memory, you can consult the manualfor a possible OS-dependent bug

The solution to this problem can be found in this link:

https://www.percona.com/blog/2013/02/04/cant_create_thread_errno_11/

However, this method does not apply to our current situation, because we have set all the restrictions to the highest level:

cat /proc/`pidof mysqld`/limits 
Limit                     Soft Limit Hard Limit           Units 
Max cpu time              unlimited  unlimited            seconds 
Max file size             unlimited  unlimited            bytes 
Max data size             unlimited  unlimited            bytes 
Max stack size            8388608    unlimited            bytes 
Max core file size        0          unlimited            bytes 
Max resident set          unlimited  unlimited            bytes
Max processes             500000     500000               processes
Max open files            1000000    1000000              files
Max locked memory         16777216   16777216             bytes
Max address space         unlimited  unlimited            bytes
Max file locks            unlimited  unlimited            locks
Max pending signals       255051     255051               signals
Max msgqueue size         819200     819200               bytes
Max nice priority         0          0
Max realtime priority     0          0
Max realtime timeout      unlimited unlimited            us

This is why we first chose the wired pool servicehttps://www.percona.com/doc/percona-server/8.0/performance/threadpool.html

stay my.cnf Add the following line to the file and restart the service

thread_handling=pool-of-threads

Look at the results

[   7s] threads: 25000, tps: 0.00, reads: 33332.57, writes: 0.00, response time: 974.56ms (95%), errors: 0.00, reconnects:  0.00
[   8s] threads: 25000, tps: 0.00, reads: 33187.01, writes: 0.00, response time: 979.24ms (95%), errors: 0.00, reconnects:  0.00

The throughput is the same, but 95% of the response is reduced from 3690 MS to 979 Ms.

Third, 50000 connections

Here, we have the biggest challenge. First, when trying to establish a 5W connection, sysbench reported an error:

FATAL: error 2003: Can't connect to MySQL server on '139.178.82.47' (99)

Error (99) the error is mysterious, which means that the specified address cannot be assigned. This problem is caused by the limitation of the number of ports that an application can open

cat /proc/sys/net/ipv4/ip_local_port_range : 32768 60999

This means that we only have 28231 ports available (60999 minus 32768), or the maximum number of TCP connections you can establish to the specified IP address. You can expand this range on servers and clients

echo 4000 65000 > /proc/sys/net/ipv4/ip_local_port_range

In this way, we can establish 61000 connections, which is close to the maximum limit of one IP available port (65535). The key point here is that if we want to achieve 10W connection, we need to assign more IP addresses to the MySQL server, so I assigned two IP addresses to the MySQL server.

After solving the problem of the number of ports, we encountered a new problem

sysbench 0.5:  multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 50000
FATAL: pthread_create() for thread #32352 failed. errno = 12 (Cannot allocate memory)

This problem is caused by the memory allocation problem of sysbench. The memory allocated by sysbench can only create 32351 connections, which is more serious in version 1.0. X.

Limitation of sysbench 1.0. X

Sysbench version 1.0. X uses different Lua compilers, which makes it impossible for us to create more than 4000 connections. So it seems that sysbench reached the limit earlier than percona server, so we need to use more clients. If each client has 32351 connections at most, we need to use at least 4 clients to achieve the goal of 10W connection.

In order to achieve 5W connection, we use two machines as clients, and each machine starts 25000 threads. The results are as follows

[  29s] threads: 25000, tps: 0.00, reads: 16794.09, writes: 0.00, response time: 1799.63ms (95%), errors: 0.00, reconnects:  0.00
[  30s] threads: 25000, tps: 0.00, reads: 16491.03, writes: 0.00, response time: 1800.70ms (95%), errors: 0.00, reconnects:  0.00

The throughput is similar to the previous step (the total TPS is 16794 * 2 = 33588), but the performance is reduced, and the response time of 95% is doubled. This is expected because we have doubled the number of connections compared to the previous step.

Step four, 75000 connections

In this step, we add another server as the client, and each client runs 25000 threads. The results are as follows

[ 157s] threads: 25000, tps: 0.00, reads: 11633.87, writes: 0.00, response time: 2651.76ms (95%), errors: 0.00, reconnects:  0.00
[ 158s] threads: 25000, tps: 0.00, reads: 10783.09, writes: 0.00, response time: 2601.44ms (95%), errors: 0.00, reconnects:  0.00

Step five, 100000 connections

Finally arrived, this step is also no difficulty, just need to open a client, also run 25000 threads. The results are as follows

[ 101s] threads: 25000, tps: 0.00, reads: 8033.83, writes: 0.00, response time: 3320.21ms (95%), errors: 0.00, reconnects:  0.00
[ 102s] threads: 25000, tps: 0.00, reads: 8065.02, writes: 0.00, response time: 3405.77ms (95%), errors: 0.00, reconnects:  0.00

The throughput remains at the level of 32260 (8065 * 4), and 95% of the response time is 3405ms.

Here is a very important thing, you must have found: in the case of thread, the response speed of 10W connections is even better than that of 1W connections without thread pool.Thread pool enables percona server to manage resources more effectively and provide better response speed.

conclusion

The number of 10W connections can be achieved, and can be more. There are three important components to achieve this goal:

  1. Thread pool of percona server

  2. Correct network settings

  3. Configure multiple IP addresses for MySQL server (limit 65535 connections per IP)

appendix

Finally, paste the complete my.cnf file

[mysqld] 
datadir {{ mysqldir }} 
ssl=0 
skip-log-bin 
log-error=error.log 
# Disabling symbolic-links is recommended to prevent assorted security risks 
symbolic-links=0
character_set_server=latin1 
collation_server=latin1_swedish_ci
skip-character-set-client-handshake
innodb_undo_log_truncate=off
# general
table_open_cache = 200000
table_open_cache_instances=64
back_log=3500
max_connections=110000
# files
innodb_file_per_table
innodb_log_file_size=15G
innodb_log_files_in_group=2
innodb_open_files=4000
# buffers
innodb_buffer_pool_size= 40G
innodb_buffer_pool_instances=8
innodb_log_buffer_size=64M
# tune
innodb_doublewrite= 1
innodb_thread_concurrency=0
innodb_flush_log_at_trx_commit= 0
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10
innodb_lru_scan_depth=2048
innodb_page_cleaners=4
join_buffer_size=256K
sort_buffer_size=256K
innodb_use_native_aio=1
innodb_stats_persistent = 1
#innodb_spin_wait_delay=96
innodb_adaptive_flushing = 1
innodb_flush_neighbors = 0
innodb_read_io_threads = 16
innodb_write_io_threads = 16
innodb_io_capacity=1500
innodb_io_capacity_max=2500
innodb_purge_threads=4
innodb_adaptive_hash_index=0
max_prepared_stmt_count=1000000
innodb_monitor_enable = '%'
performance_schema = ON

Pay attention, don’t get lost

OK, everyone, the above is the whole content of this article, you can see the people here, they are allpersonnel. As I said before, there are many technical points in PHP, but also because there are too many. We can’t write them down, and we won’t see too much when we write them down. So I’ve organized them into PDF and documents here, if necessary

Click to enter the code: blog Garden


More learning content can be accessedAs long as you can read it, your salary will go up a step

The above contents hope to help youA lot of PHPer always encounter some problems and bottlenecks when they are upgrading. They write too much business code and have no sense of direction. They don’t know where to start to improve. For this, I have sorted out some materials, including but not limited to:Distributed architecture, high scalability, high performance, high concurrency, server performance tuning, tp6, laravel, yii2, redis, spool, swoft, Kafka, MySQL optimization, shell script, docker, microservice, nginxAnd so on many knowledge points, advanced dry goods need can be free to share with you, need can join mePHP Technology Exchange Group953224940

Recommended Today

Oracle automatic collection task statistics, SQL tuning

–View auto collection task status SELECT CLIENT_NAME,STATUS FROM DBA_AUTOTASK_CLIENT; SELECT CLIENT_NAME, STATUS FROM DBA_AUTOTASK_CLIENT WHERE CLIENT_NAME = ‘auto optimizer stats collection’; –Turn off auto optimizer stats collection BEGIN DBMS_AUTO_TASK_ADMIN.DISABLE(CLIENT_NAME => ‘auto optimizer stats collection’, OPERATION => NULL, WINDOW_NAME => NULL); END; / –Shut down SQL optimizer SQL Tuning Advisor (STA) BEGIN DBMS_AUTO_TASK_ADMIN.DISABLE(CLIENT_NAME => ‘sql tuning […]