Research on some methods to optimize the performance of Ruby on Rails

Time: 2022-2-5

1. There are two reasons why your Rails application slows down:

  1. Using Ruby and Rails where they are not the best tool for the job (doing things Ruby and Rails are not good at).
  2. Excessive memory consumption, which forces the program to spend a lot of time on garbage collection.

Rails is a pleasant framework, and Ruby is a concise and elegant language. But if they are abused, performance suffers considerably. There are many jobs Ruby and Rails are not suited for, and you are better off using other tools: databases, for example, have a clear advantage for large-scale data processing, and the R language is especially suited to statistics.

Memory problems are the primary reason many Ruby applications slow down. The 80-20 rule of Rails performance optimization goes like this: 80% of the speedup comes from memory optimization, and the remaining 20% from everything else. Why is memory consumption so important? Because the more memory you allocate, the more work Ruby's GC (garbage collector) has to do. Rails already has a large memory footprint; on average each application takes nearly 100 MB right after startup. If you don't keep memory under control, it can easily grow past 1 GB. With that much memory to collect, it's no wonder GC eats most of your program's execution time.
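
To see how large that GC share actually is in your own code, you can measure it with the standard GC::Profiler. A minimal sketch (not from the original article; the allocation loop is just a stand-in for real work):

require 'benchmark'

GC::Profiler.enable
elapsed = Benchmark.realtime do
  # stand-in workload: allocate a lot of short-lived objects
  100.times { Array.new(100_000) { { 'some' => 'value' } } }
end
gc_time = GC::Profiler.total_time
GC::Profiler.disable

puts "total: #{elapsed.round(2)}s, GC: #{gc_time.round(2)}s (#{(100 * gc_time / elapsed).round}%)"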

2. How can we make a Rails application run faster?

There are three ways to make your application faster: scaling, caching, and code optimization.

Scaling is easy to achieve these days. Heroku basically does it for you, and HireFire makes the process even more automatic. Other hosted environments offer similar solutions. In short, use them if you can. But keep in mind that scaling is not a silver bullet for performance. If your application takes five minutes to respond to a single request, no amount of scaling will help. Also, using Heroku + HireFire can easily overdraw your bank account. I have seen HireFire scale one of my applications up to 36 instances, and I paid $3,100 for it. I immediately scaled back down to 2 instances by hand and optimized the code.

Rails caching is also easy to implement. Fragment caching in Rails 4 is very good, and the Rails documentation is an excellent resource on caching. However, compared with scaling, caching cannot be the ultimate answer to performance problems. If your code is inefficient, you will find yourself spending more and more resources on caching, until caching no longer improves speed.
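
As a reminder of how cheap caching is to add, here is a minimal sketch of low-level caching with Rails.cache.fetch (the Product model, the sales_count column, and the cache key are hypothetical):

class ProductsController < ApplicationController
  def index
    # Serve the expensive query from the cache; recompute at most once per hour.
    @top_products = Rails.cache.fetch('products/top', expires_in: 1.hour) do
      Product.order(sales_count: :desc).limit(20).to_a
    end
  end
end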

The only reliable way to make your Rails application faster is code optimization. In the Rails world, that mostly means memory optimization. And of course, if you take my advice and avoid using Rails outside its design capabilities, you will have less code to optimize in the first place.

2.1 Avoid memory-intensive Rails features

Some Rails features cost a lot of memory and therefore cause extra garbage collection. Here is the list.

2.1.1 serializer

Serializers are a convenient way to represent data stored as strings in the database as Ruby data types.

class Smth < ActiveRecord::Base
  serialize :data, JSON
end

Smth.find(...).data
Smth.find(...).data = { ... }

Serialization like this is convenient, but it needs a lot of extra memory to work. You can do the same thing yourself:

class Smth < ActiveRecord::Base
  def data
    JSON.parse(read_attribute(:data))
  end

  def data=(value)
    write_attribute(:data, value.to_json)
  end
end

This keeps the memory overhead to roughly double the data size. Some people, myself included, have also seen memory leaks in Rails' JSON serializer of about 10% of the data size per request. I don't understand the reason behind it, and I don't know whether it is reproducible. If you have experience with it, or know how to reduce the memory usage, please let me know.

2.1.2 ActiveRecord

It's easy to manipulate data with ActiveRecord. But ActiveRecord is essentially a wrapper around your data. If you have 1 GB of table data, an ActiveRecord representation of it will take 2 GB, and in some cases more. Yes, in 90% of cases the extra convenience is worth it. But sometimes you don't need it. For example, batch updates can avoid the ActiveRecord overhead. The following code neither instantiates any models nor runs validations and callbacks:

Book.where('title LIKE ?', '%Rails%').update_all(author: 'David')

Behind the scenes it executes just this SQL update statement:

update books
  set author = 'David'
  where title LIKE '%Rails%'

Another example is iterating over a large dataset. Sometimes you need only the data: no typecasting, no updates. This snippet just runs the query and avoids ActiveRecord altogether:

result = ActiveRecord::Base.connection.execute 'select * from books'
result.each do |row|
  # do something with row.values_at('col1', 'col2')
end

2.1.3 String callbacks

Rails callbacks such as before/after save and before/after action are used extensively. But the way you write them can affect your performance. There are three ways to write, for example, a before-save callback:

before_save :update_status

before_save do |model|
  model.update_status
end

before_save "self.update_status"

The first two ways work well; the third one does not. Why? Because to execute a string callback, Rails has to store the execution context (variables, constants, global instances, and so on) present when the callback was defined. If your application is large, you end up copying a lot of data in memory. And because the callback can be executed at any time, that memory cannot be reclaimed until your program ends.

Switching to symbol callbacks saved me 0.6 seconds per request.

2.2 write less Ruby

This is my favorite step. My university computer science professor liked to say that the best code is the code that doesn't exist. Sometimes the task at hand is better done with other tools, most commonly the database. Why? Because Ruby is bad at processing large data sets. Very, very bad. Remember, Ruby has a large memory footprint, so you may need 3 GB or more of memory to process 1 GB of data. Garbage collecting those 3 GB will take tens of seconds. A good database can process the same data in a second. Let me show some examples.

2.2.1 attribute preloading

Sometimes attributes for a denormalized model are taken from another table. For example, imagine we are building a todo list that consists of tasks. Each task can have one or several tags. The normalized data model looks like this:

  • Tasks
    • id
    • name
  • Tags
    • id
    • name
  • Tasks_Tags
    • tag_id
    • task_id

To load tasks together with their tags in Rails, you would do something like this:
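
A minimal sketch of the usual eager-loading approach, assuming Task declares has_and_belongs_to_many :tags over the join table above:

tasks = Task.includes(:tags)
tasks.each do |task|
  # every row from tags becomes a full Tag object in memory
  tag_names = task.tags.map(&:name)
end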

The problem with this code is that it creates an object for every tag, and that costs memory. An alternative solution is to preload the tag names in the database:

tasks = Task.select <<-END
  *,
  array(
    select tags.name from tags inner join tasks_tags on (tags.id = tasks_tags.tag_id)
    where tasks_tags.task_id = tasks.id
  ) as tag_names
END
> 0.018 sec

This needs memory only for an extra column that holds an array of tag names. No wonder it's three times faster.

2.2.2 Data aggregation

By data aggregation I mean any code that summarizes or analyzes data. These operations can be simple summaries or something more complex. Take group ranking as an example. Suppose we have a dataset with employees, departments, and salaries, and we want to calculate each employee's salary rank within their department.

SELECT * FROM empsalary;

 depname   | empno | salary
-----------+-------+--------
 develop   |     6 |   6000
 develop   |     7 |   4500
 develop   |     5 |   4200
 personnel |     2 |   3900
 personnel |     4 |   3500
 sales     |     1 |   5000
 sales     |     3 |   4800

You can compute the ranking in Ruby:

salaries = Empsalary.all
salaries.sort_by! { |s| [s.depname, s.salary] }
key, counter = nil, nil
salaries.each do |s|
  if s.depname != key
    key, counter = s.depname, 0
  end
  counter += 1
  s.rank = counter
end

With 100k rows in the empsalary table, this program finishes in 4.02 seconds. The equivalent Postgres query, using a window function, does the same work more than 4 times faster, in 1.1 seconds:

SELECT depname, empno, salary, rank()
  OVER (PARTITION BY depname ORDER BY salary DESC)
FROM empsalary;

 depname   | empno | salary | rank
-----------+-------+--------+------
 develop   |     6 |   6000 |    1
 develop   |     7 |   4500 |    2
 develop   |     5 |   4200 |    3
 personnel |     2 |   3900 |    1
 personnel |     4 |   3500 |    2
 sales     |     1 |   5000 |    1
 sales     |     3 |   4800 |    2

A 4x speedup is already impressive, and sometimes you get more, up to 20x. One example from my own experience: I had a 3-dimensional OLAP cube with 600k data rows that my program sliced and aggregated. In Ruby it took 1 GB of memory and about 90 seconds to finish. The equivalent SQL query finished within 5 seconds.

2.3 optimizing Unicorn

If you use Unicorn, the following optimization techniques apply. Unicorn is the fastest web server you can use with Rails, but you can still make it run faster.
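
For reference, a minimal unicorn.rb that the snippets below extend might look roughly like this (a sketch only; the paths and worker count are hypothetical):

# config/unicorn.rb -- a minimal baseline that the later tweaks build on
worker_processes 4
working_directory "/var/www/app/current"
listen "/tmp/unicorn.sock", backlog: 64
timeout 30
pid "/tmp/unicorn.pid"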

2.3.1 Preload the application

Unicorn can preload the Rails application before forking worker processes. This has two advantages. First, the master process can share in-memory data with workers thanks to the copy-on-write friendly GC (Ruby 2.0 and above); the operating system copies that data transparently only if a worker modifies it. Second, preloading reduces worker startup time. Rails worker restarts are quite common (more on that later), so the faster workers restart, the better performance we get.

To enable application preloading, add one line to your Unicorn configuration file:

preload_app true

2.3.2 GC between requests

Keep in mind that GC can take up to 50% of your application's run time. And that's not the only problem: GC is usually unpredictable and triggers exactly when you don't want it to. So, what can you do about it?

First, what would happen if we disabled GC completely? That seems like a bad idea: your application could easily fill 1 GB of memory before you notice. If your server runs several workers at once, your application will run out of memory very quickly even on a self-managed server, not to mention Heroku with its 512 MB limit.

In fact there is a better way. If we cannot avoid GC, we can at least make its timing as predictable as possible and run it when the process is idle, for example between two requests. This is easy to set up in Unicorn.

For Ruby versions prior to 2.1, there is a Unicorn module called OobGC:

require 'unicorn/oob_gc'
use(Unicorn::OobGC, 1) # "1" means "force GC to run after every 1 request"

For Ruby 2.1 and later, gctools (https://github.com/tmm1/gctools) is the better choice:

require 'gctools/oobgc'
use(GC::OOB::UnicornMiddleware)

But running GC between requests has its own considerations. Most importantly, this optimization technique is perceptible: users will actually feel the performance improvement. The server, however, has to do more work. Unlike running GC only when needed, this technique makes the server run GC frequently, so make sure your server has enough resources to run GC and enough workers to handle user requests while other workers are busy collecting garbage.

2.4 Limit memory growth

I've shown you examples of applications taking 1 GB of memory. Taking a chunk of memory like that is not a big problem in itself if you have enough memory. The problem is that Ruby may never return that memory to the operating system. Let me explain why.

Ruby allocates memory from two heaps. All Ruby objects live in Ruby's own heap, where each object occupies 40 bytes (on a 64-bit operating system). When an object needs more memory than that, it allocates the extra memory from the operating system heap. When the object is garbage collected and freed, the memory from the operating system heap is returned to the operating system, but the slot in Ruby's own heap is simply marked as free and never returned.

This means Ruby's heap can only grow, never shrink. Imagine reading 1 million rows, 10 columns each, from a database. You need to allocate at least 10 million objects to store that data. A Ruby worker usually takes about 100 MB after startup; to accommodate this data it needs an extra 400 MB (10 million objects at 40 bytes each). Even after those objects are eventually collected, the worker still uses 500 MB of memory.
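
A quick way to see this effect yourself is to watch GC.stat while allocating and then collecting a lot of objects. A minimal sketch (the GC.stat key names shown are those of recent Ruby versions and differ slightly in Ruby 2.1):

# Allocate ~1 million short-lived objects, collect them, and compare heap sizes.
before = GC.stat[:heap_available_slots]
data = Array.new(1_000_000) { |i| "row #{i}" }
after_alloc = GC.stat[:heap_available_slots]

data = nil
GC.start
after_gc = GC.stat[:heap_available_slots]

puts "slots before: #{before}"
puts "slots after allocation: #{after_alloc}"  # the heap has grown
puts "slots after GC: #{after_gc}"             # it stays roughly the same size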

A disclaimer: Ruby's GC can reduce the heap size, but I have yet to see it happen in practice, because the conditions that trigger heap shrinking rarely occur in production.

If your workers can only grow, the most obvious solution is to restart a worker whenever it uses too much memory. Some hosting services, such as Heroku, do this for you. Let's look at other ways to achieve it.

2.4.1 internal memory control

Trust in God, but lock your car. There are two ways to make your application limit its own memory; I call them the kind way and the hard way.

The kind memory limit checks the process memory size after every request. If the worker uses too much memory, it exits and Unicorn forks a new worker. That's why I call it "kind": it doesn't interrupt your application.

To get the process memory size, use RSS on Linux and macOS, or the OS gem on Windows. Here is how to implement this limit in the Unicorn configuration file:

class Unicorn::HttpServer
  KIND_MEMORY_LIMIT_RSS = 150 # MB

  alias process_client_orig process_client
  undef_method :process_client
  def process_client(client)
    process_client_orig(client)
    rss = `ps -o rss= -p #{Process.pid}`.chomp.to_i / 1024
    exit if rss > KIND_MEMORY_LIMIT_RSS
  end
end

The hard memory limit asks the operating system to kill your worker process if it grows too much. On Unix you can call setrlimit to set the RSS limit. As far as I know this only works on Linux; the macOS implementation is broken. I would appreciate any new information.

This snippet, from the Unicorn configuration file, sets the hard limit:

after_fork do |server, worker|
  worker.set_memory_limits
end

class Unicorn::Worker
  HARD_MEMORY_LIMIT_RSS = 600 # MB
  def set_memory_limits
    Process.setrlimit(Process::RLIMIT_AS, HARD_MEMORY_LIMIT_RSS * 1024 * 1024)
  end
end

2.4.2 external memory control

Automatic control does not save you from occasional OOM (out of memory) situations. Usually you should set up some external monitoring tools as well. On Heroku there is no need, since they have their own monitoring. But if you self-host, using monit, god, or another monitoring solution is a good idea.
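
For example, a god configuration that restarts a Unicorn master when its memory grows too large could look roughly like this (a sketch only; the paths, process name, and 600 MB threshold are hypothetical):

God.watch do |w|
  w.name     = "unicorn"
  w.start    = "bundle exec unicorn -c config/unicorn.rb -D"
  w.pid_file = "/var/run/unicorn.pid"

  # Restart the process if it stays above 600 MB RSS on two consecutive checks.
  w.restart_if do |restart|
    restart.condition(:memory_usage) do |c|
      c.above = 600.megabytes
      c.times = 2
    end
  end
end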

2.5 optimizing Ruby GC

In some cases you can tune the Ruby GC to improve its performance. I would say, though, that GC tuning is becoming less and less important: the default settings in Ruby 2.1 and later already work well for most people.

My advice is not to change the GC settings unless you know exactly what you want to achieve and have enough theoretical knowledge of how the Ruby GC works. This is especially true for users of Ruby 2.1 or later.

I know of only one case where GC tuning really improves performance: when you need to load a large amount of data at once. You can reduce the GC frequency by changing the following environment variables: RUBY_GC_HEAP_GROWTH_FACTOR, RUBY_GC_MALLOC_LIMIT, RUBY_GC_MALLOC_LIMIT_MAX, RUBY_GC_OLDMALLOC_LIMIT, and RUBY_GC_OLDMALLOC_LIMIT_MAX.

Note that these variables apply only to Ruby 2.1 and later. In earlier versions a variable may be missing or go by a different name.

RUBY_GC_HEAP_GROWTH_FACTOR defaults to 1.8 and determines how much the Ruby heap grows each time there is not enough free space to allocate objects. When you need to work with a large number of objects, you want the heap to grow in bigger steps, so you increase this factor.

The malloc limits define how often GC is triggered when you allocate memory from the operating system heap. For Ruby 2.1 and later the default limits are:

New generation malloc limit          RUBY_GC_MALLOC_LIMIT          16 MB
Maximum new generation malloc limit  RUBY_GC_MALLOC_LIMIT_MAX      32 MB
Old generation malloc limit          RUBY_GC_OLDMALLOC_LIMIT       16 MB
Maximum old generation malloc limit  RUBY_GC_OLDMALLOC_LIMIT_MAX   128 MB

Let me briefly explain what these values mean. With the settings above, Ruby runs GC whenever new objects have allocated between 16 MB and 32 MB, or old objects between 16 MB and 128 MB, from the OS heap (an "old object" is one that has survived at least one garbage collection). Ruby dynamically adjusts the current limits within these ranges depending on your memory usage pattern.

So when you have only a few objects that nevertheless take a lot of memory (for example, reading a large file into a single string), you can raise the limits to reduce the GC frequency. Remember to raise all four limits at the same time, preferably by a multiple of the defaults.
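
For example, for a one-off bulk import you might start the process with all four limits raised to a multiple of their defaults. A sketch only: the 4x values and the rake task name below are illustrations, not recommendations:

RUBY_GC_MALLOC_LIMIT=67108864 \
RUBY_GC_MALLOC_LIMIT_MAX=134217728 \
RUBY_GC_OLDMALLOC_LIMIT=67108864 \
RUBY_GC_OLDMALLOC_LIMIT_MAX=536870912 \
bundle exec rake data:import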

My suggestions may differ from other people's; what works for me may not work for you. These articles describe what works for Twitter and what works for Discourse.

2.6 Profiling

Sometimes these suggestions are not universal and you need to figure out your own specific problem. That is when you reach for a profiler. ruby-prof is the profiler every Ruby user ends up using.
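
A minimal sketch of typical ruby-prof usage (API details vary with the gem version; ReportGenerator is a hypothetical class standing in for the code path you want to investigate):

require 'ruby-prof'

result = RubyProf.profile do
  ReportGenerator.new.generate # hypothetical slow code path
end

# Print a flat report of where the time went.
RubyProf::FlatPrinter.new(result).print(STDOUT)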

To learn more about profiling, read Chris Heald's and my articles on using ruby-prof with Rails. They also include some advice on memory profiling that may be a little outdated.

2.7 writing performance test cases

Finally, a skill that is not the most glamorous but is still essential for Rails performance: make sure your application's performance does not regress when you change the code.
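
One simple way to do that is a test that fails when a known-hot code path gets slower than a budget you set. A minimal sketch using minitest and Benchmark (the ReportGenerator class and the 1-second budget are hypothetical):

require 'minitest/autorun'
require 'benchmark'

class ReportPerformanceTest < Minitest::Test
  TIME_BUDGET_SECONDS = 1.0 # hypothetical budget for this code path

  def test_report_generation_stays_within_budget
    elapsed = Benchmark.realtime { ReportGenerator.new.generate }
    assert_operator elapsed, :<=, TIME_BUDGET_SECONDS,
      "report generation regressed: took #{elapsed.round(2)}s"
  end
end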

3. Concluding remarks

It is impossible for one article to cover every aspect of improving Ruby and Rails performance, so I am summarizing my experience in a book. If you find these suggestions useful, sign up for the mailing list and I will let you know as soon as a preview of the book is ready. Now, let's work together to make Rails applications run faster!
