On the .NET object life cycle (garbage collection)

Time:2020-3-28

Managed heap: a heap programmers don't have to worry about

When a program runs on a computer, it inevitably occupies memory to store the data it works with. We divide that memory into heap memory and stack memory.

Stack memory is usually used where access must be fast and the amount of data is small.

A typical example of stack memory is the function call stack. Every time a function is called, it is allocated a piece of memory called stack memory, which is accessed last in, first out. While the function executes, data (value-type data such as int and float, and object references) is continuously pushed onto the stack. When the function returns, that data is popped off one by one. Access is fast because the memory is only ever touched through these stack operations.

Heap memory, taken literally, is like a pile of junk in a warehouse. If you need to store something, you just throw it in, as long as the warehouse has space. In practice it works the same way. Heap memory can hold larger data (such as objects) that is not suitable for the stack, because stack space is limited. That is the advantage of heap memory over stack memory: large capacity. Its disadvantage is just as obvious: accessing data in heap memory is slower than accessing stack memory. Imagine having to find what you want in a pile of junk in a warehouse.

Heap memory also differs from stack memory in how it is allocated. A function's stack is allocated automatically when the function is called and reclaimed automatically when it returns. If you want to use heap memory, you have to manage it yourself.

So you can see C programmers using heap memory like this:

int *p = (int *)malloc(sizeof(int)); // request sizeof(int) bytes of heap memory and get back a pointer to the block
*p = 10;
free(p); // release the heap memory

You'll also see C++ programmers write this:

Car *bmw = new Car(); // create a Car object on the heap and get back a pointer to it
delete bmw; // release the heap memory

Of course, don't panic if you have never touched C/C++. The above just shows that in C/C++, a programmer who wants to use heap memory must explicitly write code to allocate and release it.

Someone may ask: what happens if heap memory is not released manually after use?

The answer: memory that is never released cannot be reused, which results in a memory leak (wasted memory).

At this point the C# programmer smiled and, with light and elegant fingers, typed the following line of code on the screen:


 Car bmw = new Car();

The C and C++ programmers nearby were stunned. They had never felt that relaxed when typing code. The C++ programmer stroked his shiny forehead, and suddenly his eyes flashed. He shouted: “You haven't released the heap memory! That's dangerous, it's a memory leak. Quick, write the code to release it!”

The C# programmer was unmoved. He leaned back comfortably in his chair, glanced at the C++ programmer out of the corner of his eye, and said: “Don't panic, don't panic. This object is sitting safely on the managed heap, so I don't need to worry about it.” And so the C# programmer began to explain.

In the .NET world, you create an object with the new keyword. The object is first allocated on the managed heap, and new then returns a reference to the object on the heap, not the object itself. If the reference variable is declared as a local variable in a method, the reference is stored on the stack for later use by the application.
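To make the difference between a reference and the object itself concrete, here is a minimal sketch. The Car class with a Name field and the ReferenceDemo helper are hypothetical names introduced only for this illustration. Assigning one reference variable to another copies the reference, so both variables see the same heap object, whereas assigning an int copies the value.

```csharp
using System;

// Hypothetical Car class, mirroring the one used in the text.
public class Car
{
    public string Name = "";
}

public static class ReferenceDemo
{
    // Modifies the object through a second variable and returns
    // the name as seen through the first variable.
    public static string RenameThroughSecondReference()
    {
        Car bmw = new Car { Name = "BMW" }; // the object lives on the managed heap
        Car alias = bmw;                    // copies the reference, not the object
        alias.Name = "Renamed";
        return bmw.Name;                    // both variables refer to the same object
    }

    // Value types such as int are copied on assignment.
    public static int CopyValueType()
    {
        int a = 10;
        int b = a; // b is an independent copy
        b = 20;
        return a;  // a is unchanged
    }
}
```

Calling ReferenceDemo.RenameThroughSecondReference() returns "Renamed", while ReferenceDemo.CopyValueType() returns 10.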

 

The managed heap, as the name implies, is a heap entrusted to someone else. Who manages the objects on this heap?

The answer is the CLR (common language runtime). After an object is instantiated, the GC (garbage collector) destroys it when it is no longer needed.

In other words, by letting the garbage collector destroy objects, the trouble of memory management is left to the CLR.

 

It seems all the problems are solved: the CLR is responsible for reclaiming heap memory. But don't you want to know more? For example:

1. How does the garbage collector judge when an object is no longer needed?

2. When will the garbage collector perform garbage cleaning?

Don't worry. Read on with these questions in mind.

CIL’s new instruction – the trigger of garbage collection

The new keyword in C# is ultimately compiled into the CIL newobj instruction. Let's take a closer look at what newobj does.

First, understand that the managed heap is more than a random block of memory accessed by the CLR. The .NET garbage collector is the “cleaner” of the heap, and it compacts free blocks of memory (when needed) for optimization. To aid compaction, the managed heap maintains a pointer (usually called the next object pointer or the new object pointer) that identifies exactly where the next object will be placed.

In addition, the newobj instruction tells the CLR to perform the following core tasks:

(1) Calculate the total memory required for the object to be allocated (including the memory required by the data members of the type and of its base classes).

(2) Check the managed heap to make sure there is enough space for the object. If there is, the type's constructor is called and a reference to the new object in memory is returned to the caller. The address of the new object is exactly where the next object pointer last pointed.

(3) Finally, before returning the reference to the caller, advance the next object pointer to the next available location in the managed heap.
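The three steps can be sketched with a toy model. This is only an illustration of the next object pointer idea, not how the CLR is actually implemented. ToyManagedHeap, Allocate and the byte offsets are all hypothetical names, and the size is passed in rather than calculated from a type.

```csharp
using System;

// Toy model of a heap with a "next object pointer" (an assumption for
// illustration only; the real CLR allocator is far more involved).
public class ToyManagedHeap
{
    private readonly int capacity;

    // Offset of the next free byte in the toy heap.
    public int NextObjectPointer { get; private set; }

    public ToyManagedHeap(int capacity)
    {
        this.capacity = capacity;
    }

    // Returns the offset of the new object, or -1 when there is not enough
    // space (the point at which the real CLR would run a garbage collection).
    public int Allocate(int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));
        if (size > capacity - NextObjectPointer) return -1; // step 2: check space
        int address = NextObjectPointer;                    // object goes where the pointer was
        NextObjectPointer += size;                          // step 3: advance the pointer
        return address;
    }
}
```

With a capacity of 100, two 40-byte allocations land at offsets 0 and 40, and a third returns -1 because only 20 bytes remain. Allocation is cheap in this model because it is only a bounds check and an addition.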

The following figure illustrates the details of allocating objects on a managed heap.

Allocating objects is a very frequent operation in C#, so the space on the managed heap will be used up sooner or later. Here is the key point: if the CLR finds that the managed heap does not have enough space to allocate the requested type, it performs a garbage collection to free memory.

When a garbage collection runs, the garbage collector temporarily suspends all active threads in the current process so that the application cannot access the heap during the collection. (A thread is a path of execution within a running program.) Once the collection is complete, the suspended threads resume. Fortunately, the .NET garbage collector is highly optimized, so users rarely notice these short interruptions.

From this look at CIL's new instruction, we know: if the managed heap does not have enough space to allocate a requested object, a garbage collection is performed.

(At this point, the programmer stops, drinks some of the wolfberry, red date and tonic tea from his thermos cup, clears his throat, and continues to solve the puzzle…)

The role of application root – distinguishing unreachable objects

Now let's talk about how the garbage collector determines when an object is “no longer needed”. To understand the details, you need to know the concept of application roots.

In short, a root is a storage location that holds a reference to an object on the heap. Strictly speaking, a root can be any of the following:

(1) References to global objects (not supported by C#, but CIL code does allow global objects to be allocated)

(2) References to any static objects

(3) References to local objects within the application's code

(4) References to object parameters passed into a function

(5) References to objects waiting to be finalized

(6) Any CPU register that references an object

During a garbage collection, the runtime checks whether the objects on the managed heap are still reachable from the application roots. To do so, the CLR builds an object graph that represents each reachable object on the heap. Note that the garbage collector never marks an object twice in the graph, which avoids problems with circular references.

Suppose the managed heap contains a set of objects named A, B, C, D, E, F, and G. During a garbage collection, these objects (and any internal object references they may contain) are examined for reachable roots. Once the graph has been built, unreachable objects (here, objects C and F) are marked as garbage.

The following figure shows a possible object graph for this scenario (you can read an arrow as “depends on” or “requires”, for example “E depends on G and indirectly on B, while A does not depend on any object”).

(The object graph is built to determine which objects are reachable from the application roots.)
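The reachability check can be sketched as a small graph traversal. This is a toy simulation, not the CLR's implementation. ToyMarkPhase is a hypothetical name, and the choice of A and E as roots and the exact edges (including the B/D and C/F cycles) are assumptions chosen to match the scenario described above. The visited set is what guarantees that no object is marked twice, so circular references terminate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToyMarkPhase
{
    // Returns every object reachable from the roots. The "marked" set
    // ensures each object is visited once, so cycles cannot loop forever.
    public static HashSet<string> Mark(
        IEnumerable<string> roots,
        IReadOnlyDictionary<string, string[]> references)
    {
        var marked = new HashSet<string>();
        var pending = new Stack<string>(roots);
        while (pending.Count > 0)
        {
            string current = pending.Pop();
            if (!marked.Add(current)) continue; // already marked, skip
            if (references.TryGetValue(current, out var targets))
            {
                foreach (string target in targets) pending.Push(target);
            }
        }
        return marked;
    }

    // Hypothetical graph for objects A..G; roots A and E are assumptions.
    public static string[] FindGarbage()
    {
        var references = new Dictionary<string, string[]>
        {
            ["A"] = Array.Empty<string>(), // A depends on nothing
            ["B"] = new[] { "D" },
            ["C"] = new[] { "F" },         // C and F only reference each other
            ["D"] = new[] { "B" },         // cycle B <-> D, still reachable
            ["E"] = new[] { "G" },         // E depends on G ...
            ["F"] = new[] { "C" },
            ["G"] = new[] { "B" },         // ... and indirectly on B
        };
        var roots = new[] { "A", "E" };
        HashSet<string> reachable = Mark(roots, references);
        return references.Keys
            .Where(name => !reachable.Contains(name))
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToArray();
    }
}
```

In this graph, FindGarbage() returns C and F. They reference each other, but no root leads to them, so a mutual reference alone does not keep an object alive.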

Once objects have been marked for collection (here, C and F, which are not in the diagram), they are swept from memory. The remaining space on the heap is then compacted, which in turn causes the CLR to update the set of active application roots (and the underlying pointers) so that they refer to the correct memory locations (this happens automatically and transparently). Finally, the next object pointer is adjusted to point to the next available location.

The following figure illustrates the process of clearing and compressing the heap.

So far, by understanding the role of application roots, we know how the collector decides that an object is “no longer needed”. Put simply, the application can no longer reach the object, so it has become an “island” and is naturally no longer needed.

(To help the C++ programmer better understand the secrets of .NET garbage collection, the C# programmer goes on…)

Understanding object generations – optimizing the garbage collection process

When trying to find unreachable objects, the CLR does not examine every object on the managed heap. Doing so would obviously take a lot of time, especially in large (that is, real-world) programs.

To help optimize the process, each object on the heap is assigned to a specific generation. The idea behind generations is simple: the longer an object has existed on the heap, the more likely it is to stay there. In other words, older objects tend to live longer and newer objects tend to die sooner. For example, the object that implements Main() stays in memory until the program ends. Conversely, objects that were only recently placed on the heap, such as those allocated within a function's scope, are likely to become unreachable quickly.

  Each object on the heap belongs to one of the following generations:

Generation 0: a newly allocated object that has never been marked for collection

Generation 1: an object that has survived one garbage collection (for example, it was marked for collection but was not removed because enough heap space had already been freed)

Generation 2: an object that has survived more than one garbage collection.

The garbage collector examines all generation 0 objects first. If marking and sweeping these objects frees enough memory, any surviving objects are promoted to generation 1 and the collection stops there. The figure below shows how surviving generation 0 objects are promoted after a collection.

(Surviving generation 0 objects are promoted to generation 1.)

If all generation 0 objects have been examined but more memory is still needed, the generation 1 objects are checked for reachability and collected. Surviving generation 1 objects are promoted to generation 2. If the garbage collector still needs more memory, the generation 2 objects are examined and collected. An object that survives a generation 2 collection stays in generation 2.

In effect, generations are designed so that new objects (such as those referenced by local variables) are reclaimed quickly, while older objects (such as an application object) are not disturbed as often.

In short, object generations exist to optimize the garbage collection process.
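Promotion can be observed with the System.GC class. The sketch below uses GC.GetGeneration, GC.Collect, GC.KeepAlive and GC.MaxGeneration, which are real .NET APIs. GenerationDemo and ObserveGenerations are hypothetical names. Forcing collections like this is for demonstration only, and the exact numbers reported depend on the runtime and its configuration.

```csharp
using System;

public static class GenerationDemo
{
    // Returns the generation of one small object right after allocation
    // and after each of two forced collections.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        var observed = new int[3];

        observed[0] = GC.GetGeneration(survivor); // freshly allocated
        GC.Collect();                             // force a full collection
        observed[1] = GC.GetGeneration(survivor); // survived one collection
        GC.Collect();
        observed[2] = GC.GetGeneration(survivor); // survived two collections

        GC.KeepAlive(survivor); // keep the object rooted up to this point
        return observed;
    }
}
```

On a typical desktop CLR the three values tend to climb from generation 0 towards GC.MaxGeneration, but treat that as an observation rather than a guarantee.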

“I have one last question,” said the C++ programmer, who had been puzzling over it the whole time. “You've said so much about managed resources. Aren't there unmanaged resources in .NET? How does .NET release them?”

“Good question!” the C# programmer laughed, and went on explaining (and showing off).

Building a finalizable object – unmanaged resource handling, approach 1

As a C# developer's intuition suggests, most C# classes need no explicit cleanup logic. The reason is simple: if a type only uses other managed objects, everything will eventually be garbage collected.

Q: When is explicit cleanup needed?

The answer: when you use unmanaged resources (such as raw operating system file handles, raw unmanaged database connections, or other unmanaged resources), you may need to design the class so that it cleans up after itself when you are done with it.

For example, the following class:

//Database context class
  public class SqlDbContext
  {
    //... (other referenced object instances)

    //Unmanaged resources contained in class (need to call dispose() function to release resources)
    SqlConnection sqlConnection = new SqlConnection("..."); 
      
  }

The problem now is that we need to call the database connection object's resource-release method at the right moment (once the SqlConnection object is no longer in use, its Dispose() method must be called). One such moment is when the containing object is garbage collected by the CLR. So the question becomes: is there a method we can hook into that is called at that moment?

Yes. We can use the virtual method named Finalize() defined in the .NET base class System.Object, which is also called the finalizer. It looks like this:



This may look strange at first. Wasn't there supposed to be a Finalize() method? Where is it? Don't be surprised: in C#, ~Object() is Finalize(). The destructor syntax is just syntactic sugar.

The call to Finalize() will (eventually) occur during a “natural” garbage collection or when the program forces a collection via GC.Collect(), so the finalizer is where a class can release the unmanaged resources it holds. Nice. Now we can write the cleanup code like this:

//Database context class
    public class SqlDbContext
    {
      //... (other referenced object instances)

      //Unmanaged resources contained in class (need to call dispose() function to release resources)
      SqlConnection sqlConnection = new SqlConnection("..."); 

      ~SqlDbContext()
      {
        //Clear unmanaged resources here
        this.sqlConnection.Dispose();
      }
      
    }

The object thus constructed is called a finalizable object.
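To see that a finalizer runs as part of garbage collection rather than at a point the programmer chooses, here is a small sketch. FinalizableThing and FinalizerDemo are hypothetical names. GC.Collect and GC.WaitForPendingFinalizers are real APIs, but whether the object has already been finalized when Run() returns depends on the runtime and build settings, so the counter is something to observe, not a guaranteed result.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public class FinalizableThing
{
    // Counts how many instances have been finalized so far.
    public static int FinalizedCount;

    ~FinalizableThing()
    {
        // Runs on the finalizer thread, some time after the object
        // has become unreachable and a collection has noticed it.
        Interlocked.Increment(ref FinalizedCount);
    }
}

public static class FinalizerDemo
{
    // Kept out of line so no reference to the object lingers in the caller.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void CreateAndDrop()
    {
        new FinalizableThing(); // immediately unreachable
    }

    // Returns the number of finalizers observed to have run.
    public static int Run()
    {
        CreateAndDrop();
        GC.Collect();                  // force a collection (demo only)
        GC.WaitForPendingFinalizers(); // block until queued finalizers finish
        return Volatile.Read(ref FinalizableThing.FinalizedCount);
    }
}
```

Nothing in this code calls the finalizer directly. The only way to influence when it runs is to influence when a collection happens.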

There are more details about the finalization process, as described in the book “C# and .NET 4 Advanced Programming (5th Edition)”:

From the above we know that unmanaged resources can only be cleaned up through Finalize() during garbage collection of a .NET object, and that finalization is an expensive operation.

This raises another question: many unmanaged resources are precious (such as database connections and file handles), so they should be cleaned up as soon as possible after use rather than waiting for a garbage collection. How, then, should these resources be released explicitly?

 

Building disposable objects – unmanaged resource handling

Besides overriding Finalize(), a class can implement the IDisposable interface, which defines a single method named Dispose():


 public interface IDisposable
  {
    void Dispose();
  }

It is used as follows: put the code that releases the unmanaged resource in the class's Dispose() method. The programmer can then call Dispose() manually as soon as the object is no longer needed, so that the unmanaged resource is released promptly.

So you can write classes like this:

//Database context class
    public class SqlDbContext:IDisposable
    {
      //... (other referenced object instances)

      //Unmanaged resources contained in class (need to call dispose() function to release resources)
      SqlConnection sqlConnection = new SqlConnection("..."); 

      public void Dispose()
      {
        //Clear unmanaged resources here
        this.sqlConnection.Dispose();
      }
    }

Classes that release unmanaged resources in this way are called disposable objects.

In addition, C# provides syntactic sugar, the using statement, to simplify calling Dispose(). Without it, you would write:

SqlDbContext context = new SqlDbContext();

  try
  {
    //Use sqldbcontext class object context in this scope
  }
  finally
  {
    // ensure that Dispose() is invoked after use
    context.Dispose();
  }

The above code is equivalent to the following code:

using (SqlDbContext context = new SqlDbContext())
  {
    //Use sqldbcontext class object context in this scope
  }
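The practical benefit of the using form is that Dispose() still runs when the body throws. The sketch below checks this with a hypothetical TrackedResource class that only records whether it was disposed. UsingDemo and the simulated exception are likewise made up for the illustration.

```csharp
using System;

// Hypothetical resource that only remembers whether Dispose() was called.
public sealed class TrackedResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

public static class UsingDemo
{
    // Returns true if Dispose() ran even though the using body threw.
    public static bool DisposeRunsOnException()
    {
        var resource = new TrackedResource();
        try
        {
            using (resource)
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // By the time control reaches here, the using statement's
            // implicit finally block has already called Dispose().
        }
        return resource.IsDisposed;
    }
}
```

UsingDemo.DisposeRunsOnException() returns true, which is the same guarantee the hand-written try/finally gives.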

The C++ programmer said: “But you still have to call it yourself. If I forget to call Dispose(), it's all over!”

The C# programmer smirked: “No, no. Let me teach you the final move!”

The strongest approach to handling unmanaged resources

Nobody is a sage, and nobody is free of mistakes. Programmers slip up too, for example by forgetting to call Dispose().

So we need a foolproof design with one goal: whether or not Dispose() is called manually, the unmanaged resources are properly released in the end. To achieve this, we can define a disposable class as follows:

//Database context class
    public class SqlDbContext:IDisposable
    {
      //... (other referenced object instances)

      //Unmanaged resources contained in class (need to call dispose() function to release resources)
      SqlConnection sqlConnection = new SqlConnection("..."); 

      ~SqlDbContext()
      {
        //Clear unmanaged resources here
        this.sqlConnection.Dispose();
      }

      public void Dispose()
      {
        //Clear unmanaged resources here
        this.sqlConnection.Dispose();

        //Skip the end process
        GC.SuppressFinalize(this);
      }
    }

As you can see, this class both overrides the finalizer and implements Dispose(). If the programmer forgets to release the unmanaged resource by calling Dispose(), the finalizer is called during garbage collection and releases it. If the programmer does call Dispose(), GC.SuppressFinalize(this) ensures that the object's finalizer is not called during garbage collection, avoiding unnecessary overhead. With the two techniques combined, every case is covered.
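A common refinement of this combination routes both paths through a single Dispose(bool) method, so the cleanup code is written once and calling Dispose() twice is harmless. The sketch below is one way to write it, not the only one. NativeBuffer is a hypothetical class, and it holds a block of memory from Marshal.AllocHGlobal as a stand-in for a genuinely unmanaged resource. The conventional guidance is to touch other managed objects only when disposing is true, because on the finalizer path they may already have been finalized.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
public class NativeBuffer : IDisposable
{
    private IntPtr handle;  // unmanaged memory, invisible to the GC
    private bool disposed;

    public NativeBuffer(int bytes)
    {
        handle = Marshal.AllocHGlobal(bytes);
    }

    public bool IsDisposed => disposed;

    // Explicit path: release now and skip finalization.
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // Safety net: runs only if Dispose() was never called.
    ~NativeBuffer()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return; // makes repeated calls harmless

        if (disposing)
        {
            // Called from Dispose(): managed IDisposable members,
            // if the class had any, would be disposed here.
        }

        // Unmanaged cleanup happens on both paths.
        Marshal.FreeHGlobal(handle);
        handle = IntPtr.Zero;
        disposed = true;
    }
}
```

A caller would normally wrap the object in a using statement. Calling Dispose() a second time simply returns early.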
