Scope, link attributes, and storage types

Time:2021-7-26

Recently I read The Self-Cultivation of Programmers: Linking, Loading and Libraries. It made me realize that when I first learned C, I did not have a clear understanding of keywords such as extern and static. So I reviewed the parts of Pointers on C that cover scope, link attributes, and storage types, added my own understanding, and recorded the result on my blog.


Scope

When a variable is declared in some part of a program, it can be accessed only within a certain region of that program.

This area is determined by the scope of the identifier, which is the area in the program where the identifier can be used.

As far as I know from compiler theory, whether identifiers are used in accordance with the scope rules is checked during the semantic analysis phase of compilation.

The compiler recognizes four different kinds of scope: file scope, function scope, code block scope, and prototype scope. The location of an identifier's declaration determines its scope.

  • Code block scope

    All statements between a pair of curly braces are called a code block. Any identifier declared at the beginning of a code block has code block scope, which means it can be accessed by all statements in that code block.

    In the following code, a, b, c, d, and arg all have code block scope.

    /* main.c */
    #include <stdio.h>

    int g;

    int func(int x);

    int main(int argc, char* argv[]) {
        int a;
        int b;
        a = 5;
        {
            int c;
            int a;  // Hides the outer a; the outer a cannot be accessed by name inside this block.
            c = 5;
            a = 10;
            printf("%d", c + a);  // Prints 15 (inner a is 10), not 10 (outer a is 5)
        }
        {
            int d = 0;  // Initialized so that func does not receive an indeterminate value
            func(d);
        }
        return 0;
    }

    int func(int arg) {
        return arg;  // Placeholder body
    }

    Note: avoid reusing a variable name in nested code blocks. There is no good reason to use this technique, and it only causes confusion during debugging and maintenance.

  • File scope

    Any identifier declared outside all code blocks has file scope, which means it is accessible from the point where it is declared to the end of the source file in which it is declared. g, func, and main all have file scope. This is why we declare func separately before main: the declaration makes func visible at the point where main calls it.

  • Prototype scope

    Prototype scope applies only to parameter names declared in function prototypes, such as x in the declaration of func.

  • Function scope

    Function scope applies only to statement labels, which are used by goto statements.

The last two scopes rarely matter in practice, so we should focus on the first two.
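
Although they are rare, the two less common scopes can be shown in a few lines. The following is a minimal sketch, not taken from the original example: the function sum_to, the parameter name count, and the label done are hypothetical names chosen for illustration.

```c
#include <assert.h>

/* Prototype scope: the name "count" exists only inside this declaration,
   so it does not have to match the name used in the definition below. */
int sum_to(int count);

/* Function scope: the label "done" is visible in the whole function body,
   including nested code blocks and lines that come before the label. */
int sum_to(int n) {
    int total = 0;
    int i = 1;
    {
        while (1) {
            if (i > n)
                goto done;   /* jumps forward to a label defined later */
            total += i;
            i++;
        }
    }
done:
    return total;
}

/* Small self-check: sums 1..4 and an empty range. */
void demo_scopes(void) {
    assert(sum_to(4) == 10);
    assert(sum_to(0) == 0);
}
```

Calling demo_scopes() from main runs both checks. Because labels have function scope, a label name can be used only once per function, no matter how deeply the code blocks are nested.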

Link attribute

The link attribute (linkage) of an identifier determines how identifiers that appear in different source files are handled. The scope of an identifier is related to its link attribute, but the two are not the same thing. There are three link attributes: external, internal, and none.

/* main.c */
#include <stdio.h>

int g;

int func(int x);

int main(int argc, char* argv[]) {
    int a;
    int b;
    a = 5;
    {
        int c;
        int a;  // Hides the outer a; the outer a cannot be accessed by name inside this block.
        c = 5;
        a = 10;
        printf("%d", c + a);  // Prints 15 (inner a is 10), not 10 (outer a is 5)
    }
    {
        int d = 0;  // Initialized so that func does not receive an indeterminate value
        func(d);
    }
    return 0;
}

int func(int arg) {
    return arg;  // Placeholder body
}

  • external

    An identifier with the external link attribute refers to the same entity no matter how many times it is declared or in how many source files it appears.

    By default, variables and functions declared outside any code block (that is, with file scope) have the external link attribute, and all other identifiers have none. In the code above, g, func, and main have the external link attribute, and the other variables have none.

    The extern keyword:

    • The extern keyword specifies the external link attribute for an identifier.
    • The extern keyword is optional for identifiers with file scope, because their link attribute is already external.
    • When extern is used on the first declaration of an identifier in a source file, it specifies that the identifier has the external link attribute. If it is used on the second or a later declaration of the identifier, it does not change the link attribute specified by the first declaration.

  • internal

    All declarations of an identifier with the internal link attribute within the same source file refer to the same entity, but declarations in different source files refer to different entities.

    If a declaration would normally have the external link attribute, adding the static keyword in front of it changes its link attribute to internal. For example, if g is declared as static int g;, the variable g becomes private to this source file. If other source files declare a variable named g, they refer to a different variable. Similarly, a function declaration can also be static, such as static int func(int x);

    static changes the link attribute only for declarations whose default link attribute is external.

  • none

    An identifier with no link attribute (none) is always treated as a separate entity; that is, multiple declarations of the identifier refer to distinct entities.
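
The three link attributes can be seen side by side in one source file. This is only a sketch under assumptions: shared_total, private_calls, add_once, and add_twice are hypothetical names, and a single file cannot show the cross-file effect itself, so the comments state what another source file would or would not be able to link against.

```c
#include <assert.h>

int shared_total = 0;           /* file scope: external link attribute by default */
static int private_calls = 0;   /* static changes external to internal */
extern int private_calls;       /* a later extern does not change it: still internal */

/* internal: only code in this source file can call add_once */
static int add_once(int amount) {
    int before = shared_total;  /* code block scope: link attribute none */
    private_calls++;
    shared_total = before + amount;
    return shared_total;
}

/* external: another source file could declare "int add_twice(int);" and call it */
int add_twice(int amount) {
    add_once(amount);
    return add_once(amount);
}

/* Small self-check of the values after one call. */
void demo_linkage(void) {
    assert(add_twice(5) == 10);
    assert(private_calls == 2);
}
```

If a second source file defined its own static int private_calls, the two variables would be unrelated, whereas a second definition of shared_total in another file would conflict at link time.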

Storage type

The storage type (storage class) of a variable refers to the kind of memory in which its value is stored. The storage type determines when a variable is created, when it is destroyed, and how long its value is kept. There are three places to store variables:

  • Ordinary memory

    All variables declared outside any code block (with file scope) are always stored in static memory, whether their link attribute is external or internal. Such variables are called static variables and are placed in the .data section or .bss section of the binary file.

    Static variables are created before the program starts running and exist for the entire execution of the program. A static variable keeps its value until it is assigned a different one or the program ends.

  • Runtime stack

    The default storage type of variables declared inside a code block is automatic. That is, they are stored on the stack and are called automatic variables.

    Adding the keyword static to such a variable changes its storage type from automatic to static (it is then placed in the .data section or .bss section). A variable with the static storage type exists for the entire execution of the program, not just while the code block that declares it is executing. Note that changing the storage type of a variable does not change its scope. Although the variable always exists, it can be accessed by name only inside the code block that declares it.

  • Hardware register

    The register keyword can be used in the declaration of an automatic variable to hint that it should be stored in a hardware register of the machine rather than in memory. Such variables are called register variables. However, the compiler is free to ignore the register keyword: adding register to a variable does not guarantee that it ends up in a hardware register. It still depends on the “mood” of the compiler, that is, on its optimization strategy.

    Note: you cannot take the address of a register variable.
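
The three storage types can be combined in one small function. This is a minimal sketch; next_ticket and its variable names are hypothetical and not part of the original example.

```c
#include <assert.h>

int next_ticket(void) {
    static int issued = 0;   /* static storage: initialized once, keeps its value between calls */
    int step = 1;            /* automatic storage: created anew on every call */
    register int result;     /* only a hint to the compiler; taking &result is not allowed */

    issued += step;
    result = issued;
    return result;
}

/* Small self-check: the counter survives between calls. */
void demo_storage(void) {
    assert(next_ticket() == 1);
    assert(next_ticket() == 2);
    assert(next_ticket() == 3);
    /* "issued" still exists here, but its name is not in scope outside next_ticket. */
}
```

The counter keeps growing across calls because issued has the static storage type, yet no code outside next_ticket can name it, which shows that storage type and scope are independent.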

On initialization

Initializing a static variable costs no extra time at run time; the variable simply starts with the correct value. If no initial value is specified explicitly, a static variable is initialized to 0. This is because static variables live in the .data section or .bss section: an explicit initial value is written into the object file by the compiler, and the .bss section is typically zero-filled when the program is loaded, so no initialization code has to run.

Initializing an automatic variable costs more, because the storage location of an automatic variable cannot be determined when the program is linked. In fact, the local variables of a function may occupy different positions on each call. For this reason an automatic variable has no default initial value, and an explicit initialization inserts an implicit assignment statement at the beginning of the code block. I think “implicit” here means that an assignment instruction is inserted into the code segment, such as mov [ebp-4], value. In this case, initializing at the declaration is no more efficient than declaring first and assigning later; the difference is only one of style.
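
The difference between the two kinds of initialization can be observed directly. The sketch below uses hypothetical names (file_level, table, last_sum_after); it relies only on the rules described above: static variables start at 0 when no initializer is given, and the initializer of an automatic variable runs every time its code block is entered.

```c
#include <assert.h>

int file_level;          /* static storage, no initializer: starts at 0 */
static int table[4];     /* every element starts at 0 as well */

/* Returns fresh + kept as computed in the last of "rounds" iterations. */
int last_sum_after(int rounds) {
    int sum = 0;
    int i;
    for (i = 0; i < rounds; i++) {
        int fresh = 10;        /* assignment executed on every entry to the block */
        static int kept = 10;  /* initialized once, before the program starts running */
        fresh++;
        kept++;
        sum = fresh + kept;
    }
    return sum;
}

/* Small self-check of the default values and of the loop behavior. */
void demo_initialization(void) {
    assert(file_level == 0);
    assert(table[0] == 0 && table[3] == 0);
    /* fresh is 11 in every round; kept grows to 11, 12, 13 */
    assert(last_sum_after(3) == 24);
}
```

An automatic variable without an initializer, such as int x; inside a function, would hold an indeterminate value, so reading it before assigning to it is a bug.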

Static and extern

  • When used in different contexts, the static keyword has different meanings.

    • When used for variables or functions with file scope, the static keyword changes their link attribute from external to internal, but the scope and storage type of the identifier are not affected. A function is still placed in the .text section, and a global variable is placed in the .data section or .bss section depending on whether it is initialized.
    • When used for variables with code block scope, whose link attribute is none, static does not change the link attribute. Instead, it changes the storage type of the variable from automatic to static, and the scope is not affected.
  • The extern keyword

    • When used for variables or functions with file scope, the extern keyword is optional because they already have the external link attribute. However, if you define a variable in one place and add the extern keyword to its declarations in the other source files that use it, readers can better understand your intention.
    • When used for a declaration with code block scope, the extern keyword changes the link attribute of the identifier from none to external, which gives us a way to refer to global variables inside deeply nested code blocks.
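
The last point can be illustrated with a global variable that is hidden by a local one. This is a sketch with a hypothetical function name, read_hidden_global; it assumes the file-scope g has the external link attribute, as in the earlier example.

```c
#include <assert.h>

int g = 42;                  /* file scope, external link attribute */

int read_hidden_global(void) {
    int g = 1;               /* hides the file-scope g; link attribute none */
    int result = g;          /* uses the local g */
    {
        extern int g;        /* external link attribute: the file-scope g again */
        result += g;         /* uses the global g */
    }
    return result;
}

/* Small self-check: local 1 plus global 42. */
void demo_extern_in_block(void) {
    assert(read_hidden_global() == 43);
}
```

Inside the inner block the name g refers to the global variable again, even though the enclosing block declares its own g. As noted earlier, reusing names like this is confusing, so it is better kept as a demonstration than as a habit.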