Channel: Bartek's coding blog

What happens to your static variables at the start of the program?

Static Variables in C++

Saying that C++ has simple rules for variable initialization is probably quite risky :) For example, you can read Initialization in C++ is Bonkers : r/cpp to see a vibrant discussion about this topic.

But let’s look at just a small subset of variables: static variables.
How are they initialized? What happens before main()(*) ?

(*) Warning: this is implementation-dependent; see the explanations in the post.

Intro

Have a look at the following code where I use a global variable t (nice and descriptive name... right? :)) :

class Test
{
public:
    Test() {}
public:
    int _a;
};

Test t; // <<

int main()
{
    return t._a;
}

What is the value of t._a in main()?
Is the constructor of Test even called?

Let’s run the debugger!

Debugging

I’ll be using Visual Studio 2017 to run my apps. Although the initialization phase is implementation-dependent, runtime systems share a lot of ideas in order to conform to the standard.

I created a breakpoint at the start of Test::Test() and this is the call stack I got:

test_static.exe!Test::Test() Line 12
test_static.exe!`dynamic initializer for '_t''() Line 20
ucrtbased.dll!_initterm(void(*)() * first, void(*)() * last) Line 22
test_static.exe!__scrt_common_main_seh() Line 251
test_static.exe!__scrt_common_main() Line 326
test_static.exe!mainCRTStartup() Line 17

Wow… the runtime invokes a few functions before main() kicks in!

The debugger stopped in a place called dynamic initializer for '_t''(). What’s more, the member variable _a was already set to 0.

Let’s look at the steps:

Our global variable t is not constant-initialized, because according to the standard (see constant initialization @cppreference) such a variable should have the form:

static T & ref = constexpr;
static T object = constexpr;

So the following things happen:

For all other non-local static and thread-local variables, Zero initialization takes place.

And then:

After all static initialization is completed, dynamic initialization of non-local variables occurs…

In other words: the runtime initializes our variables to zero and then it invokes the dynamic part.
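That two-step order can be observed directly. Here is a minimal sketch, assuming a hypothetical Probe type and helper functions that are not part of the original example: the constructor records what it finds in a member before assigning to it. Because the object has static storage duration, the member should already hold zero by the time the dynamic initializer runs.

```cpp
#include <cassert>

// Hypothetical type used only for illustration.
struct Probe
{
    int value;       // no initializer: zero-initialized because g_probe is static
    int seenInCtor;  // what the constructor observed in `value`

    Probe()
    {
        seenInCtor = value; // reads the zero-initialized member
        value = 42;         // the dynamic part then overwrites it
    }
};

Probe g_probe; // static storage duration, dynamic initialization

// Hypothetical helpers so the result can be checked from main().
int ValueSeenByConstructor() { return g_probe.seenInCtor; }
int ValueAfterConstructor() { return g_probe.value; }
```

Calling ValueSeenByConstructor() from main() should give 0 and ValueAfterConstructor() should give 42. The same constructor run on a local (automatic) Probe would read an indeterminate value, so the trick only works for objects with static storage duration.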

Zero initialization

I’ve found this short and concise summary of Zero Initialization @MSDN:

  • Numeric variables are initialized to 0 (or 0.0, or 0.0000000000, etc.).
  • Char variables are initialized to ‘\0’.
  • Pointers are initialized to nullptr.
  • Arrays, POD classes, structs, and unions have their members initialized to a zero value.

Our object t is a class instance, so the compiler will initialize its members to zero.

What’s more, global variables might be put into the BSS segment of the program, which means that they don’t take any space on disk. The whole BSS segment is represented only by its length (the sum of the sizes of all variables placed there). At startup the section is cleared (something like memset(bssStart, 0, bssLen)).

For example, looking at the asm output from my code, it looks like MSVC put the variable t in _BSS:

_BSS    SEGMENT
?t@@3VTest@@A DD 01H DUP (?) ; t
_BSS ENDS
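The clearing step can be sketched in a few lines. This is only an illustration, not loader code: ZeroRegion, ClearRegion and ClearedBufferIsAllZero are hypothetical names, and an ordinary stack buffer stands in for the real segment. The point is that a start address and a length are all that is needed, and that memset takes its arguments in the order (destination, value, count).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical description of a BSS-like region: only a start and a length.
struct ZeroRegion
{
    unsigned char* start;
    std::size_t length;
};

// Clears the region. memset takes (destination, value, count).
void ClearRegion(const ZeroRegion& region)
{
    std::memset(region.start, 0, region.length);
}

// Fills a buffer with garbage, clears it, and reports whether every byte is zero.
bool ClearedBufferIsAllZero()
{
    unsigned char buffer[64];
    std::memset(buffer, 0xFF, sizeof(buffer));

    ClearRegion(ZeroRegion{ buffer, sizeof(buffer) });

    for (std::size_t i = 0; i < sizeof(buffer); ++i)
    {
        if (buffer[i] != 0)
            return false;
    }
    return true;
}
```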

You can read more @cppreference - zero initialization

Dynamic initialization

From the standard 6.6.2 Static initialization “basic.start.static”, N4659, Draft

Together, zero-initialization and constant initialization are called static initialization; all other initialization is dynamic initialization.

In MSVC, each dynamic initializer is placed in an array of function pointers:

// internal_shared.h
typedef void (__cdecl* _PVFV)(void);
// First C++ Initializer
extern _CRTALLOC(".CRT$XCA") _PVFV __xc_a[];
// Last C++ Initializer
extern _CRTALLOC(".CRT$XCZ") _PVFV __xc_z[];

And later, a function called _initterm invokes those functions:

_initterm(__xc_a, __xc_z);

_initterm just calls every function, assuming it’s not null:

extern "C" void __cdecl _initterm(_PVFV* const first,
                                  _PVFV* const last)
{
    for (_PVFV* it = first; it != last; ++it)
    {
        if (*it == nullptr)
            continue;

        (**it)();
    }
}

If any of the initializers throws an exception, std::terminate() is called.

Dynamic initializer for t will call its constructor. This is exactly what I’ve seen in the debugger.
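The same mechanism can be imitated in portable C++. The sketch below is not the CRT code: InitFn, g_table, InitLog, RunInitializers and RunTable are hypothetical names, and the table is filled by hand instead of by the linker. It only shows the idea of walking a range of function pointers and skipping null entries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same shape as _PVFV: pointer to a function taking and returning nothing.
using InitFn = void (*)();

// Hypothetical log that records the order in which initializers ran.
std::vector<int>& InitLog()
{
    static std::vector<int> log;
    return log;
}

void InitFirst()  { InitLog().push_back(1); }
void InitSecond() { InitLog().push_back(2); }

// Hand-written table; the nullptr entry is there to exercise the null check.
InitFn g_table[] = { InitFirst, nullptr, InitSecond };

// Walks [first, last) and calls every non-null entry, in order.
void RunInitializers(InitFn* first, InitFn* last)
{
    for (InitFn* it = first; it != last; ++it)
    {
        if (*it == nullptr)
            continue;

        (**it)();
    }
}

// Runs the whole hand-written table once.
void RunTable()
{
    const std::size_t count = sizeof(g_table) / sizeof(g_table[0]);
    RunInitializers(g_table, g_table + count);
}
```

After a single call to RunTable(), the log should contain 1 followed by 2, which mirrors how the initializers between __xc_a and __xc_z are called one after another.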

On Linux

According to Linux x86 Program Start Up and Global Constructors and Destructors in C++:

There’s a function __do_global_ctors_aux that calls all “constructors” (it’s for C, but should be similar for C++ apps). This function calls the constructors that are listed in the .ctors section of the ELF image.

As I mentioned, the details differ from MSVC, but the idea of function pointers to constructors is the same. At some point before main() the runtime must call those constructors.

Implementation Dependent

Although non-local variables are usually initialized before main() starts, it's not guaranteed by the standard. So if your code works on one platform, it doesn't mean it will work with some other compiler, or even another version of the same compiler...

From: C++ draft: basic.start.dynamic#4:

It is implementation-defined whether the dynamic initialization of a non-local non-inline variable with static storage duration is sequenced before the first statement of main or is deferred. If it is deferred, it strongly happens before any non-initialization odr-use of any non-inline function or non-inline variable defined in the same translation unit as the variable to be initialized.

Storage and Linkage

So far I’ve used one global variable, but it wasn’t even marked as static. So what is a ‘static’ variable?

Colloquially, a static variable is a variable whose lifetime is the entire run of the program. Such a variable is initialized before main() and destroyed after it.

In the C++ Standard 6.7.1 Static storage duration “basic.stc.static”, N4659, Draft:

All variables which do not have dynamic storage duration, do not have thread storage duration, and are not local have static storage duration. The storage for these entities shall last for the duration of the program

As you see, for non-local variables, you don’t have to apply the static keyword to end up with a static variable.

We have a few options when declaring a static variable. We can distinguish them by storage duration and linkage:

  • Storage:
    • automatic - Default for variables in a scope.
    • static - The lifetime is bound with the program.
    • thread - The object is allocated when the thread begins and deallocated when the thread ends.
    • dynamic - Per request, using dynamic memory allocation functions.
  • Linkage:
    • no linkage - The name can be referred to only from the scope it is in.
    • external - The name can be referred to from the scopes in the other translation units (or even from other languages).
    • internal - The name can be referred to from all scopes in the current translation unit.

By default, if I write int i; outside of main() (or any other function) this will be a variable with a static storage duration and external linkage.

Here’s a short summary:

int i; // static storage, external linkage
static int t; // static storage, internal linkage
namespace {
    int j; // static storage, internal linkage
}
const int ci = 100; // static storage, internal linkage

int main()
{

}

Although we usually think of static variables as globals, that’s not always the case. By using namespaces or putting statics in a class, you can effectively hide them and expose them only where required.

Static variables in a class

You can apply static to a data member of a class:

class MyClass
{
public:
    ...
private:
    static int s_Important;
};

// later in cpp file:
int MyClass::s_Important = 0;

s_Important has static storage duration, and it is a single value shared by all objects of the class. It has external linkage, assuming the class also has external linkage.
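To see the "one value for all objects" behavior, here is a small sketch with a hypothetical Counter class (the names Counter, s_count, Alive and AliveInsideScope are made up for illustration). Every constructor and destructor touches the same static member, so the member ends up counting live instances.

```cpp
#include <cassert>

// Hypothetical class: counts how many instances are currently alive.
class Counter
{
public:
    Counter() { ++s_count; }
    ~Counter() { --s_count; }

    static int Alive() { return s_count; }

private:
    static int s_count; // declaration only
};

// Definition, which would normally live in exactly one cpp file.
int Counter::s_count = 0;

// Creates two objects and reports the shared count while both are alive.
int AliveInsideScope()
{
    Counter a;
    Counter b;
    return Counter::Alive();
}
```

Counter::Alive() should return 0 before and after the call, and AliveInsideScope() should return 2, since both objects update the same s_count.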

Before C++17, each static class data member had to be defined in some cpp file (apart from static const integers…). Now you can use inline variables:

class MyClass
{
public:
    ...
private:
    // declare and define in one place!
    // since C++17
    inline static int s_Important = 0;
};

As I mentioned earlier, with classes (or namespaces) you can hide static variables, so they are not “globals”.

Static variables in functions

There’s also another special case that we should cover: statics in a function/scope:

void Foo()
{
    static bool bEnable = true;
    if (bEnable)
    {
        // ...
    }
}

From cppreference: storage duration

Static variables declared at block scope are initialized the first time control passes through their declaration (unless their initialization is zero- or constant-initialization, which can be performed before the block is first entered). On all further calls, the declaration is skipped.

For example, I sometimes like to use a static bEnable variable in my debugging sessions (not in production!). Since the variable is shared across all invocations of the function, I can flip it between true and false while debugging. That way the variable enables or disables some block of code, say a new implementation vs the old one, and I can easily observe the effects without recompiling the code.
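The first-call initialization rule quoted above can be checked with a short sketch. The names g_initCalls, ComputeStart and NextId are hypothetical; the initializer is a function call with a side effect, so it is not a constant initializer and it should run only when control first passes through the declaration.

```cpp
#include <cassert>

// Hypothetical counter that records how often the initializer ran.
int g_initCalls = 0;

int ComputeStart()
{
    ++g_initCalls;
    return 10;
}

int NextId()
{
    // Initialized the first time control passes through this declaration;
    // on later calls the declaration is skipped and the value persists.
    static int next = ComputeStart();
    return next++;
}
```

Before the first call g_initCalls should still be 0. Three calls to NextId() should return 10, 11 and 12, and g_initCalls should stay at 1 afterwards.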

Wrap up

Although globals/statics sound easy, I found it very hard to prepare this post: storage, linkage, various conditions and rules.
I was happy to see the code behind the initialization, so it’s clearer how it’s all done.

A few points to remember:

  • A static variable’s lifetime is bound to the program’s lifetime. It’s usually created before main() and destroyed after it.
  • A static variable might be visible internally (internal linkage) or externally (external linkage).
  • At the start, static variables are zero-initialized, and then dynamic initialization happens.
  • Still... be careful, as Static initializers will murder your family :)

Ah… wait… but what about initialization and destruction order of such variables?
Let’s leave this topic for another time :)
For now, you can read about static in static libraries: Static Variables Initialization in a Static Library, Example.

