Channel: Bartek's coding blog

Quick case: Char Pointer vs Char Array in C++


When you write:

char strA[] = "Hexlo World!";
strA[2] = 'l';

Everything works as expected. But what about:

char *strP = "Hexlo World!";
strP[2] = 'l';

Do you think it will work correctly? If you are not sure, you might be interested in the rest of the article.

In Visual Studio 2013 I got this message:

vs error

Definitely not nice! And probably some stupid mistake :)

What's the problem?

The first example shows simple array initialization. We can read from and write to the array strA. We can even print its size:

cout << "sizeof(strA) = "<< sizeof(strA) << endl;

And guess what? The output is, of course, 13: 12 characters plus the terminating null character.

Our second case looks almost the same. There is, however, a subtle but important difference.

cout << "sizeof(strP) = " << sizeof(strP) << endl;
strP[2] = 'l'; // << crash

This prints the size of the pointer (typically 4 or 8 bytes), not the length of the string. The problem is that this pointer points to memory that is usually read-only! When we try to modify the string (strP[2] = 'l';), the behavior is undefined, and in practice we get a runtime error.
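A minimal sketch of the difference, using the same "Hexlo World!" string. The function names arraySize, pointerSize and fixedCopy are made up for this illustration; the pointer is declared const here so that the code never writes through it.

```cpp
#include <cstddef>
#include <string>

// sizeof on an array gives the whole array:
// 12 visible characters plus the terminating '\0'.
std::size_t arraySize()
{
    char strA[] = "Hexlo World!";
    return sizeof(strA);
}

// sizeof on a pointer gives the size of the pointer itself,
// which says nothing about the length of the string.
std::size_t pointerSize()
{
    const char* strP = "Hexlo World!";
    return sizeof(strP);
}

// The array holds its own writable copy of the literal,
// so fixing the typo in place is fine.
std::string fixedCopy()
{
    char strA[] = "Hexlo World!";
    strA[2] = 'l';
    return strA; // converted to std::string, copying up to the '\0'
}
```

Calling arraySize() should give 13, pointerSize() the platform's pointer size, and fixedCopy() the corrected text.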

Let's see C++FAQ for some details:

A string literal (the formal term for a double-quoted string in C source) can be used in two slightly different ways:

1) As the initializer for an array of char, as in the declaration of char a[] , it specifies the initial values of the characters in that array (and, if necessary, its size).

2) Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and which therefore cannot necessarily be modified.

Our first case follows the first rule: it is array initialization. The second case follows the second rule: the pointer refers to an unnamed static array of characters.

It is up to the compiler to decide whether such a string goes into a read-only or a read-write section. Compilers such as GCC and Visual Studio usually place it in a read-only block. Any attempt to modify such a memory location is a bug.

Advice: do not use char *p = "..."! Use string literals only for const char * or array initialization. Also, remember std::string, which is usually more useful.
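One possible way to follow this advice is sketched below. The helpers greeting and fixTypo are hypothetical names invented for the example, not part of any library.

```cpp
#include <cstddef>
#include <string>

// A literal viewed through const char*: any accidental write such as
// greeting()[2] = 'l' is rejected at compile time instead of crashing.
const char* greeting()
{
    return "Hello World!";
}

// std::string owns a modifiable copy of the text, so editing it is safe.
// An out-of-range index leaves the copy unchanged.
std::string fixTypo(const char* text, std::size_t index, char replacement)
{
    std::string copy(text);
    if (index < copy.size())
        copy[index] = replacement;
    return copy;
}
```

For instance, fixTypo("Hexlo World!", 2, 'l') returns a std::string containing "Hello World!", while the literal itself is never touched.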

GCC Note: It seems that GCC does a better job when compiling such code; you will get the following warning:

deprecated conversion from string constant to 'char*' [-Wwrite-strings]

VC Note: in Visual Studio 2013 there is an option, "/Zc:strictStrings", to prevent such a conversion.

The .rdata/.rodata section

An executable file on Windows uses the PE (Portable Executable) format. On Linux we have ELF (Executable and Linkable Format).

The above binary formats (and others) have, to simplify, two basic sections: DATA and CODE:

Simplified EXE format
  • DATA - this section stores global and initialized variables. Here lies our read-only-data subsection:
    • rdata - PE
    • rodata - ELF
  • CODE - or TEXT section - stores compiled binary code. This section is also read-only.

In general when we write:

const int globalA = 10; // will be in .r(o)data

but:

int globalArray[100]; // writable, with no explicit initializer
// To clarify: it goes to the .bss section (zero-initialized data), not to .data itself.

String literals (assigned to pointers) also follow the 'global variables' rule. So it is important to treat them as read-only and not try to change them!
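The rules above can be put side by side in one small sketch. The section names in the comments describe typical placement only; the actual layout depends on the compiler, the linker and the optimization level. The names globalCounter, globalText and sumGlobalArray are made up for this example.

```cpp
#include <cstddef>

const int globalA = 10;               // read-only: typically .rdata (PE) or .rodata (ELF)
int globalCounter = 42;               // initialized and writable: typically .data
int globalArray[100];                 // no initializer: typically .bss, zero-filled at startup
const char* const globalText = "aaa"; // the literal itself typically lives in .rdata/.rodata

// Sums the array to show that an uninitialized global starts out as zeros.
int sumGlobalArray()
{
    int sum = 0;
    const std::size_t count = sizeof(globalArray) / sizeof(globalArray[0]);
    for (std::size_t i = 0; i < count; ++i)
        sum += globalArray[i];
    return sum;
}
```

On a platform with the usual binary tools, a symbol or section dump of the compiled file should show where each variable ended up.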

Wrap up

  1. char *s = "aaa" creates a pointer to a read-only chunk of memory. If you try to modify this data, you will get a runtime error!
    • Do not use such a construction in your code!
  2. char s[] = "aaa" creates a normal array and initializes it.
  3. An executable format (PE or ELF) consists of several sections. The two most notable are DATA and TEXT. DATA stores all global and initialized variables; TEXT holds the compiled code.


Actions

Did you have similar problems?

Any strange errors/bugs with read only sections?


