C++ “char* M” Better Understanding

A C++ charm explained

Our favorite programming language has, without a doubt, one of the most charming sets of rules and conventions in software development. Almost every C++ book opens with something like: “C++ is a fancy, challenging language used all around the world; however, getting things done with it can be tricky even when you know exactly what you want.” Maybe that is why the world still has so few good C++ programmers (I am struggling to become one of them). This post is about one of those C++ charms that usually takes a considerable amount of time to digest.

Why do I have to declare a string this way?

One of the main questions that I spent some time wondering about was: “Why do I have to declare a string this way?”

#include<iostream>

int main(int argc, char **argv)
{
    const char* string = "simple_array_of_char";
}

A const and a * in the same statement, is that right?! And where are the square brackets required for a C-style array? Those questions kept chasing me, but after some studying and a few dummy tests, I understood what this declaration really does and how it works behind the scenes.

C-Style Strings

Strictly speaking, in C++ terminology “string” is not quite the right word for C-style char arrays: they are simply arrays of characters that end with a null terminator. A null terminator is a special character ('\0', ASCII code 0) used to indicate the end of the string. More generally, a C-style string is called a null-terminated string.

C-style strings are the primary string type of the C language. Since C++ inherited most of C, it supports C-style strings too. Since C-style strings are arrays, they have a fixed maximum length. “Dynamic” strings are possible, but they require the programmer to manage memory manually, a process that can be painful and time-consuming to debug.

There are two ways to declare a C-style string: the charming/fancy one and the regular one. I prefer the charming one, obviously. The charming one looks like the first snippet presented above, while the regular one looks like this:

#include<iostream>

int main(int argc, char **argv)
{
    char string[] = "simple_array_of_char";
}

Now that we know there is more than one way to declare a C-style string, let's take a deeper look and decide which kind of declaration fits each application better.

C++ Memory Areas

To understand when and why to use each one, we need a few concepts about the memory areas available to a C++ program.

Above we can see the memory areas that a conventional computer shares with our applications. At this address you can find more information about them. The two main areas we have to care about are the stack and the heap. The stack manages all local variables, arguments passed to functions, and function return values, all organized on a Last In, First Out (LIFO) basis. On the other hand, every object created with the new keyword lives on the heap and is managed by the programmer, which makes it a painful source of problems: many situations can lead your application to undefined behavior and, above all, to memory leaks. A further post on this interesting topic will be published soon.

Thus, when you declare it like this:

#include<iostream>

int main(int argc, char **argv)
{
    const char* string = "simple_array_of_char";
}

we are assigning a string literal directly to a pointer. With most compilers the literal is stored in a different kind of memory area, a read-only one (generally in the data segment) that is shared among functions.

Thus, in the line above, “simple_array_of_char” is stored in a shared read-only location, but the pointer string itself is stored on the stack. You can make string point to something else, but you cannot change the characters it currently points to. So this kind of string should only be used when we do not want to modify it later in the program, which also gives the compiler more room for optimization. This kind of string is known as a string literal.

Now you might be wondering: if the literal is stored in a read-only memory area, why use the keyword const in this kind of declaration? Because if you try this:

#include<iostream>
int main(int argc, char **argv)
{
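    char *string = "simple_array_of_char"; // no const: ill-formed since C++11,
                                           // though many compilers only warn
    // The following line tries to change the character 'i' to 'o'
    *(string+1) = 'o'; // remember that the pointer name points to its first
                       // element; writing to a literal is undefined behavior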
    std::cout << string << std::endl; 
}

Depending on your compiler, the code above will probably build. However, modifying a string literal is undefined behavior, and at run time you will typically get one of the most famous C++ errors: a segmentation fault. With const in the declaration, you would get a compile-time error instead of a run-time error.

STL

Before describing the advantages and disadvantages of the STL string interface, let's highlight the main drawbacks of C-style strings.

C arrays do not provide an interface to keep track of their own size. The workaround is strlen, a linear-time function that returns the string length by scanning up to the '\0' boundary. As C has no concept of well-defined boundary protection, the null character is critical: the C library functions require it, or else they operate past the bounds of the array.
Depending on the programmer's level of maturity, working with C-style strings can be tricky, since the functions that handle them do not behave intuitively. I remember my first days as a programmer using those functions: it was a mess, since strcpy and strcat require passing the arguments in the correct order. Swapping the arguments can have a non-obvious yet harmful effect.
Another drawback is the fixed size, which holds for both literals and char[] arrays. For a dynamic approach, programmers must worry about manually allocating, resizing, and copying strings.

Now, let's pretend that you found a C++ genie and he granted you three wishes to be implemented in the next release. Considering the C-style background presented above, which features would you ask the genie to implement?

……

std::string came along to solve most of the C-style string issues cited above.

Considering the characteristics above, const char* is more suitable for strings that are not expected to change, while the regular form (char[]) allows changes throughout the code.

OUTLAWS

Matheus Nascimento's experiences and curiosities about robotics and some life aspects.
