Pointers and Arrays
In the first post of this series, we talked a little about pointers. In the second, we talked about references. Today we discuss the intimate relationship (ooh!) between pointers and arrays (also known as matrices).
Recalling Arrays
An array, or matrix, is a mathematical abstraction used to represent a set of homogeneous data, i.e. data of the same type (int, float, etc.). In two dimensions this abstraction is organized as a table, with rows and columns. Each element in the matrix has unique coordinates (row and column), so that a given element E(i, j) is the only element at "row i, column j".
The syntax for declaring arrays is as follows:
    // One-dimensional arrays, or vectors
    int ivet[10];        // 10 elements, from 0 to 9
    char cvet[23];       // 23 elements, from 0 to 22

    // Two-dimensional arrays
    int imat[2][3];      // 2 rows (0 to 1) and 3 columns (0 to 2)
    double dmat[10][2];  // 10 rows (0 to 9) and 2 columns (0 to 1)
Each array element is independent of the others and can be accessed with the following syntax:

    int ivet[10];
    int imat[3][4];

    // Changing the fourth element of the vector ivet.
    // Remember that counting starts at element zero.
    ivet[3] = 13;

    // Reading the second element of the vector ivet
    int num = ivet[1];

    // Changing the element in the first row, second column of imat
    imat[0][1] = 42;

    // Reading the element in the third row, fourth column of imat
    int foo = imat[2][3];
In this post we will not go into the theory behind matrices; we will only investigate the relationship between arrays and pointers, in a fairly intuitive way.
Sizes of Arrays
Since an array is an abstraction that contains multiple values of the same type, how big is it? How much space does it occupy in memory?
Consider the code below:
    int ivet[10];
    char cvet[13];
    double dvet[20];
    char cmat[3][4];
    int imat[5][4];

    cout << "sizeof(int) = " << sizeof(int) << endl;
    cout << "sizeof(char) = " << sizeof(char) << endl;
    cout << "sizeof(double) = " << sizeof(double) << endl;

    cout << "sizeof(ivet) = " << sizeof(ivet) << endl;
    cout << "sizeof(cvet) = " << sizeof(cvet) << endl;
    cout << "sizeof(dvet) = " << sizeof(dvet) << endl;
    cout << "sizeof(cmat) = " << sizeof(cmat) << endl;
    cout << "sizeof(imat) = " << sizeof(imat) << endl;
The result is quite reasonable. The space occupied by an array equals the number of elements multiplied by the size of the element type (rows x columns x sizeof(type)). However, if each element is independent, each one must occupy a separate place in memory, otherwise one element would overwrite another. So, does each element have its own memory address?
Arrays and memory addresses
For convenience, we will analyze the addresses within an array of characters, since each element is only one byte in size:
    const int max = 5;
    char cvet[max] = {'A', 'B', 'C', 'D', 'E'};

    // Display the indices, values and addresses of the elements.
    printf("Index\tValue\tElement address\n");
    for (int i = 0; i < max; i++) {
        printf("%d\t%c\t%p\n", i, cvet[i], &cvet[i]);
    }

    // Display the address of the array
    printf("Address of array: %p\n", &cvet);

    // Display the address of the array again
    printf("Address of array: %p\n", cvet);
The addresses of the elements are sequential, i.e. every element is stored right next to the previous one. There are two more interesting facts:
- The address of the array (&cvet), printed by the first "Address of array" call, is the same as the address of the first element of the array;
- The variable cvet itself can be interpreted as a pointer, as the second "Address of array" call shows;
In C++, an ordinary array is a contiguous block of memory whose name can be implicitly converted to a pointer to its first element. It is also valid to make a pointer point to an array, provided the type the pointer targets is the same as the type of the array elements. When an array is assigned to a pointer, the compiler performs an implicit type conversion, and the destination pointer is then treated as a pointer to the memory area occupied by the array.
One not-so-obvious consequence is that the implicit conversion discards the information that the memory area was an array, and with it the size of the array. From the pointer's point of view, it points to the beginning of an arbitrary block of memory, of equally arbitrary size. Going from an array to a pointer means going from a more restrictive, higher-level abstraction to a less restrictive, lower-level one.
On the other hand, trying to assign a pointer to an array generates a compile error for incompatible types. An array is a memory block of n elements, whereas a pointer holds a single datum: an address. The compiler cannot know beforehand whether a pointer points to an area of 1, 2 or 200 bytes.
    const int max = 300;
    char cvet[max];
    char *pc = 0;

    printf("\nBefore the assignment\n");
    printf("cvet = %p\n", cvet);
    printf("pc = %p\n", pc);
    // %zu is the portable format specifier for the size_t that sizeof yields
    printf("sizeof(cvet) = %zu\n", sizeof(cvet));
    printf("sizeof(pc) = %zu\n", sizeof(pc));

    pc = cvet;

    printf("\nAfter the assignment\n");
    printf("cvet = %p\n", cvet);
    printf("pc = %p\n", pc);
    printf("sizeof(cvet) = %zu\n", sizeof(cvet));
    printf("sizeof(pc) = %zu\n", sizeof(pc));
Note that before the assignment (pc = cvet) the pointer pc is null, since it was initialized that way. The sizes show that the array occupies 300 bytes and the pointer only 8 (my machine is an amd64). After the assignment both "point" to the same area of memory, but the sizes do not change. There was an implicit conversion from char[300] to char *, and in the process the pointer pc has no way of knowing the size of the memory area it points to. The array cvet, however, still knows exactly what it is, with no existential crisis.
Pointer arithmetic - a.k.a. "so what?"
Well, if I know that the data in an array is laid out side by side, I can use a pointer that jumps to the next address in order to access the next element. This is called pointer arithmetic.
    const int max = 6;
    char cvet[max] = {'B', 'L', 'A', 'B', 'O', 'S'};

    char *pc = cvet;

    // First loop: base address plus index
    for (int i = 0; i < max; i++) {
        printf("%c", *(pc + i));
    }
    printf("\n");

    // Second loop: dereference, then advance the pointer
    for (int i = 0; i < max; i++) {
        printf("%c", *pc++);
    }
    printf("\n");

    // Now with integers
    int ivet[max] = {1, 2, 3, 4, 5, 6};
    int *pi = ivet;

    for (int i = 0; i < max; i++) {
        // Print first, then advance: reading pi and incrementing it in the
        // same argument list has no guaranteed order of evaluation.
        printf("%p = %d\n", pi, *pi);
        pi++;
    }
    printf("\n");

    // In two dimensions
    char cmat[2][3] = {{'B', 'L', 'A'}, {'B', 'O', 'S'}};
    char *ppc;

    ppc = (char *)cmat;

    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 3; j++) {
            printf("%c", *(ppc + 3 * i + j));
        }
        printf("\n");
    }
    printf("\n");
In the declaration char *pc = cvet; the pointer pc is made to point to the array cvet, and consequently to its first element, the character 'B'. In the first loop, the expression *(pc + i) takes the content of pc, which is the address where the character 'B' is stored, adds i to it and then dereferences the result. On the first pass the value of i is zero, so the dereference yields 'B'. On the following passes, the subsequent addresses are dereferenced, yielding the other characters stored in the original array. This is more or less what the compiler does internally when you use the syntax cvet[i]. The array abstraction gives you a friendlier way of dealing with contiguous areas of memory than *(pc + i).
But if the array abstraction is simpler to use, why bother with pointer arithmetic?
One answer is the second loop, the one that uses *pc++. It does the same thing as the first loop, but a little faster. With the *(pc + i) syntax or, equivalently, the array syntax, access to any given element can be roughly summed up in these commands:
- Take the base address of the array;
- Add the value of the index to the address;
- Dereference the new address;
With pointer arithmetic, on the other hand, it looks like this:
- Dereference the current address;
The increment command (or the arithmetic done on addresses) does not count because it is part of the loop; besides, i++; is cheaper than a = b + c;. Now imagine this small gain of 66% applied to a data area of 1 MB. That is over 2 million fewer commands! Keep in mind that this counts conceptual steps: an optimizing compiler will often generate the same machine code for both forms.
The technique of using a pointer to walk an arbitrary area of memory is generally used in low-level programming (closer to the machine), in the manipulation of buffers and strings, among other dirty tricks. In the bowels of a computer, operations that sweep large areas of memory are often performed with pointer arithmetic. At this level Darwin reigns supreme and only the best prepared survive. From here on the language starts to hand you a power that only the pure of heart can understand.
An important note is that the experiment is repeated with integers in the block marked "Now with integers". Note that since integers are 4 bytes on my machine, the increments are automatically made 4 bytes at a time, not 1 at a time; that is, the increment is automatically scaled by sizeof(type). Incrementing a pointer means moving to the next object of the pointed-to type in memory, not just to the next address. Since the size of a char is one byte, when we increment a pointer to char we move only one byte. If we increment a pointer to double, we typically move 8 bytes, and so on.
Another observation is that a two-dimensional array can be "linearized", as shown in the last block of the listing ("In two dimensions"). This is useful, where applicable, to make better use of the processor cache, for example.
void *, the pansexual of pointers
Earlier I said that it was only possible to assign an array to a pointer whose target type matched the type of the array's data. I was blatantly lying! The reason is that, for someone who gave up on the post before reaching this topic, it is safer to believe that it cannot be done!
There are two exceptions to the rule. The first is when there is an explicit type conversion and the target pointer "thinks" it is pointing to the right type. An example is the assignment ppc = (char *)cmat; in the previous code.
The second is the case of pointers to void. A pointer to void is a pointer that makes no demands on the type of the data in the memory area it points to. It is a pointer to a generic area of memory, something very low level.
To use data pointed to by a pointer to void, we must make an explicit cast to a valid type before dereferencing it. After all, if an int * dereferences to an int and a char * dereferences to a char, guess what a void * would dereference to?
    const int max = 6;
    char cvet[max] = {'B', 'L', 'A', 'B', 'O', 'S'};

    void *pv = cvet;

    // Compile error.
    // *pv
    // How much is sizeof(void)?

    for (int i = 0; i < max; i++) {
        printf("%c", *(((char *)pv) + i));
    }
    printf("\n");
Pointers to void are used when one needs to point to a general area of memory without control over, or knowledge of, the type of data that area contains, or in functions that cannot make assumptions about the types of their parameters, such as those in the pthreads library API.
Closing
The more we delve into pointers, the closer we get to the machine. Much of the power of C and C++ derives from this, and a lot of the problems too. The complexity keeps increasing, and so do the risks. For many, therein lies the fun!
Links
- arrays.zip (all sources of the post);
- Pointers on cplusplus.com
- Pointers in blabos.org
- References in blabos.org
- The Guilty


