Pointers and Arrays

May 18, 2009 · Posted in C / C++

In the first post in this series, we talked a little about pointers. In the second, we talked about references. Today we will explore the intimate relationship (ooh!) between pointers and arrays (or matrices).

Recalling Arrays

An array (or matrix) is a mathematical abstraction used to represent a set of homogeneous data, i.e. data of the same type (int, float, etc.). This abstraction is organized in a table format, with rows and columns. Each element in the array has unique coordinates (row and column), so that a given element E(i, j) is the only element in row i, column j.

The array declaration syntax is as follows:

    // One-dimensional arrays, or vectors
    int ivet[10];        // 10 elements, from 0 to 9
    char cvet[23];       // 23 elements, from 0 to 22

    // Two-dimensional arrays
    int imat[2][3];      // 2 rows (0 to 1) and 3 columns (0 to 2)
    double dmat[10][2];  // 10 rows (0 to 9) and 2 columns (0 to 1)

Each array element is independent of the others and can be accessed with the following syntax:

    int ivet[10];
    int imat[3][4];

    // Changing the fourth element of the vector ivet.
    // Remember that counting starts from element zero.
    ivet[3] = 13;

    // Reading the second element of the vector ivet
    int num = ivet[1];

    // Changing the element in the first row, second column of imat
    imat[0][1] = 42;

    // Reading the element in the third row, fourth column of imat
    int foo = imat[2][3];

In this post we will not discuss matrices in depth; we will only investigate the relationship between arrays and pointers, in a fairly intuitive way.

Sizes of matrices

Since an array is an abstraction that can hold several values of the same type, what is its size? How much space does it occupy in memory?

Consider the code below:

    #include <iostream>
    using namespace std;

    int main() {
        int ivet[10];
        char cvet[13];
        double dvet[20];

        char cmat[3][4];
        int imat[5][4];

        // Sizes of the element types
        cout << "sizeof(int) = " << sizeof(int) << endl;
        cout << "sizeof(char) = " << sizeof(char) << endl;
        cout << "sizeof(double) = " << sizeof(double) << endl;

        // Sizes of the arrays themselves
        cout << "sizeof(ivet) = " << sizeof(ivet) << endl;
        cout << "sizeof(cvet) = " << sizeof(cvet) << endl;
        cout << "sizeof(dvet) = " << sizeof(dvet) << endl;
        cout << "sizeof(cmat) = " << sizeof(cmat) << endl;
        cout << "sizeof(imat) = " << sizeof(imat) << endl;

        return 0;
    }

The result is quite reasonable. The space occupied by an array is equal to the number of elements multiplied by the size of the element type (rows x columns x sizeof(type)). Now, if each element is independent, each one must occupy its own place in memory, otherwise one element would overwrite another. So does each element have its own memory address?
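The size rule above can also be turned around to recover the number of elements from the size in bytes. The following is only a minimal sketch; the names element_count, matrix_bytes and check_sizes are hypothetical helpers invented for this illustration, not standard functions.

```cpp
#include <cassert>
#include <cstddef>

// Number of elements in an array, derived from its total size in bytes.
// (element_count is a hypothetical helper, not a standard function.)
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&array)[N]) {
    return sizeof(array) / sizeof(array[0]);
}

// Size in bytes of a rows x cols matrix of T, computed by hand.
template <typename T>
constexpr std::size_t matrix_bytes(std::size_t rows, std::size_t cols) {
    return rows * cols * sizeof(T);
}

// Checks the rule against arrays shaped like the ones in this post.
inline void check_sizes() {
    int ivet[10];
    char cmat[3][4];
    int imat[5][4];
    assert(element_count(ivet) == 10);
    assert(sizeof(cmat) == matrix_bytes<char>(3, 4));
    assert(sizeof(imat) == matrix_bytes<int>(5, 4));
}
```

Calling check_sizes() from main should pass silently. Note that element_count only works on a real array; it will not compile if you hand it a plain pointer.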

Arrays and memory addresses

For convenience, we will analyze the addresses of an array of characters, since each element is only 1 byte in size:

    15  const int max = 5;
    16  char cvet[max] = {'A', 'B', 'C', 'D', 'E'};
    17
    18  // Display the indices, values and addresses of the elements.
    19  printf("Index\tValue\tElement address\n");
    20  for (int i = 0; i < max; i++) {
    21      printf("%d\t%c\t%p\n", i, cvet[i], (void *) &cvet[i]);
    22  }
    23
    24  // Display the address of the array
    25  printf("Address of the array: %p\n", (void *) &cvet);
    26
    27  // Display the address of the array again
    28  printf("Address of the array: %p\n", (void *) cvet);

The addresses of the elements are sequential, i.e. each element is stored right after the previous one. There are two more interesting facts:

  1. The address of the array (&cvet), shown in line 25, is the same as the address of the first element of the array;
  2. The variable cvet itself can be interpreted as a pointer, as shown in line 28.
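One detail behind the two facts above: &cvet and cvet hold the same address, but they do not have the same type. The sketch below illustrates the difference; byte_distance and compare_array_addresses are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Distance in bytes between two addresses (hypothetical helper).
inline std::ptrdiff_t byte_distance(const void *from, const void *to) {
    return static_cast<const char *>(to) - static_cast<const char *>(from);
}

// Same address, different types: cvet converts to char *,
// while &cvet is a char (*)[5], a pointer to the whole array.
inline void compare_array_addresses() {
    char cvet[5] = {'A', 'B', 'C', 'D', 'E'};

    char *first = cvet;        // pointer to the first element
    char (*whole)[5] = &cvet;  // pointer to the array as a whole

    // Both hold the same numeric address...
    assert(static_cast<void *>(first) == static_cast<void *>(whole));

    // ...but adding 1 moves by one element in one case
    // and by one whole array in the other.
    assert(byte_distance(first, first + 1) == 1);
    assert(byte_distance(whole, whole + 1) == 5);
}
```

So the printed addresses match, yet the compiler still tells the two pointers apart by how far a single step moves.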

In C++, an ordinary array is a contiguous block of memory whose name can be interpreted (converted) as a pointer to its first element. It is also valid to make a pointer point to an array, as long as the type the pointer points to is the same as the type of the array elements. During the assignment of an array to a pointer, the compiler performs an implicit type conversion. The target pointer is then interpreted as a pointer to the memory area occupied by the array.

One not so obvious consequence is that, in the implicit conversion, the information that the memory area was an array is lost, and with it the size of the array. From the pointer's point of view, it points to the beginning of an arbitrary block of memory, of an equally arbitrary size. Going from an array to a pointer means moving from a more restrictive, higher-level abstraction to a less restrictive, lower-level one.

On the other hand, trying to assign a pointer to an array generates a compile error for incompatible types. An array is a block of memory of n bytes, whereas a pointer holds a single piece of data, an address. The compiler has no way of knowing beforehand whether a pointer points to an area of 1, 2 or 200 bytes.

    15  const int max = 300;
    16  char cvet[max];
    17  char *pc = 0;
    18
    19  printf("\nBefore the assignment\n");
    20  printf("cvet = %p\n", (void *) cvet);
    21  printf("pc = %p\n", (void *) pc);
    22  printf("sizeof(cvet) = %zu\n", sizeof(cvet));
    23  printf("sizeof(pc) = %zu\n", sizeof(pc));
    24
    25  pc = cvet;
    26
    27  printf("\nAfter the assignment\n");
    28  printf("cvet = %p\n", (void *) cvet);
    29  printf("pc = %p\n", (void *) pc);
    30  printf("sizeof(cvet) = %zu\n", sizeof(cvet));
    31  printf("sizeof(pc) = %zu\n", sizeof(pc));

Note that before the assignment (line 25) the pointer pc is zero, since that is how it was initialized. The sizes show that the array occupies 300 bytes and the pointer only 8 (my machine is an AMD64). After the assignment both "point" to the same memory area, but the sizes do not change. There was an implicit conversion from char[300] to char *, and in the process the pointer pc cannot know the size of the memory area it points to. The array cvet, however, still knows exactly what it is, without any existential crisis.
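The same loss of size information happens whenever an array is passed to a function that takes a pointer. Here is a small sketch of that, with a reference-to-array parameter shown as one way of keeping the size. The function names are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// The parameter is a plain pointer: the length of the caller's array is gone.
inline std::size_t size_seen_through_pointer(char *p) {
    return sizeof(p);  // size of the pointer itself, e.g. 8 on a 64-bit machine
}

// A reference to an array keeps the length N as part of the type.
template <std::size_t N>
std::size_t size_seen_through_reference(char (&array)[N]) {
    return sizeof(array);  // N * sizeof(char)
}

// Passes the same array both ways and compares what each function sees.
inline void check_decay() {
    char cvet[300];
    assert(size_seen_through_reference(cvet) == 300);
    assert(size_seen_through_pointer(cvet) == sizeof(char *));
}
```

This is why C-style functions that receive a pointer usually take the length as a separate parameter.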

Pointer arithmetic - Vulgar, so what?

Well, if I know that the data in an array are laid out side by side, I can use a pointer that jumps to the next address and accesses the next element. This is called pointer arithmetic.

    15  const int max = 6;
    16  char cvet[max] = {'B', 'L', 'A', 'B', 'O', 'S'};
    17
    18  char *pc = cvet;
    19
    20  for (int i = 0; i < max; i++) {
    21      printf("%c", *(pc + i));
    22  }
    23  printf("\n");
    24
    25  for (int i = 0; i < max; i++) {
    26      printf("%c", *pc++);
    27  }
    28  printf("\n");
    29
    30
    31  // Now with integers
    32  int ivet[max] = {1, 2, 3, 4, 5, 6};
    33  int *pi = ivet;
    34
    35  for (int i = 0; i < max; i++, pi++) {  // pi advances after each print
    36      printf("%p = %d\n", (void *) pi, *pi);
    37  }
    38  printf("\n");
    39
    40  // In two dimensions
    41  char cmat[2][3] = {{'B', 'L', 'A'}, {'B', 'O', 'S'}};
    42  char *ppc;
    43
    44  ppc = (char *) cmat;
    45
    46  for (int i = 0; i < 2; i++) {
    47      for (int j = 0; j < 3; j++) {
    48          printf("%c", *(ppc + 3 * i + j));
    49      }
    50      printf("\n");
    51  }
    52  printf("\n");

In line 18 the pointer pc is made to point to the array cvet, and consequently to its first element, the character 'B'. In line 21 the content of pc, which is the address where the character 'B' is stored, is incremented by i and then dereferenced. In the first pass the value of i is zero, so the dereferenced value is 'B'. In the following passes, the subsequent addresses are dereferenced, yielding the other characters stored in the original array. This is more or less what the compiler does internally when you use the syntax cvet[i]. The array abstraction gives you a friendlier way of dealing with contiguous areas of memory than *(pc + i).
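To make the equivalence between cvet[i] and *(pc + i) concrete, here is a minimal sketch. The function read_three_ways is a hypothetical name used only for this illustration.

```cpp
#include <cassert>

// Reads element i in three equivalent ways and checks that they agree.
inline char read_three_ways(const char *base, int i) {
    char by_index   = base[i];      // array syntax
    char by_pointer = *(base + i);  // explicit pointer arithmetic
    char reversed   = i[base];      // legal too, since a[i] means *(a + i)
    assert(by_index == by_pointer && by_pointer == reversed);
    return by_index;
}
```

The odd-looking i[base] compiles because addition is commutative: *(base + i) and *(i + base) are the same expression. Please do not write it in real code.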

But if the array abstraction is simpler to use, why use pointer arithmetic?

One answer is line 26. It does the same thing as line 21, but a little faster. With the syntax of line 21 or, equivalently, the array syntax, access to any given element can be summed up, very roughly, in these steps:

  1. Take the base address of the array;
  2. Add the value of the index to the address;
  3. De-reference this new address;

With pointer arithmetic, it looks like this:

  1. De-reference this address;

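The increment (the arithmetic on the address) is not counted because it is part of the loop anyway, although i++; is faster than a = b + c;. Now imagine this small gain of 66% applied to a data area of 1 MB. That is over 2 million fewer operations! Keep in mind that this is a rough count of steps, not a measurement: an optimizing compiler will often generate much the same machine code for both forms.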
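The two access styles counted above can be written side by side as in the sketch below. The functions sum_indexed and sum_pointer are hypothetical names. The sketch only shows that both forms compute the same result; whether one is actually faster depends on the compiler and the machine, so measure before relying on it.

```cpp
#include <cassert>
#include <cstddef>

// Walks the buffer with an index: base address + i on every access.
inline long sum_indexed(const int *data, std::size_t n) {
    assert(data != nullptr || n == 0);
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// Walks the buffer with a moving pointer: dereference, then step.
inline long sum_pointer(const int *data, std::size_t n) {
    assert(data != nullptr || n == 0);
    long total = 0;
    for (const int *p = data, *end = data + n; p != end; ++p) {
        total += *p;
    }
    return total;
}
```

The pointer version stops when p reaches end, which points one element past the last one. That address may be computed and compared, but never dereferenced.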

The technique of using a pointer to manipulate an arbitrary area of memory is generally used in low-level programming (closer to the machine), in the manipulation of buffers and strings, among other dirty tricks. In the bowels of computers, operations that sweep large areas of memory are often performed with pointer arithmetic. At this level Darwin reigns supreme and only the prepared survive. This is where the language starts to offer a power that only the pure in heart can comprehend.

An important observation is that lines 31 to 38 repeat the experiment with integers. Note that, since the integers are 4 bytes each, the increments are automatically made 4 bytes at a time, not 1 at a time; that is, the increment is automatically scaled by sizeof(type). Incrementing a pointer means moving to the next element of the pointed-to type in memory, not just to the next address. As the size of a char is one byte, when we increment a pointer to char we move only one byte. If we increment a pointer to double, we move 8 bytes, and so on.
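The scaling by sizeof(type) can be checked directly. The sketch below uses a hypothetical helper, stride_in_bytes, that measures how many bytes a single increment moves a pointer to T.

```cpp
#include <cassert>
#include <cstddef>

// Number of bytes a pointer to T moves when it is incremented once.
template <typename T>
std::ptrdiff_t stride_in_bytes() {
    T pair[2] = {};
    T *p = pair;
    T *next = p + 1;

    // Measured in elements, the two pointers are exactly 1 apart...
    assert(next - p == 1);

    // ...measured in bytes, they are sizeof(T) apart.
    return reinterpret_cast<char *>(next) - reinterpret_cast<char *>(p);
}
```

For example, stride_in_bytes<char>() is 1, while stride_in_bytes<int>() equals sizeof(int), which is 4 on the machine used in this post.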

Another observation is that a two-dimensional array can be "linearized", as shown in lines 40 to 52. This is useful, when applicable, to make better use of the processor cache, for example.
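The linearization relies on the rows being stored one after another, so element (row, col) sits at offset COLS * row + col from the start. Below is a minimal sketch of that formula for the 2 x 3 matrix of this post; ROWS, COLS, flat_at and check_linearized are hypothetical names introduced for the example.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t ROWS = 2;
const std::size_t COLS = 3;

// Element (row, col) of a ROWS x COLS char matrix viewed as a flat run of chars.
inline char flat_at(const char *flat, std::size_t row, std::size_t col) {
    assert(row < ROWS && col < COLS);
    return *(flat + COLS * row + col);  // row-major: whole rows come first
}

// Compares the flat view with ordinary two-dimensional indexing.
inline void check_linearized() {
    char cmat[ROWS][COLS] = {{'B', 'L', 'A'}, {'B', 'O', 'S'}};
    const char *flat = reinterpret_cast<const char *>(cmat);
    for (std::size_t r = 0; r < ROWS; ++r) {
        for (std::size_t c = 0; c < COLS; ++c) {
            assert(flat_at(flat, r, c) == cmat[r][c]);
        }
    }
}
```

The multiplier is the number of columns, not the number of rows: to skip one row you have to jump over every column in it.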

void *: the pansexual of pointers

Earlier I said that it was only possible to assign an array to a pointer whose target type was the same as the type of the array data. I lied outright! The reason is that, for anyone who gave up on the post before this topic, it is safer to believe that it cannot be done :) !

There are two exceptions to the rule. The first is when there is an explicit type conversion and the destination pointer "thinks" it is pointing to the right type. An example is in line 44 of the previous code.

The second is the case of pointers to void. A void pointer is a pointer that makes no demands on the type of data in the memory area it points to. It is a pointer to a generic area of memory, something very low level.

To use data pointed to by a pointer to void, you must make an explicit cast to a valid type before dereferencing it. An int * is dereferenced to an int, and a char * is dereferenced to a char; guess what a void * is dereferenced to?

    #include <cstdio>

    int main() {
        const int max = 6;
        char cvet[max] = {'B', 'L', 'A', 'B', 'O', 'S'};

        void *pv = cvet;

        // Compile error:
        // *pv;
        // What would sizeof(void) be?

        // Cast back to char * before doing arithmetic or dereferencing.
        for (int i = 0; i < max; i++) {
            printf("%c", *(((char *) pv) + i));
        }
        printf("\n");

        return 0;
    }

Pointers to void are used when one needs to point to a generic area of memory without having control over, or knowledge of, the type of data that this area contains, or in functions that cannot make assumptions about the types of their parameters, such as those in the pthreads library API.
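As a standard-library stand-in for that kind of API (this is not pthreads itself), std::qsort is a function that cannot know the type of what it sorts, so it hands the comparison callback two const void * values. The sketch below shows the callback casting them back before use; compare_ints and sort_ints are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>

// Comparison callback in the shape std::qsort expects: it receives untyped
// pointers and must cast them back to the real element type before use.
int compare_ints(const void *a, const void *b) {
    int lhs = *static_cast<const int *>(a);
    int rhs = *static_cast<const int *>(b);
    return (lhs > rhs) - (lhs < rhs);  // negative, zero or positive
}

// Sorts an int array in ascending order through the untyped qsort interface.
inline void sort_ints(int *data, std::size_t n) {
    assert(data != nullptr || n == 0);
    std::qsort(data, n, sizeof(int), compare_ints);
}
```

Nothing stops you from passing a comparison function written for a different type; the compiler cannot check what is behind a void *. That is the price of the flexibility.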

Ending

The deeper we delve into the topic of pointers, the closer we get to the machine. Much of the power of C and C++ comes from there, and much of the blame too. The complexity increases, and so do the risks. For many, that is where the fun lies!

Comments

  • Leandro Melo http://0xc0de.wordpress.com

    Very didactic. I liked the "pansexual of pointers." :)

  • Francisco-bruno-luisa

    rrrrr
