Pointers and Arrays

May 18, 2009 · Posted in C / C++

In the first post of this series we talked a little about pointers. In the second we talked about references. Today we will focus on the intimate relationship (ooh!) between pointers and arrays.

Recalling Arrays

An array, also called a matrix, is a mathematical abstraction used to represent a set of homogeneous data, i.e. data of the same type (int, float, etc.). The abstraction is organized as a table, with rows and columns. Each element in the array has unique coordinates (row and column), so that a given element E(i, j) is the single element in row i, column j.

The syntax for declaring arrays is as follows:

  // One-dimensional arrays, or vectors
  int ivet[10];        // 10 elements, 0 to 9
  char cvet[23];       // 23 elements, 0 to 22

  // Two-dimensional arrays, or matrices
  int imat[2][3];      // 2 rows (0 to 1) and 3 columns (0 to 2)
  double dmat[10][2];  // 10 rows (0 to 9) and 2 columns (0 to 1)

Each array element is independent of the others and can be accessed with the following syntax:

  int ivet[10];
  int imat[3][4];

  // Changing the fourth element of the vector ivet.
  // Remember that indices start at zero.
  ivet[3] = 13;

  // Reading the second element of the vector ivet
  int num = ivet[1];

  // Changing the element in the first row, second column of imat
  imat[0][1] = 42;

  // Reading the element in the third row, fourth column of imat
  int foo = imat[2][3];

In this post we will not go deeper into arrays themselves; we will only investigate the relationship between arrays and pointers, in a fairly intuitive way.

Array sizes

Since an array is an abstraction that contains multiple values of the same type, how big is it? How much space does it occupy in memory?

Consider the code below:

  int ivet[10];
  char cvet[13];
  double dvet[20];

  char cmat[3][4];
  int imat[5][4];

  cout << "sizeof(int)     = " << sizeof(int) << endl;
  cout << "sizeof(char)    = " << sizeof(char) << endl;
  cout << "sizeof(double)  = " << sizeof(double) << endl;

  cout << "sizeof(ivet)    = " << sizeof(ivet) << endl;
  cout << "sizeof(cvet)    = " << sizeof(cvet) << endl;
  cout << "sizeof(dvet)    = " << sizeof(dvet) << endl;
  cout << "sizeof(cmat)    = " << sizeof(cmat) << endl;
  cout << "sizeof(imat)    = " << sizeof(imat) << endl;

The result is quite reasonable. The space occupied by an array is equal to the number of elements multiplied by the size of the element type (rows x columns x sizeof(type)). However, if each element is independent, each one must occupy its own place in memory; otherwise one element would overwrite another. So, does each element have its own memory address?

Arrays and memory addresses

For convenience, let us examine the addresses of an array of characters, whose elements are only 1 byte each:

  const int max = 5;
  char cvet[max] = {'A', 'B', 'C', 'D', 'E'};

  // Displaying the indices, values and addresses of the data.
  // (%p expects a void *, hence the casts.)
  printf("Index\tValue\tElement address\n");
  for (int i = 0; i < max; i++) {
      printf("%d\t%c\t%p\n", i, cvet[i], (void *)&cvet[i]);
  }

  // Printing the address of the array
  printf("Address of array: %p\n", (void *)&cvet);

  // Printing the address of the array again
  printf("Address of array: %p\n", (void *)cvet);

The addresses of the elements are sequential, i.e. each element is stored right next to the previous one. In addition, there are two more interesting facts:

  1. The address of the array (&cvet), printed by the first printf after the loop, is the same as the address of the first element of the array;
  2. The variable cvet itself can be interpreted as a pointer, as the last printf shows;

In C++, an ordinary array is a contiguous block of memory whose name can be interpreted (converted) as a pointer to its first element. It is also valid to make a pointer point to an array, as long as the type the pointer points to is the same as the type of the array elements. When an array is assigned to a pointer, the compiler performs an implicit type conversion, and the destination pointer is then treated as a pointer to the memory area occupied by the array.

One not-so-obvious consequence is that the implicit conversion discards the information that the memory area was an array, and with it the size of the array. From the pointer's point of view, it points to the beginning of an arbitrary block of memory, of equally arbitrary size. Taking an array and handing it to a pointer means going from a more restrictive, higher-level abstraction to a less restrictive, lower-level one.

On the other hand, trying to assign a pointer to an array generates a compile error for incompatible types. An array is a memory block of n elements, whereas a pointer holds a single piece of data: an address. The compiler cannot know in advance whether a pointer points to an area of 1, 2 or 200 bytes.

  const int max = 300;
  char cvet[max];
  char *pc = 0;

  printf("\nBefore the assignment\n");
  printf("cvet = %p\n", (void *)cvet);
  printf("pc   = %p\n", (void *)pc);
  printf("sizeof(cvet) = %zu\n", sizeof(cvet));
  printf("sizeof(pc)   = %zu\n", sizeof(pc));

  pc = cvet;

  printf("\nAfter the assignment\n");
  printf("cvet = %p\n", (void *)cvet);
  printf("pc   = %p\n", (void *)pc);
  printf("sizeof(cvet) = %zu\n", sizeof(cvet));
  printf("sizeof(pc)   = %zu\n", sizeof(pc));

Note that before the assignment (the statement pc = cvet) the pointer pc is null, because that is how it was initialized. The sizes show that the array occupies 300 bytes and the pointer only 8 (my machine is an AMD64). After the assignment both "point" to the same area of memory, but the sizes do not change. There was an implicit conversion from char[300] to char *, and in the process the pointer pc has no way of knowing the size of the memory area it points to. The array cvet, meanwhile, keeps knowing exactly what it is, without any existential crisis.

Pointer arithmetic - Vulgar, so what?

Well, if I know that the data in an array is laid out side by side, I can use a pointer that jumps to the next address and accesses the next element. This is called pointer arithmetic.

  const int max = 6;
  char cvet[max] = {'B', 'L', 'A', 'B', 'O', 'S'};

  char *pc = cvet;

  for (int i = 0; i < max; i++) {
      printf("%c", *(pc + i));
  }
  printf("\n");

  for (int i = 0; i < max; i++) {
      printf("%c", *pc++);
  }
  printf("\n");


  // Now with integers
  int ivet[max] = {1, 2, 3, 4, 5, 6};
  int *pi = ivet;

  for (int i = 0; i < max; i++) {
      // Increment in a separate statement: reading and modifying pi
      // in the same argument list has no guaranteed order.
      printf("%p = %d\n", (void *)pi, *pi);
      pi++;
  }
  printf("\n");

  // In two dimensions
  char cmat[2][3] = {{'B', 'L', 'A'}, {'B', 'O', 'S'}};
  char *ppc;

  ppc = (char *)cmat;

  for (int i = 0; i < 2; i++) {
      for (int j = 0; j < 3; j++) {
          printf("%c", *(ppc + 3 * i + j));
      }
      printf("\n");
  }
  printf("\n");

In the statement char *pc = cvet, the pointer pc is made to point to the array cvet and therefore to its first element, the character 'B'. In the first loop, the content of pc, which is the address where 'B' is stored, is incremented by i and then dereferenced. In the first iteration the value of i is zero, so the dereference yields 'B'. In the following iterations the subsequent addresses are dereferenced, yielding the other characters stored in the original array. This is more or less what the compiler does internally when you use the syntax cvet[i]. The array abstraction gives you a friendlier way of dealing with contiguous areas of memory than *(pc + i).

But if the array abstraction is simpler, why use pointer arithmetic?

One answer is the second loop, the one that uses *pc++. It does the same thing as the first loop but, in principle, a little faster. With the syntax of the first loop or, equivalently, with array syntax, each access can be very roughly summarized in these steps:

  1. Take the base address of the array;
  2. Add the value of the index to the address;
  3. Dereference the new address;

With pointer arithmetic it looks like this:

  1. Dereference the current address;

The increment (whether of the index or of the address) does not count because it is part of the loop either way, although i++; is faster than a = b + c;. Now imagine this small gain of 66% applied to a data area of 1 MB: that is over 2 million fewer operations! Keep in mind that this is a very rough count; an optimizing compiler will often generate the same machine code for both forms, so measure before relying on it.

The technique of using a pointer to handle an arbitrary area of memory is generally used in low-level programming (closer to the machine), in the manipulation of buffers and strings, among other dirty tricks. In the bowels of computers, operations that sweep large areas of memory are often performed with pointer arithmetic. At this level, Darwin reigns supreme and only the prepared survive. From here on, the language begins to grant a power that only the pure of heart can understand.

An important note: in the loop over ivet the experiment is repeated with integers. Since the integers here are 4 bytes, the increments are automatically made 4 bytes at a time, not 1 at a time; that is, the increment is automatically scaled by sizeof(type). Incrementing a pointer means moving to the next piece of data of the pointed-to type, not just to the next address. Since the size of a char is one byte, when we increment a pointer to char we move only 1 byte. If we increment a pointer to double we move 8 bytes, and so on.

Another observation is that a two-dimensional array can be "linearized", as shown in the last part of the code, where cmat is traversed through the pointer ppc. This is useful, when applicable, to make better use of the processor cache, for example.

void *, the Pansexual pointers

Earlier I said that it was only possible to assign an array to a pointer whose target type is the same as the type of the array data. I blatantly lied! The reason is that, for anyone who gave up on the post before this section, it is safer to believe that :) !

There are two exceptions to the rule. The first is when there is an explicit type conversion and the target pointer "thinks" it is pointing to the right type. An example is the statement ppc = (char *) cmat in the previous code.

The second is the case of pointers to void. A pointer to void is a pointer that makes no demands on the type of data in the memory area it points to. It is a pointer to a generic area of memory, something very low level.

To use the data pointed to by a pointer to void, you must make an explicit cast to a valid type before dereferencing it. After all, an int * is dereferenced to an int and a char * is dereferenced to a char; guess what a void * would be dereferenced to?

  const int max = 6;
  char cvet[max] = {'B', 'L', 'A', 'B', 'O', 'S'};

  void *pv = cvet;

  // Compile error.
  // *pv
  // How much is sizeof(void)?

  for (int i = 0; i < max; i++) {
      printf("%c", *(((char *)pv) + i));
  }
  printf("\n");

Pointers to void are used when one needs to point to a generic area of memory without having control over, or knowledge of, the type of data that area contains, or in functions that cannot make assumptions about the types of their parameters, as in the API of the pthreads library.

Closing

The deeper we delve into pointers, the closer to the machine we get. Much of the power of C and C++ comes from there, and many of the problems as well. The complexity keeps increasing, and so do the risks. For many, therein lies the fun!

Comments

One Response to "Pointers and Arrays"

  1. Leandro Melo on May 21st, 2009 21:50

    Very didactic. I liked the "pansexual pointers". :)
