C Tutorial, Part II

The Pointer Type: storing the address of a variable in another variable.

Every variable in C has 3 properties:

its name (identifier)
its value (the contents of the variable)
its address (the memory address WHERE that value is stored)

Since a C programmer may need to store a variable's address for later use, C provides a data type called the pointer type. The pointer type allows us to declare variables that are meant to hold the address of some other variable. The variable holding the address is called a pointer. Declaring a pointer variable is done as follows:

int *pi;  /* creates a variable whose data type is pointer to int; its size
             depends on the platform (commonly 4 or 8 bytes) */
int i;    /* an int var */
pi = &i;  /* pi now holds i's address - i.e., it points to i */

A pointer variable is itself a variable and its value can change. A pointer can hold only one of 3 values:

the NULL value (value ZERO, indicates not pointing to any variable)
the memory address of any variable of appropriate type
garbage (an uninitialized or otherwise invalid address)

Pointer variables are NOT the data type of the variable they point to (rather, they are a pointer to a variable of that type). From above, the data type of pi is pointer to int NOT int and the name of the (above mentioned) pointer variable is pi NOT *pi.

You can declare a pointer to ANY data type (the size of each pointer depends on the platform: commonly 4 bytes on 32-bit systems and 8 bytes on 64-bit systems):

char *pc;   /* pc is type: pointer to char */
float *pf;  /* pf is type: pointer to float */
double *pd; /* pd is type: pointer to double */
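
Since pointer sizes vary between systems, it can help to check them on your own machine. The following is a minimal sketch; the function name print_pointer_sizes is made up for illustration, and the %zu format assumes a C99 (or later) compiler.

```c
#include <stdio.h>

/* Hypothetical helper: prints the size of a few pointer types
   on whatever platform this is compiled for. */
void print_pointer_sizes(void)
{
    printf("char *:   %zu bytes\n", sizeof(char *));
    printf("int *:    %zu bytes\n", sizeof(int *));
    printf("double *: %zu bytes\n", sizeof(double *));
}
```

Calling print_pointer_sizes() from main will usually show the same number on all three lines, because the size of a pointer does not depend on the size of the thing it points to.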

Dereferencing pointer variables

int* pi; /* variable of type pointer to int  */
/* note the * can go next to the type name or next to the variable name */

address-of (the & operator, as used with scanf() in part 1)

int i;         /* plain old int var */
pi = &i;   /* pi now contains the address of variable i */

dereference

*pi = 15;	/* i now contains 15. */
printf("value pointed to by pi is: %d\n", *pi); /* deref's pi and prints 15 */

The expression *pi can be thought of as: the value at the address which is stored in pi.

The expression *pi = 15; can be thought of as: look inside the pi variable. pi will have an address in there for you. Go to that address and overwrite the contents of the memory at that address with the value 15.

Using pi with the * in front of it - like we did above - is called dereferencing the pointer variable. Dereferencing a pointer variable means following that pointer to refer to some other location in memory. NOTE: I can ONLY assign a value, like 15, to a dereferenced pointer, like *pi, if the pointer is pointing to valid (allocated) memory! That is, if I hadn't assigned i's address to pi prior to the assignment of 15 to *pi, the behavior would be undefined: my code might crash or silently corrupt some other memory!
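
One common defensive habit is to initialize pointers to NULL and test for NULL before dereferencing. Here is a small sketch of that idea; store_if_valid is a hypothetical helper name, not something from the tutorial's code. Note that a NULL check cannot detect a garbage pointer, only a NULL one.

```c
#include <stddef.h>

/* Hypothetical helper: stores value through p only if p is not NULL.
   Returns 1 if the store happened, 0 if p was NULL. */
int store_if_valid(int *p, int value)
{
    if (p == NULL) {
        return 0;      /* nothing to point at, so do not dereference */
    }
    *p = value;        /* follow the pointer and overwrite what it points to */
    return 1;
}
```

With int i; int *pi = &i; the call store_if_valid(pi, 15) sets i to 15 and returns 1, while store_if_valid(NULL, 15) changes nothing and returns 0.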

Pointers are strictly typed - you should not assign the address of a variable of one data type to a pointer of a different type. For example, we should not do this:

double x = 10;
int *pi = &x;   /* Generates Compiler warning. */

One more thing

Using pointers to make changes to variables across function calls

C (like Java) has no reference parameters (although C++ does). Thus in C arguments are always passed by value, and a function cannot directly modify the caller's variables that were passed to it. It helps to remember that a function's parameters are really local variables in that function, and those local variables get a copy of the values passed from the caller. This is why we need pointers.

Pointers are the mechanism C gives us to modify local data declared in one function using code written in another function. In other words: If data is declared in function1(...) and we want to write code in function2(...) that modifies the data in function1(...), then we must pass the addresses of the variables we want to change. The calling function sends in addresses and the receiving function must declare those incoming args as pointers. All function2(...) has to do to modify the data in function1(...) is to dereference the pointers sent in.

We wrote a program to demonstrate this by swapping 2 ints: swap.c
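
The file swap.c itself is not reproduced here, so what follows is only a sketch of what such a program typically looks like. The name swap_by_value is invented to show the version that does NOT work, for contrast.

```c
/* Broken on purpose: x and y are local copies, so the caller's
   variables are untouched when this function returns. */
void swap_by_value(int x, int y)
{
    int tmp = x;
    x = y;
    y = tmp;
}

/* Works: px and py hold the addresses of the caller's variables,
   and dereferencing them reads and writes the caller's data. */
void swap(int *px, int *py)
{
    int tmp = *px;
    *px = *py;
    *py = tmp;
}
```

Given int a = 1, b = 2; the call swap_by_value(a, b) leaves a and b unchanged, while swap(&a, &b) leaves a holding 2 and b holding 1. Note the & at the call site: the caller sends addresses, and the function receives them as pointers.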

The Array type

C has no keyword "array" and since C has no classes there is no Array class as in Java. C does, however, have an array type. C arrays are contiguous storage of homogeneous values that are addressable by index (starting at 0). When you declare an array, you are guaranteed that all the elements are of the same type and that they are stored in consecutive memory addresses. In most expressions, the name of an array evaluates to the address of its first element (element 0). Thus,

int a[10];
int *pa;

pa = a;  /* is the same as saying pa = &a[0] */

BUT while pa is a variable that can be assigned to (a modifiable l-value), an array name cannot be: it is a fixed synonym for the location of the first element. Thus,

int fred = 7;
int i = 3;
int a[10] = {0}, b[10] = {0};

a[8] = fred;    /* OK */
a[0] = a[1];    /* OK */
a[i] = a[i+1];  /* OK */
a[0] = b[0];    /* OK */
a = b;          /* ILLEGAL! */
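
Since a = b; is illegal, copying one array into another has to be done element by element. A minimal sketch, where copy_ints is a hypothetical helper name:

```c
/* Hypothetical helper: copies the first n elements of src into dest.
   The caller must make sure both arrays have at least n elements. */
void copy_ints(int dest[], const int src[], int n)
{
    int k;
    for (k = 0; k < n; k++) {
        dest[k] = src[k];
    }
}
```

With the declarations above, copy_ints(a, b, 10); does what a = b; might have been intended to do. The standard library's memcpy (from string.h) is a common alternative, e.g. memcpy(a, b, sizeof a); when a is a real array in the current scope.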

But be careful! Consider the following code snippet (available as arraysize.c):

#include <stdio.h>

void printSize(double arr[])   /* parameter type must match the array passed in main */
{
    printf("%d\n", (int)sizeof arr); /* same as sizeof(double *) */
}

int main(int argc, char *argv[])
{
    double *p;
    double arr[42];

    printf("%d\n", (int)sizeof p);
    printf("%d\n", (int)sizeof &arr[0]); /* addr of 1st elem */
    printf("%d\n", (int)sizeof arr); /* sizeof the entire block (array) */
    printf("%d\n", (int)(sizeof arr/sizeof arr[0])); /* number of elements */
    printSize(arr);
    return 0;
}

The first two will print the size (in bytes) of a pointer on your system (typically 4 on a 32-bit system and 8 on a 64-bit system). The third will NOT be the size of a pointer: it will literally print the size of the entire array (336, assuming an 8-byte double) since the array is declared (dimensioned) in this function. Given that, the next value printed will be the number of elements, 42 (the size of the entire array divided by the size of one (arbitrarily chosen) element). And the last value printed, from within the function call, will NOT be 336; it will be the size of a pointer (which is what the function is sent!).

We then examined arrDemo1.c.

Arrays can be declared and initialized in the same statement just by supplying a list of values in braces. In this case, you can use empty brackets, [ ], since the compiler will parse the list and calculate the appropriate dimension.

Arrays can be declared and dimensioned but left uninitialized. In this case, a compile-time constant (usually via a #define statement) must be inside the brackets, because the array's size is fixed at compile time and the compiler does not yet have access to run-time values. (C99 added variable-length arrays for local arrays, but we do not use them here.)
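
The two declaration styles described above might look like this. This is a sketch only; the names MAX_SCORES, primes, scores and primes_count are invented for illustration and are not taken from arrDemo1.c.

```c
#include <stddef.h>

#define MAX_SCORES 100   /* hypothetical compile-time constant */

/* Dimension omitted: the compiler counts the initializers (5 here). */
int primes[] = { 2, 3, 5, 7, 11 };

/* Dimensioned with a compile-time constant, no initializer list.
   (File-scope arrays like this one start out zeroed; a local array
   declared the same way inside a function would hold garbage.) */
int scores[MAX_SCORES];

/* Number of elements in primes, computed from the sizes. */
size_t primes_count(void)
{
    return sizeof primes / sizeof primes[0];
}
```

Here primes_count() returns 5, which matches the number of values in the braces.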

When arrays are passed as parameters, you just pass the name as in Java, and the function receives the address of the array (because the array name is a synonym for the address of the first item in the array). Note that the receiving formal parameter doesn't have to have a number in the brackets. That's OK - you're given where to start, but you have to determine where to stop somehow (since there's no .length instance variable as in Java!).
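
The usual way to tell the function where to stop is to pass the number of elements as a second argument. A minimal sketch, with sum as a hypothetical function name:

```c
/* Hypothetical helper: adds up the first n elements of arr.
   arr is really a pointer to the first element, so the function
   cannot discover the length on its own; the caller supplies n. */
int sum(const int arr[], int n)
{
    int total = 0;
    int k;
    for (k = 0; k < n; k++) {
        total += arr[k];
    }
    return total;
}
```

For int v[] = { 1, 2, 3, 4 }; the call sum(v, 4) returns 10. In the caller, where v is a real array, the count can be computed as sizeof v / sizeof v[0] rather than typed by hand.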

Look at the last part of the code in main and notice the following: when we scanf(..) for c[0] and d[0], we used the arrays' names as parameters to pass the addresses of c[0] and d[0]. This works because the values of the symbols c and d are the addresses of c[0] and d[0], respectively. We will come back and look in more gory detail at the semantics of arrays, pointers and the [ ] operator when we cover pointer arithmetic a little later in the tutorial.
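
The same idea can be sketched without keyboard input by using sscanf, which reads from a string instead of from stdin. This is not the code from arrDemo1.c; read_first is a hypothetical helper and the element type int is an assumption.

```c
#include <stdio.h>

/* Hypothetical helper: parses one int from text into c[0].
   Passing c (with no &) works because c already is the address of c[0].
   Returns the number of items converted (1 on success). */
int read_first(const char *text, int c[])
{
    return sscanf(text, "%d", c);   /* same as sscanf(text, "%d", &c[0]) */
}
```

With int c[3]; the call read_first("42", c) stores 42 in c[0], and the comparison c == &c[0] is true.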