C development on Linux – Structures – VII.

Introduction

We continue our tutorial with C's complex data types, and this part covers structures. Many modern programming languages offer them in one shape or another, and so does C. As you will see, structures make data easier to manipulate by letting you store variables of (possibly) different types under a single “roof”.

Beginning structures

Although I wanted to postpone the definition until this sub-chapter, it seems I couldn’t wait and included it in the introduction. Yes, folks, that’s what a structure is, and you will see in a moment how useful it is once I show you some examples. One interesting parallel is a database table: if you have a table called users (the unique name), you will put in it exactly the data that pertains to the users: age, gender, name, address and so on. But these are different types! No problem, you can do that with a table, just as you can with a struct: age will be an integer, gender a char, name a string and so on. You can then access the members easily, by referring to the name of the table and the member. But this is not a database course, so let’s move on. Before that, a short look at a logical aspect: you are encouraged to create structs whose members have something in common from a logical point of view, as in the example above. It makes life easier for you and for the people who will later read your code. So, let’s see how our users database table would translate into a C struct:

struct users {
	int age;
	char gender;
	char *name;
	char *address;
};

Please do not forget the semicolon at the end. OK, so I boasted that the members of the structure are simple to access. Here’s how, assuming you have an instance named usrs and want to access its age:

printf("The age of the user is %d.\n", usrs.age);

But for that printf to work, we have to declare usrs and set its age first. That can be done like this:

struct users {
	int age;
	...
} usrs;
usrs.age = 25;
...

...

What we did here is declare an instance of the struct, named “usrs” (you can have as many instances as you please). You can have usrs1, usrs2, usrs3 and so on, and each of them gets its own age, gender, address and the rest. The second way to do this is to declare the struct as we did the first time (i.e. without instances) and then declare the instances later in the code:

...
struct users usrs1, usrs2, usrs3;

…and then take care of the age, gender, address and so on as we did above.

When we talk about structs in conjunction with functions, the most important thing to mention is probably that a struct is treated as a whole, not as a compound made of several elements: pass one to a function and the function receives a copy of the entire struct. Here’s an example:

void show_age(struct users i)
{
	printf("User's age is %d.\n", i.age);
	printf("User's name is %s.\n", (&i)->name);
}

What this function does is: it takes a struct users as its argument and prints that user’s age and name. You might have noticed a new operator in the above code (if you haven’t, look again). The “->” operator does exactly what the dot operator does, giving you access to a member of the structure, except that it is used on a pointer to a struct, while the dot operator is used on the struct itself. Here (&i) is a pointer to i, so (&i)->name is the same as i.name. One more important consideration. Given the following code:

struct mystruct {
	int myint;
	char *mystring;
} *p;

what do you think the following expression will do?

++p->myint;

Advanced topics

One of the things you’ll see pretty often in relation to structures, but not only there, is the typedef keyword. As the name implies, it lets you define your own names for data types, as in the examples below:

typedef int Length; /* now Length is a synonym for int */
typedef char * String;

Regarding structs, typedef basically eliminates the need to use the ‘s’ word (struct) every time you declare a variable. So here’s a struct declared in this manner:

typedef struct colleagues {
	int age;
	char gender;
	...
} colls;

For our next topic, we will take an idea found in K&R and use it to illustrate our point. Why? It’s well thought out and shows simply and clearly what we’re about to illustrate. But before we begin, here’s a question for you: knowing that C allows nested structs, do you think nested structs by means of typedef could be accepted? Why?

So, here’s the next topic: struct arrays. Now that you know what arrays are, you can easily guess what this is about. However, some questions remain: how do you implement the concept and, more importantly, what is it good for? The example we talked about will soon shed some light on both matters. Let’s presume you have a program, written in C, and you want to count the number of occurrences of all the keywords the standard defines. The obvious approach needs two arrays: one to store the keywords and another to store the number of occurrences corresponding to each keyword. This implementation can be written as such:

char *keywords[NRKEYWORDS];
int results [NRKEYWORDS];

Looking at the two arrays, you will soon see that they really describe pairs, and a pair is expressed more naturally with a structure. So we will use a single array in which each element is a structure. Let’s see.

struct keyword {
	char *keywords;
	int results; 
} keywrdtbl [NRKEYWORDS];

Now let’s initialize the array with the keywords and the initial number of occurrences which will, of course, be 0.

struct keyword {
	char *keywords;
	int results;
} keywrdtbl[] = {
	{ "auto", 0 },
	{ "break", 0 },
	{ "case", 0 },
	...
	{ "while", 0 }
};

Your next and last assignment, since this task is a bit more complex, is to write a complete program that takes its own source as the text to work on and prints the number of occurrences of every keyword, using the method above.

The last subject on structs I will deal with is pointers to structs. If you wrote the program in the last exercise, you might already have a good idea how it could be rewritten to use pointers instead of indexes. So if you like writing code, consider this an optional exercise. There’s not much to it, just a few aspects to mind. One is very important: both the code that parses the source file you’re scanning for keywords and the search function must be modified with extra care, so that you never create or stumble upon an illegal pointer. See the previous part for reference on pointer arithmetic and on the differences between using arrays and using pointers. Another issue to be careful with is the size of structs. Don’t be fooled: there is only one way to get a structure’s size right, and that is by using sizeof.

#include <stdio.h>

struct test {
        int one;
        int two;
        char *str;
        float flt;
};

int main(void)
{
        /* sizeof yields a size_t, which printf expects as %zu */
        printf("Struct's size is %zu.\n", sizeof(struct test));
        return 0;
}

On a typical 64-bit Linux system this prints 24, but that is not guaranteed; K&R explains that the size depends on various alignment requirements. I recommend using sizeof whenever you are in doubt, and presuming nothing.

Unions

I should have altered the title to include the word “unions”, and maybe even “bitfields”. But given the importance and general usage of structures compared with unions and bitfields, especially now that hardware is becoming a cheaper commodity (not necessarily healthy thinking, but anyway), I guess the title will say only “structures”. So what is a union? A union closely resembles a structure; what differs is the way the compiler deals with the storage (memory) for it. In short, a union is a complex data type that can store different types of data, but only one member at a time. So regardless of how big the stored variable is, it will have its place, but the other members can’t be held in the union at that moment, because all members share the same memory. Hence the name “union”. Unions are declared and defined the same way as structures, and a union is guaranteed to take at least as much memory as its biggest member.

Bitfields

If you want to use C in embedded systems programming, and/or low-level stuff is your game, then this part will seem appealing. A bitfield (some write it bit field) doesn’t have a keyword assigned like enum or union; instead, you declare it as a struct member followed by a colon and a width in bits. It requires you to know your machine. It allows you to go beyond the typical word-based limitations other languages confine you to. It also allows you to, and this might be a formal definition, “pack” more than one object in a single word.

Enums

To start with a short historical fact, enums became part of standard C with C89, meaning the original K&R C lacked this nifty type. An enum allows the programmer to create a set of named values, also known as enumerators, whose main characteristic is that each has an integer value associated with it, assigned either implicitly (0,1,2…) or explicitly by the programmer (1,2,4,8,16…). This makes it easy to avoid magic numbers.

enum Pressure { pres_low, pres_medium, pres_high };
enum Pressure p = pres_high;

This is the easy way to get pres_low to be 0, pres_medium 1 and so forth, and you won’t have to use #defines for it. I recommend a bit of reading if you’re interested.

Conclusion

Although the information might seem a bit more condensed than before, don’t worry. The concepts are relatively easy to grasp and a little bit of exercise will work wonders. We’re waiting for you at our Linux Forums for any further discussion.
