Posts Tagged ‘c’

C Tidbit: Arrays of Strings as char**

February 22nd, 2009

While working further on my crypto assignment this evening, I had a hell of a time working with an array of strings in C. For those of you who know C, a string is really handled through a character pointer (char *), and an array of strings, while often declared as char *str[], can also be allocated through a pointer to a pointer to char (char **). Long story short, after writing a bunch of code that used this, I realized that I didn’t need all those strings and could get by with just one temporary variable, so I rewrote my code and threw out everything that had taken me so long to figure out. Because what I learned was useful, I’m writing it down here for posterity, and hopefully as a decent explanation for anyone else. This isn’t all the code that I wrote, just snippets big enough to get the gist.

So, when you want to allocate memory for a bunch of (related) strings, you have two choices:

  1. Declare one char pointer (char *) and allocate enough memory for all the strings (and their terminating NUL bytes).
  2. Declare a pointer to a char pointer (char **) and allocate memory for the pointers and strings themselves separately.

My problems arose when I tried to combine these two approaches: I used a char ** so I could access my individual strings with array notation (str[i]), but since I knew how much space I’d need ahead of time (though not at compile time), I allocated all the memory at once. It makes sense that this didn’t work: the one big block I allocated was being treated as an array of pointers, and none of those pointers had ever been set to point at memory for the actual characters (i.e. the strings).

Since my explanation is mediocre, here’s a bit of code to explain:

/* Approach 1: one big char pointer */
char *bigstr; // Declare one character pointer, i.e. one string
int length = 72; // Enough space for 8 8-char values plus a NUL byte for each
int i;


bigstr = malloc(sizeof(char) * length);
for (i = 0; i < length; i += 9) { // I know this sucks, and I later fixed it
    fscanf (infile, "%8s", &bigstr[i]); // %8s caps each read at 8 chars so the 9-byte slot can't overflow
    // or fscanf(infile, "%8s", bigstr + i);
}

So, the benefit of the above approach is that you can allocate your memory all at once. The downside is that you have to either index bigstr[] and apply the address-of operator (the ampersand) or just shove the pointer in and do pointer arithmetic.
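
To make the pointer arithmetic concrete, here’s a minimal sketch of how the start of the n-th string in the big block can be found. The helper name slot_at and its parameters are my own illustration, not part of the assignment code, and it assumes every string gets a fixed-size slot (9 bytes above).

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns the address of the n-th fixed-size slot
 * inside one big block of characters.
 */
char *slot_at(char *block, size_t slot_size, size_t n)
{
    /* Same address as &block[n * slot_size]. */
    return block + n * slot_size;
}
```

With something like this, the loop could count n from 0 to 7 and read into slot_at(bigstr, 9, n) instead of stepping i by 9.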

Now, on to the second option:

/* Approach 2: pointer to a pointer */
char **str_array; // Declare a pointer to a pointer, i.e. an array of strings
int length = 8; // Enough space for 8 char pointers
int input_str_size = 9; // Enough space for 8 bytes plus NUL
int i;


str_array = malloc(sizeof(char *) * length);
for (i = 0; i < length; i++) {
    str_array[i] = malloc(input_str_size);
    fscanf (infile, "%8s", str_array[i]); // %8s leaves room for the NUL in the 9-byte buffer
    // or fscanf(infile, "%8s", *(str_array + i));
}

The advantage of this second approach, as you can see, is that you can directly apply the array notation without the (possibly confusing) need for the address-of operator. The downside is that you don’t allocate all the memory ahead of time, so it may be slower to allocate small chunks of memory in a loop. This would be the way to go, however, if your strings were of different sizes.

Which of these two paths you choose is up to you, and it really depends on your application. However, just remember the difference between the two, and try not to mix them up like I did!
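
For what it’s worth, the mix I was originally after can be made to work if the pointers and the characters each get their own allocation and every pointer is aimed into the big block by hand. A rough sketch, assuming fixed-size slots; alloc_string_table and free_string_table are hypothetical names, not code from the assignment:

```c
#include <stdlib.h>

/*
 * Hypothetical helper: carves `count` slots of `slot_size` bytes each out of
 * ONE contiguous block, plus a pointer table so each slot can be reached as
 * table[i]. Returns NULL if either allocation fails.
 */
char **alloc_string_table(size_t count, size_t slot_size)
{
    char **table;
    char *block;
    size_t i;

    if (count == 0 || slot_size == 0)
        return NULL;

    table = malloc(count * sizeof *table);  /* memory for the pointers */
    if (table == NULL)
        return NULL;

    block = calloc(count, slot_size);       /* memory for the characters, zeroed */
    if (block == NULL) {
        free(table);
        return NULL;
    }

    for (i = 0; i < count; i++)
        table[i] = block + i * slot_size;   /* aim each pointer into the block */

    return table;
}

/* Releases a table built by alloc_string_table(). */
void free_string_table(char **table)
{
    if (table != NULL) {
        free(table[0]);  /* table[0] is the start of the big block */
        free(table);
    }
}
```

That’s two allocations no matter how many strings there are, and table[i] works with plain array notation. It only makes sense when every string fits in the same slot size.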


C Tidbit: Reading in “binary” from text file with strtol

February 22nd, 2009

Last night, I was working on an assignment for my cryptography class (writing SPN and Feistel ciphers in C), and I was looking for a way to read in “binary” from a text file. I.e. the file itself was not a binary file, but rather I was reading ASCII 0 and 1 characters from the file and interpreting them in my program. After a bit of searching, I came across the man page for atoi, which states that its behavior is the same as:

strtol(nptr, (char **) NULL, 10);

I’d never seen this “strtol” before, so I went and looked at its man page, and it turns out that this is the magic function that let me do what I wanted to do. That last argument is the numerical base, so I just do something like so:

long int myint = strtol(binary_str, NULL, 2);

Of course, this is after reading in a series of binary strings as character pointers from the file with fscanf. I.e. in the above, binary_str is a char * that contains something like “01101001”.

I just thought this was worth sharing, since without knowing about this function, anyone trying to do this has to end up writing annoying binary/decimal conversion functions. Not particularly difficult, but annoying nonetheless when the functionality is right there.
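
One caveat: strtol quietly stops at the first character it can’t use, so a typo in the input file goes unnoticed unless you check where it stopped. Here’s a hedged sketch of a wrapper that does that check. parse_binary is a name I made up for illustration, and rejecting anything but a clean run of digits is an assumption about what you’d want.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parses a string of ASCII '0'/'1' characters as a
 * base-2 number. Returns 0 and stores the value in *out on success;
 * returns -1 if nothing could be parsed, if other characters follow the
 * digits, or if the value does not fit in a long.
 */
int parse_binary(const char *text, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 2);

    if (end == text)        /* no digits were consumed at all */
        return -1;
    if (*end != '\0')       /* trailing junk, e.g. "0112" stops at the '2' */
        return -1;
    if (errno == ERANGE)    /* too many bits to fit in a long */
        return -1;

    *out = value;
    return 0;
}
```

Note that strtol still accepts leading whitespace and a sign, so " -101" parses as -5; reject those separately if that matters for your input.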


Embedding NULL in a C string

November 18th, 2008

I just solved this interesting little problem for myself, and it took me a little while (not that long, but long enough) to get it straight in my head. The problem at hand is this: suppose you want to embed a NULL character (for whatever reason) into a C string. That’s a bit of a quandary, since C uses that character (a.k.a. ASCII 0, or \0; strictly speaking it’s called NUL, while NULL is the null pointer) for string termination. So if you try something like:

printf("This string is\0awesome!\n");

You’ll be left in the cold wondering what this string is, because you’ll never see anything after the “is.” That’s right, you’ll never get to know that this string is, in fact, awesome. That’s rather annoying if, for example, you want to use NULL as a delimiter for something. One might want to do this since NULL characters are disallowed in things like file names (since they terminate them). So the question is “can we fix it?” And much to my own glee, the answer is “yes we can!”

My first attempt was to encode a %s conversion specifier in there and make its value “\0”, like so:

printf("This string is%sawesome!\n", "\0");

It’ll print out the full string with that invisible NULL, right? Unfortunately, this is incorrect. It will print the full string, but not the NULL. Why? Because as far as %s is concerned, “\0” is an empty string: %s copies characters up to, but not including, the first NULL, and here that is zero characters. So it looks like nothing has been put into the string, because nothing has. But nothing != NULL. Back to the drawing board.
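
A quick way to convince yourself of this is snprintf’s counting mode: given a null pointer for the buffer and a size of 0, it writes nothing and only reports how many characters the format would produce. A minimal sketch, with hypothetical function names:

```c
#include <stdio.h>
#include <string.h>

/* Number of characters the %s version of the format would produce. */
int length_with_empty_s(void)
{
    /* Null buffer and size 0: snprintf only counts, it writes nothing. */
    return snprintf(NULL, 0, "This string is%sawesome!\n", "\0");
}

/* Number of characters in the same text with the %s taken out entirely. */
int length_without_s(void)
{
    return (int)strlen("This string isawesome!\n");
}
```

Both functions come out to the same number, which is the whole problem: the %s contributed no characters at all.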

My next attempt was to encode a %c conversion specifier into the string and make its value ‘\0’, like so:

printf("This string is%cawesome!\n", '\0');

At first, it seemed to work, but I was dismayed. All I saw was the output of the original try (“This string is”). I guess that’s what I wanted, right? But it appears it didn’t do anything with the rest of the string. What to do? Well, I turned to printf’s close cousin, sprintf, to put the value into a string for later manipulation and printing. So I malloc’d some memory for a char *mystr, like so:

char *mystr = malloc(64); // arbitrary amount of memory

Then I shoved my string into mystr:

sprintf(mystr, "This string is%cawesome!\n", '\0');

Then I put in a line to print mystr:

printf("%s\n", mystr);

And… failure? Damn. I must be doing something wrong… Then it hit me: printf()’s %s will always stop at the first NULL character, but that doesn’t mean my string isn’t all there. The key here is that I allocated enough memory for that string, so I’ll be damned if my whole string isn’t there. To test my theory, I modified my line like so:

printf("%s", mystr+15);

For those of you who aren’t all that up on your C: since mystr is simply a pointer to the location in memory where my characters begin, mystr+15 is that location plus 15 bytes, which is one character after the inserted NULL byte. As I expected (and hoped), “awesome!” (with a newline) was printed! Success! Now, you might think back to my first attempt and wonder if that would work with sprintf, but it does not. It’s probably for the best anyway, since %c makes more of a point that you’re doing something despicable like putting NULL characters into your strings, and allows for easier changing if you decide to use something else in the future.
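
The magic number 15 can also be computed instead of counted by hand. A minimal sketch, with hypothetical helper names and snprintf swapped in for sprintf so the buffer size is respected: the return value counts every character written, embedded NULL included, and strlen() tells you where the first NULL sits.

```c
#include <stdio.h>
#include <string.h>

/*
 * Formats the example text into buf with an embedded NUL via %c.
 * Returns the count snprintf reports: every character written, including
 * the embedded NUL but not the terminating NUL that snprintf appends.
 */
int format_with_embedded_nul(char *buf, size_t bufsize)
{
    return snprintf(buf, bufsize, "This string is%cawesome!\n", '\0');
}

/*
 * Returns a pointer to the text just past the first NUL in buf.
 * Only meaningful when the caller knows more data follows that NUL.
 */
const char *after_first_nul(const char *buf)
{
    return buf + strlen(buf) + 1;
}
```

With a 64-byte buffer, format_with_embedded_nul() reports 24 characters, and something like fwrite(buf, 1, 24, stdout) would write all of them, NULL byte and all.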

So a final question comes up: if I’m shoving the character that’s used for string termination in my strings, how do I know when it’s really done? Well that’s a matter for a particular implementation. In mine, the newline character will be my buddy. For another implementation, one might decide to use calloc() instead and just look for more than one NULL character in a row. Either way it’s rather messy, but for my purposes I think it’ll work just fine.
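
As a rough sketch of the second idea, here is how a list that ends with two NULL characters in a row could be walked. count_fields is a hypothetical name, and the sketch assumes no field is empty (an empty field would look exactly like the end of the list).

```c
#include <stddef.h>
#include <string.h>

/*
 * Counts the fields in a buffer where each field ends with one NUL and the
 * whole list ends with one extra NUL. Assumes no field is itself empty.
 */
size_t count_fields(const char *list)
{
    size_t n = 0;

    while (*list != '\0') {        /* a NUL at the start of a field ends the list */
        n++;
        list += strlen(list) + 1;  /* skip this field and its terminating NUL */
    }
    return n;
}
```

A string literal such as "alpha\0beta\0gamma\0" already fits this layout, because the compiler appends one more NUL after the one written at the end.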
