
I recently learned about storage classes in C. In particular I was fascinated by the static storage class. Coming from Haskell I eschew the concept of passing an output buffer to a function to obtain a result. For example consider the following readfile function:

#include <stdio.h>
void readfile(const char * filename, char * contents, size_t size) {
 FILE * file = fopen(filename, "rb");
 fread(contents, size, 1, file);
 contents[size] = 0;
 fclose(file);
}

There are several reasons I don't like code like this:

  1. The use of void as a return type irks me. I don't know why. It just does.
  2. Passing an output buffer to a function seems unnatural. Mutable state is error-prone.
  3. You shouldn't have to pass the number of bytes to read as an input parameter to the function.
  4. You need to create a buffer manually and predict the size of the file.

Due to these problems I rewrote the above code as follows:

#include <stdio.h>
#include <malloc.h>
char * readfile(const char * filename) {
 FILE * file = fopen(filename, "rb");
 fseek(file, 0, SEEK_END);
 size_t size = ftell(file);
 fseek(file, 0, SEEK_SET);
 static char * contents;
 contents = malloc(size + 1);
 fread(contents, size, 1, file);
 fclose(file);
 contents[size] = 0;
 return contents;
}

Now all you need to do to read the contents of a file is pass the filename to readfile. It allocates space for the contents of the file and returns a pointer to the newly created buffer. The only bookkeeping you need to do is to free the buffer once you're done with it.

As you can see in the above code I have declared contents as static so that there's only one instance of that variable, and so that you can return it without the compiler giving you a warning. In my opinion this is a cleaner solution than using global variables.

Nevertheless, I am skeptical about using static in production code: partly because I am afraid of mutable state coupled with shared variables, and partly because this is the first time I am using it. What are the potential risks of using static as demonstrated?

For example, could the above code give erroneous results when you're reading two files concurrently? How do I address these problems without reverting to C-style code (e.g. passing an output buffer to the function)?

asked Feb 9, 2014 at 7:40
  • 2
    What warning do you get when you remove static? I don't get any. The only situation I can think the compiler would warn you is if it were a non-static array you were returning; but it's not an array, it's a pointer. Commented Feb 9, 2014 at 7:45
  • 2
    You don't understand the difference between arrays and pointers. If contents were an auto (i.e. non-static) array and you returned it, you would invoke undefined behavior: the array would be converted to a pointer, but by the time the caller used that pointer the array's lifetime would already have ended. But since you are allocating the buffer dynamically, you don't need static: the value of the pointer is copied upon returning from the function. Commented Feb 9, 2014 at 7:48
  • 3
    Also, you coming from a functional language heavily shows itself :P you are mythically afraid of state, mutability and a function which is not returning values. Don't do that. C is a procedural and imperative programming language. Don't try to program in Haskell when you are programming in C. Commented Feb 9, 2014 at 7:49
  • 4
    @AaditMShah <malloc.h> is not standard. <stdlib.h> is, and it's <stdlib.h> that declares malloc(). Commented Feb 9, 2014 at 7:50
  • 2
    @AaditMShah regarding your last line, you should write only C-style code in C :) Commented Feb 9, 2014 at 7:55

3 Answers

5

You almost never want to declare a local variable as static in C.

Making a variable static essentially makes it a global variable whose name is not accessible outside of that function. As you're likely well aware, global variables can be problematic. Suppose you ran readfile from two different threads: the first thread could call malloc and store the result in contents, then the second thread could call malloc and store its result in contents, and then the first thread would fread into the buffer the second thread allocated. That is undesirable in any case, and it could cause a buffer overrun if the file being read on the second thread was the smaller one.

You might have been tempted to use static because contents was previously an array. If so, the compiler would have rightly warned that you can't return it to the caller: the array decays to a pointer, but as soon as the function returns, the local array is destroyed and the pointer becomes invalid. Declaring the array static makes it valid to return: because it is effectively a global variable, it isn't destroyed when the function exits and the pointer remains valid. There are still problems if you use it with threads, though.
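The shared-storage problem shows up even without threads. Here is a minimal sketch, using a hypothetical greet function that is not part of the question's code, in which every call hands back a pointer to the same static array:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper for illustration only: formats a greeting into a
 * static array and returns a pointer to it. The array outlives the call,
 * so returning it is valid, but every call shares the same storage. */
const char *greet(const char *name) {
    static char buffer[64];
    snprintf(buffer, sizeof buffer, "hello, %s", name);
    return buffer;
}
```

If you call greet("a") and then greet("b"), both returned pointers compare equal, and the text written by the first call has been overwritten by the second. That is the single-threaded version of the race described above.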

The only time you might want to use static is if you've got some constant data only used within a function. For example:

static const int some_integers[] = { 1, 2, 3, 4 };

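Because the array has static storage duration, it is initialized once instead of typically being rebuilt on the stack on every call, so you can save some stack space.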

Lest I forget to mention your specific code, removing static makes it work as desired. If this code were to be used in real life, I'd make sure to add some error checking, as almost all of the functions you call can fail and will signal that only through a return value.
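As a rough sketch of what that might look like (one possible shape, not the only one), here is readfile with static removed, <stdlib.h> in place of <malloc.h>, and each call checked. Returning NULL on every failure is my own convention here, not something the question specifies:

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads a whole file into a freshly allocated, NUL-terminated buffer.
 * Returns NULL on any failure; the caller must free() the result. */
char *readfile(const char *filename) {
    FILE *file = fopen(filename, "rb");
    if (file == NULL)
        return NULL;

    char *contents = NULL;   /* plain local: each call gets its own pointer */
    long size = 0;

    /* Find the file size; every one of these calls can fail. */
    if (fseek(file, 0, SEEK_END) != 0 ||
        (size = ftell(file)) < 0 ||
        fseek(file, 0, SEEK_SET) != 0)
        goto done;

    contents = malloc((size_t)size + 1);   /* +1 for the terminator */
    if (contents == NULL)
        goto done;

    /* Treat a short read as an error rather than returning partial data. */
    if (fread(contents, 1, (size_t)size, file) != (size_t)size) {
        free(contents);
        contents = NULL;
        goto done;
    }
    contents[size] = '\0';

done:
    fclose(file);
    return contents;
}
```

Because contents is an ordinary local, two threads calling this at the same time each work on their own buffer. The caller checks the result against NULL and calls free on it when done.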

answered Feb 9, 2014 at 8:05

0

Using static in this case will effectively cause the variable to be stored in the global data segment of the process's memory. However, it will only have local scope (only this function sees it). Since you are likely to be tracking the pointer some other way (through the return value), static is unnecessary here and wastes a pointer's worth of global data memory (probably not a big deal).

answered Feb 9, 2014 at 7:59


0

There are also POSIX functions like stat(2) and fstat(2) that allow you to find out the exact size of the file. You can use this to allocate the buffer and read the data all in one go.
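A minimal POSIX-only sketch of that idea follows. The name readfile_fstat and the choice to return NULL on failure are mine, and it uses open/read rather than stdio so that fstat can be called on the descriptor directly:

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* POSIX-only sketch: size the buffer with fstat() instead of fseek/ftell.
 * Returns a NUL-terminated buffer the caller must free(), or NULL. */
char *readfile_fstat(const char *filename) {
    int fd = open(filename, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat info;
    char *contents = NULL;

    /* st_size is an off_t; only regular files report a meaningful size. */
    if (fstat(fd, &info) == 0 && S_ISREG(info.st_mode)) {
        size_t size = (size_t)info.st_size;
        contents = malloc(size + 1);
        if (contents != NULL) {
            size_t done = 0;
            /* read() may return fewer bytes than requested, so loop. */
            while (done < size) {
                ssize_t n = read(fd, contents + done, size - done);
                if (n <= 0)
                    break;   /* error or unexpected end of file */
                done += (size_t)n;
            }
            if (done == size) {
                contents[size] = '\0';
            } else {
                free(contents);
                contents = NULL;
            }
        }
    }

    close(fd);
    return contents;
}
```

Calling fstat on the already-open descriptor, rather than stat on the name, means the size and the data come from the same file even if the name is replaced in between.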

If your file is rather large, you could also use the mmap(2) system call, or CreateFileMapping and MapViewOfFile on Windows, to have the operating system do much of the heavy lifting for you.
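For the POSIX side, a rough sketch might look like the following. The helper name map_file and its out-parameter for the length are assumptions of mine, and the Windows calls are not shown:

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* POSIX-only sketch: map a file read-only instead of copying it.
 * On success returns the mapping and stores its length in *length.
 * Returns NULL on failure, including for an empty file, because
 * mmap() rejects a zero-length mapping. */
const char *map_file(const char *filename, size_t *length) {
    int fd = open(filename, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat info;
    void *mapping = MAP_FAILED;

    if (fstat(fd, &info) == 0 && info.st_size > 0)
        mapping = mmap(NULL, (size_t)info.st_size, PROT_READ,
                       MAP_PRIVATE, fd, 0);

    close(fd);   /* the mapping stays valid after the descriptor is closed */

    if (mapping == MAP_FAILED)
        return NULL;

    *length = (size_t)info.st_size;
    return mapping;
}
```

Note that the mapped bytes are not NUL-terminated, so the caller has to work with the returned length, and the mapping is released with munmap rather than free.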

answered Feb 9, 2014 at 7:55

6 Comments

What if the file size is 2 GB?
stat and fstat use an off_t to track the file size. On many systems this is a 64-bit integer, so 2 GB is not an issue there.
@Desolator Well, what? Nothing special.
Yes, it can allocate that on x64, but it fills up memory without a real purpose. It also takes too much time to allocate such a huge space.
Yeah, that's a fair point. In that case I recommend mapping the file using mmap or CreateFileMapping/MapViewOfFile on Windows.