I want to read multidimensional double array from file into a single pointer variable. The number of rows and columns are not known and need to be determined when reading the file in order to allocate memory for the array. I wrote the following code and it works fine with me. I want to know if someone has more efficient solution for such case:
//Initialize variables and open the file
size_t count=1000;
char *line = malloc(1000);
FILE *file= fopen("g.txt", "r");
if(file==NULL)
{
printf("Error in Opening file");
return EXIT_FAILURE;
}
double *data=(double*) malloc(1000*sizeof(double));
if(data==NULL)
{
printf("Error in allocating memory");
return EXIT_FAILURE;
}
// Read number of columns and number of rows
getline(&line, &count, file);
int read=-1,cur=0,columCount=0;
while(sscanf(line+cur, "%lf%n", &data[columCount],&read)==1)
{cur+=read;columCount=columCount+1;}
int rowCount=1;
while(getline(&line, &count, file)!=-1) {rowCount=rowCount+1;}
rewind(file);
// Reinitialize array using the number of rows and number of columns
free(data);
data=(double*) malloc(columCount*rowCount*sizeof(double));
if(data==NULL)
{
printf("Error in allocating memory");
return EXIT_FAILURE;
}
// Read file and store values
int i=0;
while(getline(&line, &count, file)!=-1)
{
read=-1,cur=0;
while(sscanf(line+cur, "%lf%n", &data[i],&read)==1)
{cur+=read;i=i+1;}
}
fclose(file);
return EXIT_SUCCESS;
-
2\$\begingroup\$ 1. Harmonize your coding style! This especially includes the positions of braces and the number of spaces contained in one indention level (2 or 4?). 2. Do not cast the result of malloc()! \$\endgroup\$ComFreek– ComFreek2013年11月29日 19:35:24 +00:00Commented Nov 29, 2013 at 19:35
1 Answer 1
You have written everything in one function (main
probably) but it is better
split into its components. Essentially what you want to do is:
open input file
get column count
get line count
rewind file
allocate memory for the array
read array data
Some of these stages could and probably should be in separate functions.
Taking it from the top, you open the input file, with the file name
hard-coded. A better approach is to get the filename from the command line -
argv[1]
. You have checked for failure to open the file, which is good.
The normal error handling is to use perror
and the name of the failed file:
FILE *f = fopen(argv[1], "r");
if (!f) {
perror(argv[1]);
return EXIT_FAILURE;
}
You then allocate buffer data
(again the error handling should use
perror
). You then use this buffer in a read loop to count the entries in
each line, but you really don't need to save the values in the buffer at all
at this stage - you discard them anyway. Your read loop could equally be:
while (sscanf(line+cur, "%*lf%n", &read) == 1) {
cur += read;
columCount++;
}
Notice that the format string uses a '*' to suppress assignment of the
floating point value read. Note also the improved formatting and the use of
++
in columCount++
. Note also that you don't check for failure in reading
the first line.
Edit: Abbas pointed out that adding the '*' to suppress assignment means that we must check for 0 returned from
sscanf
, not 1. But sscanf
also returns 0 when it fails to read anything (in this case the read count in read
would also contain 0). Hence a bad character in the line will cause the loop to hang. So we must either not suppress the assignment:
double d;
while (sscanf(line+cur, "%lf%n", &d, &read) == 1) {
cur += read;
columCount++;
}
or we must check read
.
while ((sscanf(line+cur, "%*lf%n", &read) == 0) && (read != 0)) {
cur += read;
columCount++;
}
I think I would go for the first. End of edit.
You then count the lines by looping through them. Note that you use a fixed
size buffer line
, allocated at the beginning, to read each line. To be
completely flexible, you should handle lines of arbitrary length. Your lines
are fixed at 1000 chars and anything beyond that breaks the code. Since you
are using getline
(which is non-standard, but useful) I would make use of its
ability to allocate a line for you by passing a zero size.
char *line = NULL;
size_t count = 0; // EDIT: Abbas pointed out we need to add count=0
getline(&line, &count, file);
// use the line
free(line);
In this, line
is just a pointer. getline
allocates a buffer large enough
to hold the line of text and places a pointer to the buffer in line
. You
must free that line later.
So now you have the column and row count. Those two stages (reading the two sizes) would be better extracted into a separate function.
You now rewind the file and allocate the real array. As stated above, this should be the first allocation of the program. Following this you read the data into the array. This might be better split into a separate function. The read loop can be simpler than yours - you don't need to get a line and then scan it - just scan it directly:
int ndata = 0;
while (scanf("%lg", &data[ndata++]) == 1) {
// nothing
}
ndata -= columCount * rowCount;
if (ndata != 0) {
printf("%s data values in input\n", ndata > 0 ? "Too few" : "Too many");
return EXIT_FAILURE;
}
And remember to check that the number of doubles read matches what you were expecting.
Note that scanf
and its associates do not react well to malformed input. There are other ways of reading text but in the simple case (of properly formed data) they are more complicated. Someone else might give you some examples/alternatives.
Finally note that your formatting is inconsistent, as if bits have been cut and pasted from elsewhere. Copying code from your previous work is not necessarily bad, but make sure you make the whole thing looks as if it belongs together.
-
\$\begingroup\$ Thanks for the excellent and comprehensive answer. However, when I applied your code hints, I found some bugs, and I corrected them as follows: First) getline requires that the first two parameters to be assigned to null and 0, respectively, before calling the function. So, getline(&line, 0, file) gave me an error, however, char *line=NULL; int count=0; getline(&line, &count, file); works. Second) The value 1 should be replaced by 0 in while (sscanf(line+cur, "%*lf%n", &read) == 1). Third) The value 0 should be replaced by 1 in if (ndata != 0). Is what I did correct? \$\endgroup\$Abbas– Abbas2013年11月30日 11:58:06 +00:00Commented Nov 30, 2013 at 11:58
-
\$\begingroup\$ Oops! Your first point is correct. Your second is correct but throws up the problem that a bad character in the input will cause and endless loop. I'll edit my answer to comment on this. The third, I can't see why
ndata
should be 1 at the end of the loop. \$\endgroup\$William Morris– William Morris2013年11月30日 17:05:26 +00:00Commented Nov 30, 2013 at 17:05 -
\$\begingroup\$ Thanks @William Morris. Regarding the third point: it is because you put data[ndata++] in the while condition, so, if the array has 20 elements, after executing the while loop ndata will have the value 22 (This is because the ++ executed even if the while condition result is false). One final note: in the while loop, you put it scanf, but, it should be fscanf(f,"%lg",....) \$\endgroup\$Abbas– Abbas2013年11月30日 17:39:29 +00:00Commented Nov 30, 2013 at 17:39
-
\$\begingroup\$ Ah yes, you are right. The perils of using a
while
loop when afor
would have been more appropriate :-) \$\endgroup\$William Morris– William Morris2013年11月30日 19:01:34 +00:00Commented Nov 30, 2013 at 19:01