3
\$\begingroup\$

Write a function that a receives file name of a txt file, which contains unknown number of words of max length 10 each. Print the word with the most occurrences. Assume the input is correct and contains at least one word.

Code

#define MAX_CHARS 11
int mostFrequentWord(char* filename) {
 char ch;
 int i, j, k, occurrence, counter = 1, maximum = 0;
 char *index_max = NULL;
 char **arr;
 FILE *file = fopen(filename, "r");
 while((ch = fgetc(file)) != EOF) {
 if(ch == ' ') {
 counter++;
 }
 }
 rewind(file);
 arr = (char**) calloc(counter, sizeof(char*));
 for (k = 0; k < counter; k++) {
 arr[k] = (char*) calloc(MAX_CHARS, sizeof(char));
 }
 while ((ch = fgetc(file)) != EOF) {
 if (ch == ' '){
 i++;
 j = 0;
 } else {
 arr[i][j] = ch;
 j++;
 }
 }
 for (i= 0; i < counter; i++) {
 occurrence = 0;
 for (j = i; j < counter; j++) {
 if (!strcmp(*(arr + i), *(arr + j))){
 occurrence++;
 }
 }
 if (occurrence > maximum) {
 maximum = occurrence;
 index_max = *(arr+i);
 }
 }
 if (index_max != NULL){
 printf("The most frequent word is \"%s\" with %d occurrences.\n", index_max, maximum);
 }
 for ( i = 0; i < counter; i++ ) {
 free(arr[i]);
 }
 free(arr);
 return 0;
}

Example

Input (test.txt):

test one two test three test

Output:

The most frequent word is "test" with 3 occurrences.

One thing I think I'm missing (and not sure what's the way to do it) is to check the success of calling functions that use system resources.

Any review is appreciated.

200_success
145k22 gold badges190 silver badges478 bronze badges
asked Jun 15, 2018 at 6:58
\$\endgroup\$

2 Answers 2

3
\$\begingroup\$

Clearly employ contract requirements

"Write a function that ... contains unknown number of words of max length 10 each." should echo in code someplace a 10 named something like "words of max length". Use "size" for the size of the array. With strings, best to use "length" in the context of the number of characters before the '0円'.

//#define MAX_CHARS 11
#define WORD_MAX_LENGTH 10 
#define WORD_SIZE (WORD_MAX_LENGTH + 1)
... 
// calloc(MAX_CHARS, ...
calloc(WORD_SIZE, ...

Use int with fgetc()

fgetc() returns values in the unsigned char range and the negative EOF. These typical 257 different values will not distinctively fit in an 8-bit char. With original code, should char be an unsigned char, your loop is infinite.

// char ch;
int ch;
...
while((ch = fgetc(file)) != EOF) {

Check return value of fopen() and calloc()

Should the open fail, detect it.

FILE *file = fopen(filename, "r");
if (file == NULL) {
 fprintf(stderr, "Unable to open file <%s>\n", filename);
 return 0;
}

Should the allocation of more than 0 bytes returnNULL,detect it.

arr = (char**) calloc(counter, sizeof(char*));
if (arr == NULL && counter != 0) {
 fprintf(stderr, "Out of memory\n");
 return 0;
}

Improve *alloc() style

Cast not needed.
Allocate to the size of the object, not the type. Easier to code right, review and maintain.

// arr = (char**) calloc(counter, sizeof(char*));
arr = calloc(counter, sizeof *arr);

Good use of sentinels in printing strings

 // v v
 printf("The most frequent word is \"%s\" ...

Lack of bounds checking

Code should ensure i, j are not to big.

if (i >= counter) Handle_Error_Somehow();
i++;
if (j >= MAX_CHARS) Handle_Error_Somehow();
else { 
 arr[i][j] = ch;
}

Missing fclose()

fclose(file);

Little bits.

I'd expect size_t counter, i, j, k for array indexing as size_t is neither too narrow nor too wide. Be mindful it is an unsigned type.

Style: if (index_max != NULL){ is replaceable with if (index_max) {. I find the the 2nd clearer.

answered Jun 16, 2018 at 13:13
\$\endgroup\$
1
\$\begingroup\$

Because I can't comment chuxs answer I have to use a new answer. Like him I have multiple problems with the call of the calloc-function but I will show them step by step.

1. Introduce variables where you need them

char **arr;
//... many lines ...
arr = (char**) calloc(counter, sizeof(char*));

should be changed to:

char **arr = (char**) calloc(counter, sizeof(char*));

By introducing them right where you need them you clarify the code a lot. A possible reader does not need to go to the top to find out what kind of variable you are assigning a value to.

2. Casting of malloc/calloc/realloc

You don't need to cast the result of the functions malloc, calloc or realloc, because you might hide errors. See question for more details.

Better version:

char **arr = calloc(counter, sizeof(char*));

3. Changing sizeof() parameter

You can change the line to this:

char **arr = calloc(counter, sizeof(*arr));

If you would change the type of arr to a different one the sizeof() is updated automatically since it is evaluated at compile time.

answered Jun 20, 2018 at 10:42
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.