Count Sort Algorithm

Question 1

I'm new to programming, but want to build up a good habit of code review so that I can develop best practice. It's been couple of weeks I'm coding in C and my knowledge, so far, is data types, conditions, arrays, loop and, very minimal, pointers.

I'm creating multiple search and sorting algorithms - not only to understand them better, but also solidifying my understanding of concepts.

Please review and provide your advice so that I can get better:

#include <stdio.h>
#include <cs50.h>
int main(void)
{
printf("How many numbers do you want to enter? ");
int len = get_int();
int input[len]; // creating an array to store values
for (int i=0; i<len; i++)
{
 printf("Enter %ith number: ", i+1);
 input[i] = get_int();
}
// determining minimum and maximum value
int min = input[0];
int max = input[0];
for (int i=1; i<len; i++)
{
 if (input[i]>max)
 {
 max = input[i];
 }
 if (input[i]<min)
 {
 min = input[i];
 }
}
// creating an array for all the digits needed
int digits[max+1];
for (int i=0; i<max+1; i++)
{
 digits[i] = 0; //giving all the elements of the array a default value
}
// counting numbers
for (int i=0; i<len; i++)
{
 digits[input[i]] = digits[input[i]] + 1; // increasing the value of default digit element (default is 0) by 1 each time
}
printf("Minimum = %i\n", min);
printf("Maximum = %i\n", max);
printf("Inputed integers are: ");
for (int i=0; i<len; i++)
{
 printf("%i ", input[i]);
}
printf("\n");
// count sort algorithm
int j=0;
for (int i=0; i<max+1; i++)
{
 if (digits[i] == 0) // if the default value is zero, the value is disregarded, because it's not used
 {
 //
 }
 else
 {
 do
 {
 input[j] = i; // updating the element's value with i
 j++; // updating j's value so that the next integer is stored in an updated address
 digits[i] = digits[i] - 1; // decreasing the count in digits by 1, tells that it's values are being accounted for
 }
 while(digits[i] != 0); //when digits hit 0, it becomes useless again and breaks the loop
 }
}
printf("The sorted numbers are: ");
for (int i=0; i<len; i++)
{
 printf("%i ", input[i]);
}
printf("\n");
}

Question 2

What is cs50.h?

Question 3

@Zeta I'm going through Harvard's cs50 course right now. cs50.h is a library they made, kind of like a training wheel for now. I know it's not standard practice, but I am going methodically in the course.

Question 4

cs50.h: mirror.cs50.net/library50/c/cs50-library-c-3.0/cs50.h

Question 5

regarding this line: for (int i=1; i<len; i++) what happens if the user only enters a single value?

Question 6

You haven't provided cs50.h, so I have to guess its contents. I guess that it contains

int get_int(); // probably { int k; scanf("%d", &k); return k; }

for this review, since this is the only unknown function in your code.

Either way, let us get started. While completely up to you, the innards of a function are usually indented, just like they are for a conditional construct. So all of your main would be indented:

int main(void)
{
 printf("How many numbers do you want to enter? ");
 int len = get_int();
 int input[len]; // creating an array to store values
 /* .. */

While your program only uses a single large main, where it doesn't seem to matter, it's tremendously helpful if your program consists of multiple functions. for example, you could have split your functionality into several, reusable functions:

void read_n_numbers(int * destination, size_t n);
void count_digits(const int * source, size_t n, int min, int max, int * digits);
void print_digit_count(const int * digits, int min, int max);

By the way, those additional min and max parameters? They already show a flaw with your algorithm. But more on that later.

Instead, let us focus on some things to be wary of.

Know what variable length arrays are

int input[len]; // creating an array to store values

This lines uses two C99 features: C++-style comments and variable length arrays. The comments don't pose any problem. The VLA on the other hand may. VLAs are often stored in an area close to your other variables, like len above. That area is rather limited in size, which can lead to problems if len is too large. Also, with C11, they got optional. Your compiler might or might not support those.

They're sufficient for a small toy program, but you have to make sure that things don't get too large.

Give a variable the correct type and make it constant if it shall not change

We're still just in the first few lines of your program. But did you check whether len is negative? Let's say the user inputs -1. We end up with

int input[-1];

Uh oh... If this was a program you want to use in a productive environment, you would check the sign of len, and, after the discussion above, also check that tit doesn't get too large (or dismiss VLA).

We can provide a small little helper to make sure that we get only positive integers:

size_t get_positive_int() {
 int k = get_int();
 while(k <= 0) {
 printf("Number must be positive, please try again: ");
 k = get_int();
 }
 return k;
}

Now we have an ease of mind when we get our len:

const size_t len = get_positive_int();

What? Why is len now size_t? Why is it const? Well, it's a size_t because that is the proper data type to measure the size of something. You can think of it as a unsigned long, but it's size depends on your system.

And why is len now const? Because it prevents accidental errors like this:

for( int i = 0; i < len++; ++i)
// ^^
// whoops

It seems unlikely that you accidentally change len. But why take a risk? const puts an ease on your mind.

Make exclusive branches exclusive

We're still following your function top-to-bottom. And while your for finding the minimum and maximum of the range works, it's not really friendly to our PC:

for (int i=1; i<len; i++)
{
 if (input[i]>max)
 {
 max = input[i];
 }
 if (input[i]<min)
 {
 min = input[i];
 }
}

You start with max = min. However, throughout your program, the following property will hold: min <= max. So if the input[i] is greater then max, it will certainly not be lesser than min. We can reflect that with an else:

for (int i=1; i<len; i++)
{
 if (input[i] > max)
 {
 max = input[i];
 }
 else if (input[i] < min) // << here
 {
 min = input[i];
 }
}

Now we only check whether input[i] is smaller than min if it wasn't aleady greater than max.

An error in disguise

Now we come to the part where your code is actually wrong. Sorry.

Let's say your user inputs the following numbers:

-3
-2
-1
0
1

What can go wrong with the following declaration?

// creating an array for all the digits needed
int digits[max+1];

You will have only memory for exactly two digits. That's not enough! You have to calculate the actual number of possible digits. Luckily, that is simple:

const int number_of_possible_digits = max - min + 1;
size_t digits[number_of_possible_digits]; // size_t because we're counting things again
// use `memset(digits, 0, sizeof(digits));` alternatively
for(int i = 0; i < number_of_possible_digits; ++i)
{
 digits[i] = 0;
}

"But wait", I here you say. "If I have now the following input, I would not have enough space for my digits:"

 10
 11
 12

"I would have only space for three digits, but in my assignment I use digits[input[i]]!" But we're not done yet. Keeping the example with negative numbers from above in mind, what goes wrong in your actual counting?

for (int i=0; i<len; i++)
{
 digits[input[i]] = digits[input[i]] + 1; // increasing the value of default digit element (default is 0) by 1 each time
}

You've probably guessed it. With our negative values, we end up with digits[-3]. This doesn't work. We have to use another technique for our access. Luckily, that's easy again:

for (int i=0; i<len; i++)
{
 digits[input[i] - min]++;
 // digits[ -3 - (-3)] = digits [-3 + 3] = digits[0]. All is well.
}

We have to adjust your algorithm slightly:

for (int i = 0, j = 0; i < number_of_possible_digits; i++)
{
 while (digits[i] != 0)
 {
 input[j] = i + min; // min now added here
 j++;
 digits[i]--;
 }
}

All we had to do is to add min to our index in our digits. Let's check that with an example:

The number of possible digits is 16 (10 - (-5) +1). After counting them, we end up with a digits array that looks like this:

{1,0,0,1,0,0,8,0,0,0,0,0,0,0,0,1}

Does input[0] get the correct value?

input[0] = 0 + -5

Yes it does.

To simplify a condition/loop

Allright, I changed a lot here, I have to admit. But the most important change is input[j] = i + min. What other changes did I do?

I've reduces j's scope. It is only available in the loop, so that you don't accidentally use it at some other point
I've removed all unecessary if and else.

I will show how I did the latter, so it gets easier to understand. First, if we do nothing in an if, but everything in an else, it makes sense to exchange those, e.g. go from

if (digits[i] == 0)
{
}
else
{
 // other stuff
}

to

if (digits[i] != 0)
{
 // other stuff
}

Next, if we look at our do-while, we will see that the conditions are the same:

if (digits[i] != 0)
{
 do
 {
 // other stuff
 }
 while(digits[i] != 0);
}

This means we can change it to a while loop, since the condition olds true. That's guaranteed by the if:

if (digits[i] != 0)
{
 while(digits[i])
 {
 // other stuff
 }
}

Now we remember that a while loop will not even get entered once, if the condition doesn't hold. So we can get rid of the if and we end up with simply

while(digits[i])
{
 // other stuff
}

Don't repeat yourself

You print the sorted number twice. And maybe you want to print the digits at some point, too. Create a function for that:

void print_numbers(const int * source, size_t length) {
 for (size_t i = 0; i<length; i++)
 {
 printf("%i ", source[i]);
 }
}

Keep in mind that that function won't work with digits if they are a size_t array.

To put things together

So, was your algorithm correct? For positive numbers, the answer is yes. But as soon as the user is allowed to input negative numbers, the answer was "no". However, the greatest weak point of your program is that it is a single large function. You don't need to split it into too small functions, but the actual counting algorithm kind of gets lost in the noise of the rest.

So for a future exercise, try to write

void counting_sort_inplace(int * values, size_t length){
 // ...
}

so that

counting_sort_inplace(input, len);

does exactly what your second part of the main did before.

Question 7

Thank you so much @Zeta for the review. I admit, I was looking at this from only positive integer standpoint and wasn't considering negatives. Couple of things: 1) I didn't use functions because it required me to use pointers, which I know almost nothing about. But I understand the importance of having as little code as possible inside main. 2) I did not know about size_t and const - I haven't read any book, so I'll have to read up on them more. I have a long way to go, but thank you very much for the feedback and corrections.

Question 8

@ahmed_imtiaz I thought that you haven't encountered pointers yet. Therefore I posed the future exercise. But you can think of values above as being the same as input, with the same usage, and you can probably already write counting_sort_inplace. By the way, if you think this post provides a complete answer, consider to accept it. I just noticed that you haven't accepted the review on your previous answer too, so you might be interested in this: meta.stackexchange.com/questions/5234/…. Good luck and have fun writing C!

Zeta ZetaZeta 19.6k2 gold badges57 silver badges90 bronze badges · Accepted Answer · 2017-02-18 10:42:04Z

You haven't provided cs50.h, so I have to guess its contents. I guess that it contains

int get_int(); // probably { int k; scanf("%d", &k); return k; }

for this review, since this is the only unknown function in your code.

Either way, let us get started. While completely up to you, the innards of a function are usually indented, just like they are for a conditional construct. So all of your main would be indented:

int main(void)
{
 printf("How many numbers do you want to enter? ");
 int len = get_int();
 int input[len]; // creating an array to store values
 /* .. */

While your program only uses a single large main, where it doesn't seem to matter, it's tremendously helpful if your program consists of multiple functions. for example, you could have split your functionality into several, reusable functions:

void read_n_numbers(int * destination, size_t n);
void count_digits(const int * source, size_t n, int min, int max, int * digits);
void print_digit_count(const int * digits, int min, int max);

By the way, those additional min and max parameters? They already show a flaw with your algorithm. But more on that later.

Instead, let us focus on some things to be wary of.

Know what variable length arrays are

int input[len]; // creating an array to store values

This lines uses two C99 features: C++-style comments and variable length arrays. The comments don't pose any problem. The VLA on the other hand may. VLAs are often stored in an area close to your other variables, like len above. That area is rather limited in size, which can lead to problems if len is too large. Also, with C11, they got optional. Your compiler might or might not support those.

They're sufficient for a small toy program, but you have to make sure that things don't get too large.

Give a variable the correct type and make it constant if it shall not change

We're still just in the first few lines of your program. But did you check whether len is negative? Let's say the user inputs -1. We end up with

int input[-1];

Uh oh... If this was a program you want to use in a productive environment, you would check the sign of len, and, after the discussion above, also check that tit doesn't get too large (or dismiss VLA).

We can provide a small little helper to make sure that we get only positive integers:

size_t get_positive_int() {
 int k = get_int();
 while(k <= 0) {
 printf("Number must be positive, please try again: ");
 k = get_int();
 }
 return k;
}

Now we have an ease of mind when we get our len:

const size_t len = get_positive_int();

What? Why is len now size_t? Why is it const? Well, it's a size_t because that is the proper data type to measure the size of something. You can think of it as a unsigned long, but it's size depends on your system.

And why is len now const? Because it prevents accidental errors like this:

for( int i = 0; i < len++; ++i)
// ^^
// whoops

It seems unlikely that you accidentally change len. But why take a risk? const puts an ease on your mind.

Make exclusive branches exclusive

We're still following your function top-to-bottom. And while your for finding the minimum and maximum of the range works, it's not really friendly to our PC:

for (int i=1; i<len; i++)
{
 if (input[i]>max)
 {
 max = input[i];
 }
 if (input[i]<min)
 {
 min = input[i];
 }
}

You start with max = min. However, throughout your program, the following property will hold: min <= max. So if the input[i] is greater then max, it will certainly not be lesser than min. We can reflect that with an else:

for (int i=1; i<len; i++)
{
 if (input[i] > max)
 {
 max = input[i];
 }
 else if (input[i] < min) // << here
 {
 min = input[i];
 }
}

Now we only check whether input[i] is smaller than min if it wasn't aleady greater than max.

An error in disguise

Now we come to the part where your code is actually wrong. Sorry.

Let's say your user inputs the following numbers:

-3
-2
-1
0
1

What can go wrong with the following declaration?

// creating an array for all the digits needed
int digits[max+1];

You will have only memory for exactly two digits. That's not enough! You have to calculate the actual number of possible digits. Luckily, that is simple:

const int number_of_possible_digits = max - min + 1;
size_t digits[number_of_possible_digits]; // size_t because we're counting things again
// use `memset(digits, 0, sizeof(digits));` alternatively
for(int i = 0; i < number_of_possible_digits; ++i)
{
 digits[i] = 0;
}

"But wait", I here you say. "If I have now the following input, I would not have enough space for my digits:"

 10
 11
 12

"I would have only space for three digits, but in my assignment I use digits[input[i]]!" But we're not done yet. Keeping the example with negative numbers from above in mind, what goes wrong in your actual counting?

for (int i=0; i<len; i++)
{
 digits[input[i]] = digits[input[i]] + 1; // increasing the value of default digit element (default is 0) by 1 each time
}

You've probably guessed it. With our negative values, we end up with digits[-3]. This doesn't work. We have to use another technique for our access. Luckily, that's easy again:

for (int i=0; i<len; i++)
{
 digits[input[i] - min]++;
 // digits[ -3 - (-3)] = digits [-3 + 3] = digits[0]. All is well.
}

We have to adjust your algorithm slightly:

for (int i = 0, j = 0; i < number_of_possible_digits; i++)
{
 while (digits[i] != 0)
 {
 input[j] = i + min; // min now added here
 j++;
 digits[i]--;
 }
}

All we had to do is to add min to our index in our digits. Let's check that with an example:

The number of possible digits is 16 (10 - (-5) +1). After counting them, we end up with a digits array that looks like this:

{1,0,0,1,0,0,8,0,0,0,0,0,0,0,0,1}

Does input[0] get the correct value?

input[0] = 0 + -5

Yes it does.

To simplify a condition/loop

Allright, I changed a lot here, I have to admit. But the most important change is input[j] = i + min. What other changes did I do?

I've reduces j's scope. It is only available in the loop, so that you don't accidentally use it at some other point
I've removed all unecessary if and else.

I will show how I did the latter, so it gets easier to understand. First, if we do nothing in an if, but everything in an else, it makes sense to exchange those, e.g. go from

if (digits[i] == 0)
{
}
else
{
 // other stuff
}

to

if (digits[i] != 0)
{
 // other stuff
}

Next, if we look at our do-while, we will see that the conditions are the same:

if (digits[i] != 0)
{
 do
 {
 // other stuff
 }
 while(digits[i] != 0);
}

This means we can change it to a while loop, since the condition olds true. That's guaranteed by the if:

if (digits[i] != 0)
{
 while(digits[i])
 {
 // other stuff
 }
}

Now we remember that a while loop will not even get entered once, if the condition doesn't hold. So we can get rid of the if and we end up with simply

while(digits[i])
{
 // other stuff
}

Don't repeat yourself

You print the sorted number twice. And maybe you want to print the digits at some point, too. Create a function for that:

void print_numbers(const int * source, size_t length) {
 for (size_t i = 0; i<length; i++)
 {
 printf("%i ", source[i]);
 }
}

Keep in mind that that function won't work with digits if they are a size_t array.

To put things together

So, was your algorithm correct? For positive numbers, the answer is yes. But as soon as the user is allowed to input negative numbers, the answer was "no". However, the greatest weak point of your program is that it is a single large function. You don't need to split it into too small functions, but the actual counting algorithm kind of gets lost in the noise of the rest.

So for a future exercise, try to write

void counting_sort_inplace(int * values, size_t length){
 // ...
}

so that

counting_sort_inplace(input, len);

does exactly what your second part of the main did before.

Thank you so much @Zeta for the review. I admit, I was looking at this from only positive integer standpoint and wasn't considering negatives. Couple of things: 1) I didn't use functions because it required me to use pointers, which I know almost nothing about. But I understand the importance of having as little code as possible inside main. 2) I did not know about size_t and const - I haven't read any book, so I'll have to read up on them more. I have a long way to go, but thank you very much for the feedback and corrections.
@ahmed_imtiaz I thought that you haven't encountered pointers yet. Therefore I posed the future exercise. But you can think of values above as being the same as input, with the same usage, and you can probably already write counting_sort_inplace. By the way, if you think this post provides a complete answer, consider to accept it. I just noticed that you haven't accepted the review on your previous answer too, so you might be interested in this: meta.stackexchange.com/questions/5234/…. Good luck and have fun writing C!

Stack Exchange Network

Count Sort Algorithm

1 Answer 1

Know what variable length arrays are

Give a variable the correct type and make it constant if it shall not change

Make exclusive branches exclusive

An error in disguise

To simplify a condition/loop

Don't repeat yourself

To put things together

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Count Sort Algorithm

1 Answer 1

Know what variable length arrays are

Give a variable the correct type and make it constant if it shall not change

Make exclusive branches exclusive

An error in disguise

To simplify a condition/loop

Don't repeat yourself

To put things together

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions