I'm new to programming, but want to build up a good habit of code review so that I can develop best practice. It's been couple of weeks I'm coding in C and my knowledge, so far, is data types, conditions, arrays, loop and, very minimal, pointers.
I'm creating multiple search and sorting algorithms - not only to understand them better, but also solidifying my understanding of concepts.
Please review and provide your advice so that I can get better:
#include <stdio.h>
#include <cs50.h>
int main(void)
{
printf("How many numbers do you want to enter? ");
int len = get_int();
int input[len]; // creating an array to store values
for (int i=0; i<len; i++)
{
printf("Enter %ith number: ", i+1);
input[i] = get_int();
}
// determining minimum and maximum value
int min = input[0];
int max = input[0];
for (int i=1; i<len; i++)
{
if (input[i]>max)
{
max = input[i];
}
if (input[i]<min)
{
min = input[i];
}
}
// creating an array for all the digits needed
int digits[max+1];
for (int i=0; i<max+1; i++)
{
digits[i] = 0; //giving all the elements of the array a default value
}
// counting numbers
for (int i=0; i<len; i++)
{
digits[input[i]] = digits[input[i]] + 1; // increasing the value of default digit element (default is 0) by 1 each time
}
printf("Minimum = %i\n", min);
printf("Maximum = %i\n", max);
printf("Inputed integers are: ");
for (int i=0; i<len; i++)
{
printf("%i ", input[i]);
}
printf("\n");
// count sort algorithm
int j=0;
for (int i=0; i<max+1; i++)
{
if (digits[i] == 0) // if the default value is zero, the value is disregarded, because it's not used
{
//
}
else
{
do
{
input[j] = i; // updating the element's value with i
j++; // updating j's value so that the next integer is stored in an updated address
digits[i] = digits[i] - 1; // decreasing the count in digits by 1, tells that it's values are being accounted for
}
while(digits[i] != 0); //when digits hit 0, it becomes useless again and breaks the loop
}
}
printf("The sorted numbers are: ");
for (int i=0; i<len; i++)
{
printf("%i ", input[i]);
}
printf("\n");
}
1 Answer 1
You haven't provided cs50.h
, so I have to guess its contents. I guess that it contains
int get_int(); // probably { int k; scanf("%d", &k); return k; }
for this review, since this is the only unknown function in your code.
Either way, let us get started. While completely up to you, the innards of a function are usually indented, just like they are for a conditional construct. So all of your main
would be indented:
int main(void)
{
printf("How many numbers do you want to enter? ");
int len = get_int();
int input[len]; // creating an array to store values
/* .. */
While your program only uses a single large main
, where it doesn't seem to matter, it's tremendously helpful if your program consists of multiple functions. for example, you could have split your functionality into several, reusable functions:
void read_n_numbers(int * destination, size_t n);
void count_digits(const int * source, size_t n, int min, int max, int * digits);
void print_digit_count(const int * digits, int min, int max);
By the way, those additional min
and max
parameters? They already show a flaw with your algorithm. But more on that later.
Instead, let us focus on some things to be wary of.
Know what variable length arrays are
int input[len]; // creating an array to store values
This lines uses two C99 features: C++-style comments and variable length arrays. The comments don't pose any problem. The VLA on the other hand may. VLAs are often stored in an area close to your other variables, like len
above. That area is rather limited in size, which can lead to problems if len
is too large. Also, with C11, they got optional. Your compiler might or might not support those.
They're sufficient for a small toy program, but you have to make sure that things don't get too large.
Give a variable the correct type and make it constant if it shall not change
We're still just in the first few lines of your program. But did you check whether len
is negative? Let's say the user inputs -1
. We end up with
int input[-1];
Uh oh... If this was a program you want to use in a productive environment, you would check the sign of len
, and, after the discussion above, also check that tit doesn't get too large (or dismiss VLA).
We can provide a small little helper to make sure that we get only positive integers:
size_t get_positive_int() {
int k = get_int();
while(k <= 0) {
printf("Number must be positive, please try again: ");
k = get_int();
}
return k;
}
Now we have an ease of mind when we get our len
:
const size_t len = get_positive_int();
What? Why is len
now size_t
? Why is it const
? Well, it's a size_t
because that is the proper data type to measure the size of something. You can think of it as a unsigned long
, but it's size depends on your system.
And why is len
now const
? Because it prevents accidental errors like this:
for( int i = 0; i < len++; ++i)
// ^^
// whoops
It seems unlikely that you accidentally change len
. But why take a risk? const
puts an ease on your mind.
Make exclusive branches exclusive
We're still following your function top-to-bottom. And while your for finding the minimum and maximum of the range works, it's not really friendly to our PC:
for (int i=1; i<len; i++)
{
if (input[i]>max)
{
max = input[i];
}
if (input[i]<min)
{
min = input[i];
}
}
You start with max = min
. However, throughout your program, the following property will hold: min <= max
. So if the input[i]
is greater then max
, it will certainly not be lesser than min
. We can reflect that with an else
:
for (int i=1; i<len; i++)
{
if (input[i] > max)
{
max = input[i];
}
else if (input[i] < min) // << here
{
min = input[i];
}
}
Now we only check whether input[i]
is smaller than min
if it wasn't aleady greater than max
.
An error in disguise
Now we come to the part where your code is actually wrong. Sorry.
Let's say your user inputs the following numbers:
-3
-2
-1
0
1
What can go wrong with the following declaration?
// creating an array for all the digits needed
int digits[max+1];
You will have only memory for exactly two digits. That's not enough! You have to calculate the actual number of possible digits. Luckily, that is simple:
const int number_of_possible_digits = max - min + 1;
size_t digits[number_of_possible_digits]; // size_t because we're counting things again
// use `memset(digits, 0, sizeof(digits));` alternatively
for(int i = 0; i < number_of_possible_digits; ++i)
{
digits[i] = 0;
}
"But wait", I here you say. "If I have now the following input, I would not have enough space for my digits:"
10
11
12
"I would have only space for three digits, but in my assignment I use digits[input[i]]
!" But we're not done yet. Keeping the example with negative numbers from above in mind, what goes wrong in your actual counting?
for (int i=0; i<len; i++)
{
digits[input[i]] = digits[input[i]] + 1; // increasing the value of default digit element (default is 0) by 1 each time
}
You've probably guessed it. With our negative values, we end up with digits[-3]
. This doesn't work. We have to use another technique for our access. Luckily, that's easy again:
for (int i=0; i<len; i++)
{
digits[input[i] - min]++;
// digits[ -3 - (-3)] = digits [-3 + 3] = digits[0]. All is well.
}
We have to adjust your algorithm slightly:
for (int i = 0, j = 0; i < number_of_possible_digits; i++)
{
while (digits[i] != 0)
{
input[j] = i + min; // min now added here
j++;
digits[i]--;
}
}
All we had to do is to add min
to our index in our digits. Let's check that with an example:
-5
-2
1
1
1
1
1
1
1
1
10
The number of possible digits is 16 (10 - (-5) +1). After counting them, we end up with a digits array that looks like this:
{1,0,0,1,0,0,8,0,0,0,0,0,0,0,0,1}
Does input[0]
get the correct value?
input[0] = 0 + -5
Yes it does.
To simplify a condition/loop
Allright, I changed a lot here, I have to admit. But the most important change is input[j] = i + min
. What other changes did I do?
- I've reduces
j
's scope. It is only available in the loop, so that you don't accidentally use it at some other point - I've removed all unecessary
if
andelse
.
I will show how I did the latter, so it gets easier to understand. First, if we do nothing in an if
, but everything in an else
, it makes sense to exchange those, e.g. go from
if (digits[i] == 0)
{
}
else
{
// other stuff
}
to
if (digits[i] != 0)
{
// other stuff
}
Next, if we look at our do-while
, we will see that the conditions are the same:
if (digits[i] != 0)
{
do
{
// other stuff
}
while(digits[i] != 0);
}
This means we can change it to a while loop, since the condition olds true. That's guaranteed by the if
:
if (digits[i] != 0)
{
while(digits[i])
{
// other stuff
}
}
Now we remember that a while loop will not even get entered once, if the condition doesn't hold. So we can get rid of the if
and we end up with simply
while(digits[i])
{
// other stuff
}
Don't repeat yourself
You print the sorted number twice. And maybe you want to print the digits at some point, too. Create a function for that:
void print_numbers(const int * source, size_t length) {
for (size_t i = 0; i<length; i++)
{
printf("%i ", source[i]);
}
}
Keep in mind that that function won't work with digits
if they are a size_t
array.
To put things together
So, was your algorithm correct? For positive numbers, the answer is yes. But as soon as the user is allowed to input negative numbers, the answer was "no". However, the greatest weak point of your program is that it is a single large function. You don't need to split it into too small functions, but the actual counting algorithm kind of gets lost in the noise of the rest.
So for a future exercise, try to write
void counting_sort_inplace(int * values, size_t length){
// ...
}
so that
counting_sort_inplace(input, len);
does exactly what your second part of the main
did before.
-
\$\begingroup\$ Thank you so much @Zeta for the review. I admit, I was looking at this from only positive integer standpoint and wasn't considering negatives. Couple of things: 1) I didn't use functions because it required me to use pointers, which I know almost nothing about. But I understand the importance of having as little code as possible inside main. 2) I did not know about
size_t
andconst
- I haven't read any book, so I'll have to read up on them more. I have a long way to go, but thank you very much for the feedback and corrections. \$\endgroup\$ahmed_imtiaz– ahmed_imtiaz2017年02月19日 02:32:28 +00:00Commented Feb 19, 2017 at 2:32 -
\$\begingroup\$ @ahmed_imtiaz I thought that you haven't encountered pointers yet. Therefore I posed the future exercise. But you can think of
values
above as being the same asinput
, with the same usage, and you can probably already writecounting_sort_inplace
. By the way, if you think this post provides a complete answer, consider to accept it. I just noticed that you haven't accepted the review on your previous answer too, so you might be interested in this: meta.stackexchange.com/questions/5234/…. Good luck and have fun writing C! \$\endgroup\$Zeta– Zeta2017年02月19日 10:28:39 +00:00Commented Feb 19, 2017 at 10:28
cs50.h
? \$\endgroup\$cs50.h
is a library they made, kind of like a training wheel for now. I know it's not standard practice, but I am going methodically in the course. \$\endgroup\$cs50.h
: mirror.cs50.net/library50/c/cs50-library-c-3.0/cs50.h \$\endgroup\$for (int i=1; i<len; i++)
what happens if the user only enters a single value? \$\endgroup\$