I am trying to code up a basic combinatorial math library for C++. Started with the implementation of the factorial function. Here I include the code and the test code. Please review.
- Is the way of throwing argument exceptions the best way to do it?
- Am I using the Test code the correct way for testing exceptions?
- Any other advice will be welcome.
N.B: I have seen that the boost library uses a tabularized approach with gamma functions. I shall do these optimizations later.
namespace Combinatorics
{
unsigned long long factorial(int n)
{
if (n < 0)
throw std::invalid_argument("The argument cannot be negative");
unsigned long long fact = 1;
for (int i = 1; i <= n; i++)
{
if ((ULLONG_MAX / i) >= fact)
{
fact = fact * i;
}
else
{
throw std::invalid_argument("The argument is too large");
}
}
return fact;
}
}
TEST(CombinatorialTestCase, Factorial)
{
EXPECT_EQ(120, Combinatorics::factorial(5));
EXPECT_EQ(1, Combinatorics::factorial(0));
EXPECT_EQ(1, Combinatorics::factorial(1));
try
{
EXPECT_EQ(3, Combinatorics::factorial(-1));
}
catch (std::invalid_argument &exception)
{
EXPECT_EQ(exception.what(), std::string("The argument cannot be negative"));
}
catch (...)
{
FAIL();
}
try
{
EXPECT_EQ(500, Combinatorics::factorial(500));
}
catch (std::invalid_argument& exception)
{
EXPECT_EQ(exception.what(), std::string("The argument is too large"));
}
catch (...)
{
FAIL();
}
}
2 Answers 2
We're missing some required headers:
#include <climits>
#include <stdexcept>
#include <string>
#include <gtest/gtest.h>
We can reduce the number of division operations here:
if ((ULLONG_MAX / i) >= fact)
If we use ULLONG_MAX / n
instead, we get the exception in the same cases, but this doesn't change during the loop and the compiler can hoist the calculation to perform the division just once.
If we swap the if
/else
around, we find we don't need else
because throwing an exception breaks the flow of control:
if (fact > ULLONG_MAX / n) { throw std::invalid_argument("The argument is too large"); } fact *= i;
Since negative numbers don't have factorials, it may be better to accept an unsigned type as argument, so we don't need the n < 0
test.
There's no need to check whether other exceptions are thrown:
catch(...) { FAIL(); }
This is actually harmful, because it gives a information-free failure, whereas GTest's own handling of exceptions will report the exception that was thrown.
If the concern is that any exception will abort the test case and not run subsequent tests, then the fix is simple - create individual tests for each value.
In any case, instead of using our own try
/catch
, we could do better by using the framework's provided EXPECT_THROW
macro:
TEST(Combinatorial_Factorial, small_ints)
{
EXPECT_EQ(120, Combinatorics::factorial(5));
EXPECT_EQ(1, Combinatorics::factorial(0));
EXPECT_EQ(1, Combinatorics::factorial(1));
}
TEST(Combinatorial_Factorial, negative)
{
EXPECT_THROW(Combinatorics::factorial(-1), std::invalid_argument);
}
TEST(Combinatorial_Factorial, too_big)
{
EXPECT_THROW(Combinatorics::factorial(500), std::invalid_argument);
}
You'll see that these tests don't verify the message within the exception. That's really a Good Thing, as the content of the message isn't (shouldn't) be part of the spec, and that level of verification is over-testing.
As an aside, we (theoretically at least) can't guarantee that ULLONG_MAX
is less than 500!, since there is no upper limit imposed by the standard, only a lower limit.
The correct exception type to throw is really std::overflow_error
(result of a computation is too large for the destination type).
Minor style issue: established convention is that we use lower-case names for namespace identifiers.
Minor issue: fact *= i
is equivalent to fact = fact * i
and somewhat easier to take in.
Modified code
#include <climits>
#include <stdexcept>
#include <string>
namespace combinatorics
{
unsigned long long factorial(unsigned n)
{
unsigned long long fact = 1;
for (unsigned i = 2; i <= n; ++i) {
if (fact > ULLONG_MAX / n) {
throw std::overflow_error("The result is too large");
}
fact *= i;
}
return fact;
}
}
#include <gtest/gtest.h>
TEST(Combinatorial_Factorial, small_ints)
{
EXPECT_EQ(combinatorics::factorial(0), 1);
EXPECT_EQ(combinatorics::factorial(1), 1);
EXPECT_EQ(combinatorics::factorial(5), 120);
}
TEST(Combinatorial_Factorial, too_big)
{
EXPECT_THROW(combinatorics::factorial(500), std::overflow_error);
}
-
4
-
3\$\begingroup\$ @JDługosz Disagree, yet also agree. (I feel strongly both ways) \$\endgroup\$chux– chux2021年08月04日 14:49:15 +00:00Commented Aug 4, 2021 at 14:49
-
7\$\begingroup\$ @JDługosz, with unsigned type, there is no need to check for invalid values, as all values are then valid. That's the whole point. (And that ES.106 link seems to assume that compiler warnings aren't enabled - just enable them). \$\endgroup\$Toby Speight– Toby Speight2021年08月04日 16:08:23 +00:00Commented Aug 4, 2021 at 16:08
-
2\$\begingroup\$ If you fi the exceptions,
std::domain_error
looks more appropriate imho. \$\endgroup\$Deduplicator– Deduplicator2021年08月04日 17:37:35 +00:00Commented Aug 4, 2021 at 17:37 -
2\$\begingroup\$ @OlivierGrégoire, any decent compiler will hoist
ULLONG_MAX / n
out of the loop; you can store it in a variable if you prefer. Yes, it would make sense to start the loop ati = 2
; I'll edit. \$\endgroup\$Toby Speight– Toby Speight2021年08月05日 09:15:51 +00:00Commented Aug 5, 2021 at 9:15
You should write it as a template rather than hard-coding it to produce unsigned long long
. You might want to call it with 128-bit integers (a gnu extension), or a Bignum class of some kind such as Boost.Multiprecision.
Rather than using C's limits macros (ULLONG_MAX
), use the C++ numeric_limits
library feature. This is template-friendly, and that helps even if you are not writing template code: you don't have to remember different names for the limit based on the type, and finding and fixing these when you change the type of something.
For the exception, you can include the function name in the error message without complicating the code. You'll be glad of this if the error is ever hit in real code rather than in trivial test code where you're only calling one function!
As Toby pointed out, you can move the test outside of the loop. (Division is very expensive and will take far longer than the actual factorial computation, as written!). But I'll add to that: Find the maximum legal input value as a constexpr
local variable. This will be computed at compile time. But if it's a template, the specified type (some big-int library) might not support constexpr
arithmetic if it's not fairly recent, so use static const
instead, so it will be computed once and reused on subsequent calls to the function.
Another tip: concerning std::string("The argument cannot be negative")
, assuming you really need to explicitly make it a std::string
rather than just passing the lexical string literal, you can use the string literal operator with identical meaning: "The argument cannot be negative"s
.
Update: because the reason is to have ==
work, it's far more efficient to use a std::string_view
. That not only prevents copying the characters, but the comparison is much faster than strcmp
.
Write: "The argument cannot be negative"sv
.
(read the link above to learn how to opt-in to enable these literals)
-
\$\begingroup\$ Good idea to compile-time compute the max value. I was trying to think how we could compute inverse-factorial of
ULLONG_MAX
, but of course we don't need to do that - we can loop until we're about to overflow and record that value, all inconstexpr
with favourable types. \$\endgroup\$Toby Speight– Toby Speight2021年08月04日 16:14:33 +00:00Commented Aug 4, 2021 at 16:14 -
\$\begingroup\$ The reason to construct a
std::string
is becauseASSERT_EQ
will use==
to test the assertion. We don't want to be comparing twoconst char*
(as that's pointer equality), so converting one to string object gives us the value equality we want. @suryasis, you'll needusing namespace std::literals;
for that - it's a namespace designed to be used that way, unlikenamespace std
. \$\endgroup\$Toby Speight– Toby Speight2021年08月04日 16:23:42 +00:00Commented Aug 4, 2021 at 16:23 -
\$\begingroup\$ maximum value for an unsigned type is easier to just calculate as needed:
T(-1)
. \$\endgroup\$Deduplicator– Deduplicator2021年08月04日 17:36:06 +00:00Commented Aug 4, 2021 at 17:36 -
1\$\begingroup\$ @TobySpeight to get
==
to work naturally, it's more efficient to use astring_view
. This also has a literal form using the""sv
suffix. (I have a Code Project article in draft promoting this, in fact.) \$\endgroup\$JDługosz– JDługosz2021年08月05日 13:48:27 +00:00Commented Aug 5, 2021 at 13:48 -
\$\begingroup\$ @Deduplicator If
T
is not a primitive type but something like Boost.Multiprecision, or a "safe integer" wrapper (see the 2021 CPPNow! conference videos) that trick will not work. \$\endgroup\$JDługosz– JDługosz2021年08月05日 13:51:28 +00:00Commented Aug 5, 2021 at 13:51
long long
is 29. This means you may as well just precompute a const array of all 30 possible values and return the result in constant-time rather than compute it on the fly. Check if the value is <0 or >29 and throw an exception, otherwise just return the value indexed from the array. \$\endgroup\$