Today I decided to learn some basic Haskell, and for starters I made a program for calculating the checksum of a Swedish personal identification number. It uses the Luhn-algorithm, aka. IBM MOD-10
.
Explanation of this code can be found on Swedish Wikipedia and English Wikipedia
Here's a description of how the algorithm works:
Given a string of 9 digits, abcdefghi
compute:
array = [a*2, b, c*2, d, e*2, f, g*2, h, i*2]
Then you computer the sum
of the digits in this array, so for example if a*2
is 16
, then that part counts as 1 + 6
.
Then the result is how much more you have to add to make it evenly divisible by 10. For example, if sum == 54
then the result is 6
as 60 - 54 = 6
.
import Data.Char
sumOfChars :: String -> Int
sumOfChars "" = 0
sumOfChars str = digitToInt(str !! 0) + sumOfChars(tail str)
twoMult :: Char -> String
twoMult c = show (digitToInt(c) * 2)
identificationSum :: String -> Int
identificationSum "" = 0
identificationSum str = if length str `mod` 2 == 1 then sumOfChars(twoMult(str !! 0)) + identificationSum(tail str)
else digitToInt(str !! 0) + identificationSum(tail str)
remainingToTens :: Int -> Int
remainingToTens x = ceiling(fromIntegral x / 10) * 10 - x
determineLastDigit :: String -> Int
determineLastDigit str = remainingToTens(identificationSum(str))
To test the code:
determineLastDigit("811228987")
Prints:
4
As this is the first time ever I managed to make something in Haskell, any comments are welcome. Even though it is the first time though, feel free to rip my code apart as much as you would like and suggest any advanced things, I am always eager to learn.
2 Answers 2
This is a good start! I'll do two passes over your code, one to address issues of style, and another to leverage functions from the Haskell Prelude to decrease code length and bring it more in line with typical Haskell usage and idioms.
Style
Your sumOfChars
function is a good example of tail-recursion, but in Haskell we would take that String
apart through pattern matching. A Haskell String
is really a list of characters, or [Char]
, so we'll decompose it as a list.
sumOfChars :: [Char] -> Int -- String is a synonym for [Char]
sumOfChars [] = 0 -- The empty list is the same value as the empty string, i.e. [] == ""
sumOfChars (c:cs) = digitToInt c + sumOfChars cs
The :
is the list cons operator, c
is the head of the list and cs
is the tail. So, c == str !! 0
and cs == tail str
. (And head
is the function we usually use for getting the element at index 0 in a list.)
twoMult
isn't wrong, but function application in Haskell doesn't require any parentheses. Parentheses are for grouping since function application has the highest precedence of any operation.
twoMult :: Char -> String
twoMult c = show (digitToInt c * 2)
Using an if
statement at the top level of a function is usually an indication that you can use a guard. Guards are a bit like a multi-way if
.
identificationSum :: [Char] -> Int
identificationSum [] = 0
identificationSum s@(c:cs) -- The 's@' part is an as-pattern, explained below
| length s `mod` 2 == 1 = sumOfChars (twoMult c) + identificationSum cs
| otherwise = digitToInt c + identificationSum cs
Each guard begins with a pipe (|
) and evaluates to a Bool
value. First pattern matching takes place, then guards are evaluated in order. otherwise
is just a synonym for True
, i.e., a guard that always succeeds when evaluation reaches it.
I used an as-pattern there to bind the value of c:cs
to an identifier. This is a handy shortcut and much preferable to writing c:cs
all over the place if you need both the original value and its decomposition.
remainingToTens
is a little odd, and I'd guess that you got to that answer when the compiler complained about there not being an instance for Fractional Int
, yeah? Integer division uses a function called div
, /
is for Fractional
values (like Float
or Double
) (yeah it's a little strange the first time you come across this).
remainingToTens :: Int -> Int
remainingToTens x = (x `div` 10 + 1) * 10 - x -- Edit: This is incorrect, use version below
And determineLastDigit
just has a few extra parentheses.
determineLastDigit :: String -> Int
determineLastDigit s = remainingToTens (identificationSum s)
Idiomatic Haskell
Many of the operations you implemented can be expressed using some of the higher-order functions we have in the Prelude and coding in a functional style.
Instead of using primitive recursion on the elements of a list in sumOfChars
, typical Haskell usage would see us using functions like sum
and map
to manipulate the entire list without getting into the weeds.
sumOfChars :: [Char] -> Int
sumOfChars cs = sum (map digitToInt cs)
map
s type is (a -> b) -> [a] -> [b]
, that is, it takes a function - that itself takes an element of type a
and returns something of type b
- and a list and returns a list where that function has been applied to every element of the first list. sum
adds all the values in a list of numbers.
One further thing we could do to that function is to write it in pointfree style. Writing functions pointfree is definitely an aspect of Haskell style, but don't concentrate on it overmuch until you have a solid grasp on the basics of the language. I've included an Appendix A at the bottom of this post where functions have been written pointfree for your reference.
identificationSum
could be written many, many ways (some as in the appendices below) but given the context of sumOfChars
, twoMult
, and the specification of the algorithm I would favor separating out the functionality into two phases. First, constructing a new string based on doubling every other digit, and secondly summing all of the digits. To do this I'll introduce a new function.
doubleAlternating :: [Char] -> [Char]
doubleAlternating [] = []
doubleAlternating (c:[]) = twoMult c
doubleAlternating (c:d:cs) = twoMult c ++ [d] ++ doubleAlternating cs
And then identificationSum
is a composition of doubleAlternating
and sumOfChars
.
identificationSum :: String -> Int
identificationSum s = sumOfChars (doubleAlternating s)
remainingToTens
could really just benefit from some modular arithmetic. Use the rem
function.
remainingToTens :: Int -> Int
remainingToTens x = negate x `mod` 10
I'll take it a little further with inlining in the appendices, but as it stands this is a very readable translation of the problem as it was stated. Having a close correspondence to the problem domain can be much more valuable than terse code for the sake of terseness.
Appendix A: Pointfree
One of the things I like about pointfree style is that it forces you to think in terms of functions, higher order abstractions, and data pipelines. Presented without further comment.
sumOfChars :: [Char] -> Int
sumOfChars = sum . map digitToInt
twoMult :: Char -> String
twoMult = show . (* 2) . digitToInt
identificationSum :: String -> Int
identificationSum = sumOfChars . doubleAlternating
determineLastDigit :: String -> Int
determineLastDigit = remainingToTens . identificationSum
Appendix B: Code golf
Understanding this will help you explore the Prelude and hopefully have a stronger grasp of function composition. I'll admit it's a little showy though. ;-)
checksum :: String -> Int
checksum s = negate (sum . map digitToInt . concatMap show . zipWith ($) (cycle [(* 2), id]) . map digitToInt $ s) `mod` 10
-
\$\begingroup\$ I love your answer! It certainly gives me a whole lot of things to check out. I believe though that
remainingToTens x = (x `div` 10 + 1) * 10 - x
orremainingToTens x = 10 - x `rem` 10
will give the wrong result for whenx % 10 == 0
, it should be 0 and not 10 in those cases. Which is why I went with theceiling
approach. (I guess it's not too hard to solve with your approach though). And yes, I did get that instance of Fractional Int message. Took some google searches to get around that.. \$\endgroup\$Simon Forsberg– Simon Forsberg2014年07月24日 00:07:27 +00:00Commented Jul 24, 2014 at 0:07 -
\$\begingroup\$ Ah! You're right about
remainingToTens
! I only really tested with the single example you gave. Shame on me. This seems to work.negate x `mod` 10
I'm glad you found my answer helpful! \$\endgroup\$bisserlis– bisserlis2014年07月24日 01:52:46 +00:00Commented Jul 24, 2014 at 1:52
Simple issues first...
I suggest being more explicit about what you are importing. For example,
import Data.Char (digitToInt, intToDigit)
remainingToTens x = ceiling(fromIntegral x / 10) * 10 - x
could be more simply written asremainingToTens x = 10 - x `mod` 10
.- It would be nice to support
'-'
and'+'
characters in the input. - It would be good to fail if the input contains any character other than a decimal digit or a
'-'
or a'+'
. Actually,digitToInt
also accepts'a'
through'f'
as hexadecimal digits, which you want to reject.
Idiomatic Haskell
As @bisserlis mentioned, your use of !!
, if-then-else
, and parentheses are not idiomatic in Haskell.
In general, Haskell lists can be infinitely long. Therefore, Haskell idioms emphasize laziness, which implies head-to-tail traversal. For code like this, built to handle short strings, it makes little difference, but you should try to develop a habit of traversing lists from head to tail, in one single pass whenever possible.
In that spirit, it is best to avoid calling functions like length
, which traverse all the way to the end of a list. Especially not when you do so within a recursive function identificationSum
. (As it turns out, however, length
might be necessary after all — see below.)
I don't like the way sumOfChars(twoMult(str !! 0))
works. twoMult
calls show
to stringify the doubled digit, and sumOfChars
immediately converts it back into the numeric realm. You should be able to handle the calculation without stringifying.
Helper functions, if they are only ever used by one function, should be scoped. You can use either a where
clause or a let helper = ... in ...
expression.
First suggested implementation
Incorporating the ideas above...
import Data.Char (digitToInt, intToDigit)
sumOfChars :: String -> Int
sumOfChars str = sumOfChars' double id str
where
sumOfChars' f f' "" = 0
sumOfChars' f f' (c:cs)
| c == '-' || c == '+' = sumOfChars' f f' cs
| '0' <= c && c <= '9' = (f (digitToInt c)) + sumOfChars' f' f cs
double n
| n < 5 = 2 * n
| n < 10 = 2 * n + (1 - 10)
lastDigit :: String -> Char
lastDigit str = intToDigit $ 10 - (sumOfChars str) `mod` 10
I've redefined lastDigit
so that it returns the digit as a Char
rather than an Int
. The user of your functions should be dealing purely with strings and characters, as Swedish Personal IDs are strings, not integers. (Leading zeroes and punctuation matter, for example.)
Complication
Apparently, IDs are sometimes written with a four-digit year in the birthdate field. However, the Swedish Wikipedia example ("19811218-9876") suggests that even when the century is prepended, the checksum is still based on the canonical representation using a two-digit year. That makes implementation trickier: you would probably need to either reintroduce a call to length
(just one call in a non-recursive context, mind you) or do a crazy amount of pattern matching.
sumOfChars :: String -> Int
sumOfChars str@(_:_:cs)
| length str > 11 = sumOfChars' double id cs
| otherwise = sumOfChars' double id str
where
sumOfChars' f f' "" = 0
sumOfChars' f f' (c:cs)
| c == '-' || c == '+' = sumOfChars' f f' cs
| '0' <= c && c <= '9' = (f (digitToInt c)) + sumOfChars' f' f cs
double n
| n < 5 = 2 * n
| n < 10 = 2 * n + (1 - 10)
Alternative API suggestion
I imagine that the common use-case might be to append a checksum digit. Perhaps this alternative API might be more useful? One function takes care of both sumChars
and determineLastDigit
, and all its helper functions are neatly scoped inside.
completeID :: String -> String
-- Support four-digit-year variant representation
completeID (y1:y2:y3:y4:m1:m2:d1:d2:n1:n2:n3:[]) = y1:y2:completeID (y3:y4:m1:m2:d1:d2:n1:n2:n3:[])
completeID (y1:y2:y3:y4:m1:m2:d1:d2:'+':n1:n2:n3:[]) = y1:y2:completeID (y3:y4:m1:m2:d1:d2:'+':n1:n2:n3:[])
completeID (y1:y2:y3:y4:m1:m2:d1:d2:'-':n1:n2:n3:[]) = y1:y2:completeID (y3:y4:m1:m2:d1:d2:'-':n1:n2:n3:[])
completeID cs = completeID' double id 0 cs
where
completeID' f f' sum "" = [intToDigit $ 10 - sum `mod` 10]
completeID' f f' sum (c:cs)
| c == '-' || c == '+' = c : completeID' f f' sum cs
| '0' <= c && c <= '9' = c : completeID' f' f (sum + f (digitToInt c)) cs
double n
| n < 5 = 2 * n
| n < 10 = 2 * n + (1 - 10)