Learn You a Haskell explains the unwords
function.
$unwords ["hey","there","mate"]
"hey there mate"
Here's my implementation.
unwords' :: [String] -> String
unwords' [] = []
unwords' (x:xs)
| null xs = x
| otherwise = x ++ " " ++ unwords' xs
Is there a possible implementation with fold
? I tried the following, but the " "
got appended to the end.
$foldr (\x acc -> x ++ " " ++ acc) [] ["hey", "there", "mate"]
"hey there mate "
Also, how can I append a String([Char])
to another String without ++
?
Lastly, please critique in general.
4 Answers 4
Your foldr
solution doesn't work because it's using the empty list you're passing in the final step. You can use foldr1
instead, it uses the final element of the list in place of being passed an accumulator value.
Looking to the fold
s to implement unwords
isn't a bad idea, but there are other high-level functions that you can use to write a more terse or readable version. Let's start from a verbal description of what unwords
is doing.
Insert a space character between every
String
in a list, then join the resultingString
s together.
The latter half of that description is easy, we know that String
is really [Char]
, and we can easily flatten doubly-nested lists with concat
. The former portion we could implement on our own, or search Hoogle for to see if anything already exists in the Prelude
or other modules that could help us out. In this case, we're looking for a function with the type a -> [a] -> [a]
, that is, we want to pass it a value and a list and have it return a list with that value inserted between each pair of elements. As luck would have it, there's a function in Data.List
that does exactly what we're looking for called intersperse
that comes up as the first result if we perform that search.
Using these two functions, we can write a very short version of unwords
that reads almost like prose.
import Data.List (intersperse)
import Prelude hiding (unwords)
unwords :: [String] -> String
unwords = concat . intersperse " "
Besides the aesthetic appeal of this solution, to me this illustrates the power of thinking about what you want to do in Haskell, instead of thinking about how it's going to be done.
-
\$\begingroup\$ thanks for your thoughtful answer, bisserlis. Such a powerful 1 liner for
unwords
that you've provided. \$\endgroup\$Kevin Meredith– Kevin Meredith2014年05月14日 23:45:38 +00:00Commented May 14, 2014 at 23:45
The otherwise
case is redundant.
unwords' :: [String] -> String
unwords' [] = []
unwords' (x:xs) = x ++ " " ++ unwords' xs
To use a fold but not get the extra space, use foldr1
:
unwords'' :: [String] -> String
unwords'' xs
| null xs = []
| otherwise = foldr1 (\x acc -> x ++ ' ' : acc) xs
-
\$\begingroup\$ In the 2nd implementation, I observe your first guard:
null xs = []
. However, are you concerned thatfoldr1
can throw a run-time exception if that check was not there? Looking at code that depends on a sequence (example: getting a value from an Option in Scala) seems fitting for a monad, no? My concern withfoldr1
-stackoverflow.com/q/23183935/409976. P.S. I've used monads a bit in Scala, but I'm still new to them. \$\endgroup\$Kevin Meredith– Kevin Meredith2014年05月14日 02:42:02 +00:00Commented May 14, 2014 at 2:42 -
\$\begingroup\$ Yes, that's why the
null xs
guard is there. \$\endgroup\$200_success– 200_success2014年05月14日 02:54:37 +00:00Commented May 14, 2014 at 2:54
The original implementation isn’t ideal, because it’s recursive without being tail-recursive or tail-recursive-modulo-cons (two special cases that GHC can optimize). Other answers have already discussed how to fix up the foldr
implementation, which would be more efficient.
This code could also be cleaned up with pattern-matching:
unwords' [] = []
unwords' (x:xs)
| null xs = x
| otherwise = x ++ " " ++ unwords' xs
could become:
unwords' [] = []
unwords' [x] = x
unwords' (x:xs) = x ++ " " ++ unwords' xs
Another efficient technique in functional programming is a map-reduction. In this case, you can represent adding a single character in front of each word after the first as a :
operation, then concatenate them, which lets you write this as:
unwords2 :: [String] -> String
unwords2 [] = []
unwords2 (x:xs) = x ++ (concatMap (' ':) xs)
That should be "well-behaved" enough for the compiler, in some cases, to optimize away deep copies of the words you pass in.
Testing on the Godbolt compiler explorer shows what code you actually get with both versions, and allows you to see the intermediate representation of both. The original version gets transformed into:
unwords'
= \ ds_dJs ->
case ds_dJs of {
[] -> [];
: x_agY xs_agZ ->
case xs_agZ of wild1_a13X {
[] -> x_agY;
: ds1_a13Y ds2_a13Z ->
++ x_agY (unpackAppendCString# lvl1_r15Z (unwords' wild1_a13X))
}
}
Which as you see is making redundant deep copies with unpackAppendCString
of each nested recursive call. The second version becomes:
test2_go1
= \ ds1_a151 ->
case ds1_a151 of {
[] -> [];
: y_a154 ys_a155 -> ++_$s++ ds_r160 y_a154 (test2_go1 ys_a155)
}
end Rec }
unwords2
= \ ds1_dIN ->
case ds1_dIN of {
[] -> [];
: x_aHm xs_aHn -> ++ x_aHm (test2_go1 xs_aHn)
}
where the concatMap
call becomes a helper function that reduces to a builtin. Introducing calls to these functions as top-level variables shows that GHC is capable of partially inlining the second version but not the first.
Although, in the real world, if you care about optimized string-processing, you would use a Text
or a ByteString
, not a linear linked list of UCS-4 [Char]
.
-
\$\begingroup\$ Older question than I realized when I answered, but I hope this is valuable to someone who finds it. \$\endgroup\$Davislor– Davislor2023年10月09日 20:59:39 +00:00Commented Oct 9, 2023 at 20:59
Here the shortest version I could come up with:
unwords = tail . (>>= (' ' :))
Remember that >>=
is just concatMap
for lists. So this code prepends every String
in the list with a space, concats it and than throws the very first character (the leading space) away using tail
.