Implementing Haskell's `unwords`

Question 1

Learn You a Haskell explains the unwords function.

$unwords ["hey","there","mate"]
"hey there mate"

Here's my implementation.

unwords' :: [String] -> String
unwords' [] = []
unwords' (x:xs)
 | null xs = x
 | otherwise = x ++ " " ++ unwords' xs

Is there a possible implementation with fold? I tried the following, but the " " got appended to the end.

$foldr (\x acc -> x ++ " " ++ acc) [] ["hey", "there", "mate"] 
"hey there mate "

Also, how can I append a String([Char]) to another String without ++?

Lastly, please critique in general.

Question 2

Your foldr solution doesn't work because it's using the empty list you're passing in the final step. You can use foldr1 instead, it uses the final element of the list in place of being passed an accumulator value.

Looking to the folds to implement unwords isn't a bad idea, but there are other high-level functions that you can use to write a more terse or readable version. Let's start from a verbal description of what unwords is doing.

Insert a space character between every String in a list, then join the resulting Strings together.

The latter half of that description is easy, we know that String is really [Char], and we can easily flatten doubly-nested lists with concat. The former portion we could implement on our own, or search Hoogle for to see if anything already exists in the Prelude or other modules that could help us out. In this case, we're looking for a function with the type a -> [a] -> [a], that is, we want to pass it a value and a list and have it return a list with that value inserted between each pair of elements. As luck would have it, there's a function in Data.List that does exactly what we're looking for called intersperse that comes up as the first result if we perform that search.

Using these two functions, we can write a very short version of unwords that reads almost like prose.

import Data.List (intersperse)
import Prelude hiding (unwords)
unwords :: [String] -> String
unwords = concat . intersperse " "

Besides the aesthetic appeal of this solution, to me this illustrates the power of thinking about what you want to do in Haskell, instead of thinking about how it's going to be done.

Question 3

thanks for your thoughtful answer, bisserlis. Such a powerful 1 liner for unwords that you've provided.

Question 4

The otherwise case is redundant.

unwords' :: [String] -> String
unwords' [] = []
unwords' (x:xs) = x ++ " " ++ unwords' xs

To use a fold but not get the extra space, use foldr1:

unwords'' :: [String] -> String
unwords'' xs
 | null xs = []
 | otherwise = foldr1 (\x acc -> x ++ ' ' : acc) xs

Question 5

In the 2nd implementation, I observe your first guard: null xs = []. However, are you concerned that foldr1 can throw a run-time exception if that check was not there? Looking at code that depends on a sequence (example: getting a value from an Option in Scala) seems fitting for a monad, no? My concern with foldr1 -stackoverflow.com/q/23183935/409976. P.S. I've used monads a bit in Scala, but I'm still new to them.

Question 6

Yes, that's why the null xs guard is there.

Question 7

The original implementation isn’t ideal, because it’s recursive without being tail-recursive or tail-recursive-modulo-cons (two special cases that GHC can optimize). Other answers have already discussed how to fix up the foldr implementation, which would be more efficient.

This code could also be cleaned up with pattern-matching:

unwords' [] = []
unwords' (x:xs)
 | null xs = x
 | otherwise = x ++ " " ++ unwords' xs

could become:

unwords' [] = []
unwords' [x] = x
unwords' (x:xs) = x ++ " " ++ unwords' xs

Another efficient technique in functional programming is a map-reduction. In this case, you can represent adding a single character in front of each word after the first as a : operation, then concatenate them, which lets you write this as:

unwords2 :: [String] -> String
unwords2 [] = []
unwords2 (x:xs) = x ++ (concatMap (' ':) xs)

That should be "well-behaved" enough for the compiler, in some cases, to optimize away deep copies of the words you pass in.

Testing on the Godbolt compiler explorer shows what code you actually get with both versions, and allows you to see the intermediate representation of both. The original version gets transformed into:

unwords'
 = \ ds_dJs ->
 case ds_dJs of {
 [] -> [];
 : x_agY xs_agZ ->
 case xs_agZ of wild1_a13X {
 [] -> x_agY;
 : ds1_a13Y ds2_a13Z ->
 ++ x_agY (unpackAppendCString# lvl1_r15Z (unwords' wild1_a13X))
 }
 }

Which as you see is making redundant deep copies with unpackAppendCString of each nested recursive call. The second version becomes:

test2_go1
 = \ ds1_a151 ->
 case ds1_a151 of {
 [] -> [];
 : y_a154 ys_a155 -> ++_$s++ ds_r160 y_a154 (test2_go1 ys_a155)
 }
end Rec }
unwords2
 = \ ds1_dIN ->
 case ds1_dIN of {
 [] -> [];
 : x_aHm xs_aHn -> ++ x_aHm (test2_go1 xs_aHn)
 }

where the concatMap call becomes a helper function that reduces to a builtin. Introducing calls to these functions as top-level variables shows that GHC is capable of partially inlining the second version but not the first.

Although, in the real world, if you care about optimized string-processing, you would use a Text or a ByteString, not a linear linked list of UCS-4 [Char].

Question 8

Older question than I realized when I answered, but I hope this is valuable to someone who finds it.

Question 9

Here the shortest version I could come up with:

unwords = tail . (>>= (' ' :))

Remember that >>= is just concatMap for lists. So this code prepends every String in the list with a space, concats it and than throws the very first character (the leading space) away using tail.

bisserlis bisserlis 3,3311 gold badge13 silver badges17 bronze badges · Accepted Answer · 2014-05-14 02:30:32Z

Your foldr solution doesn't work because it's using the empty list you're passing in the final step. You can use foldr1 instead, it uses the final element of the list in place of being passed an accumulator value.

Looking to the folds to implement unwords isn't a bad idea, but there are other high-level functions that you can use to write a more terse or readable version. Let's start from a verbal description of what unwords is doing.

Insert a space character between every String in a list, then join the resulting Strings together.

The latter half of that description is easy, we know that String is really [Char], and we can easily flatten doubly-nested lists with concat. The former portion we could implement on our own, or search Hoogle for to see if anything already exists in the Prelude or other modules that could help us out. In this case, we're looking for a function with the type a -> [a] -> [a], that is, we want to pass it a value and a list and have it return a list with that value inserted between each pair of elements. As luck would have it, there's a function in Data.List that does exactly what we're looking for called intersperse that comes up as the first result if we perform that search.

Using these two functions, we can write a very short version of unwords that reads almost like prose.

import Data.List (intersperse)
import Prelude hiding (unwords)
unwords :: [String] -> String
unwords = concat . intersperse " "

Besides the aesthetic appeal of this solution, to me this illustrates the power of thinking about what you want to do in Haskell, instead of thinking about how it's going to be done.

thanks for your thoughtful answer, bisserlis. Such a powerful 1 liner for unwords that you've provided.

Stack Exchange Network

Implementing Haskell's `unwords`

4 Answers 4

Hot Network Questions

Implementing Haskell's `unwords`

4 Answers 4

Related

Hot Network Questions