\$\begingroup\$

I wrote a split-by-delimiter function in Haskell and would like some feedback on it. Since I come from an imperative programming background, I often write overly complex functions in Haskell.

split :: Char -> String -> [String]
split c str = fst $ splitInternal c ([], str)

splitInternal :: Char -> ([String], String) -> ([String], String)
splitInternal _ (result, "") = (result, "")
splitInternal c (result, str) = splitInternal c (
    result ++ [takeWhile (/= c) str],
    case dropWhile (/= c) str of
        "" -> ""
        rest -> tail rest
    )

My questions are:

  1. Is it bad to have a splitInternal function? I couldn't figure out a way without it.
  2. Is there maybe a simpler way to write the function?
  3. Any other feedback is welcome as well.
asked Dec 12, 2020 at 20:58
\$\endgroup\$

1 Answer

\$\begingroup\$

Is it bad to have a splitInternal function? I couldn't figure out a way without it.

With your approach I think a helper is necessary, but you can improve readability by writing a few small functions and combining them. Also, if the input contains consecutive delimiters, your split function produces an empty string between them (for example, "a,,b" gives ["a","","b"]), which may not be what you expect. The code can be rewritten as follows so that a run of delimiters is treated as a single one:

splitInternal :: Char -> ([String], String) -> ([String], String)
splitInternal _ (result, "") = (result, "")
splitInternal c (result, remain) = splitInternal c (getBefore c remain, getAfter c remain)
  where
    getBefore delimiter rest = result ++ [takeWhile (/= delimiter) rest]
    getAfter delimiter rest = dropWhile (== delimiter) . dropWhile (/= delimiter) $ rest
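
To make the difference concrete, here is a minimal side-by-side sketch. The names splitKeepEmpty and splitCollapse are hypothetical and chosen only for this illustration: the first mirrors the behaviour of the code in the question, the second mirrors the rewrite above. Both use a local helper instead of a top-level splitInternal.

```haskell
-- Behaviour of the code in the question: every delimiter ends a field,
-- so a run of delimiters produces empty strings.
splitKeepEmpty :: Char -> String -> [String]
splitKeepEmpty c str = fst (go ([], str))
  where
    go (result, "") = (result, "")
    go (result, s) =
      go ( result ++ [takeWhile (/= c) s]
         , drop 1 (dropWhile (/= c) s) -- skip exactly one delimiter, if any
         )

-- Behaviour of the rewrite: a whole run of delimiters is skipped at once.
splitCollapse :: Char -> String -> [String]
splitCollapse c str = fst (go ([], str))
  where
    go (result, "") = (result, "")
    go (result, s) =
      go ( result ++ [takeWhile (/= c) s]
         , dropWhile (== c) (dropWhile (/= c) s)
         )
```

On "a,,b" the first version returns ["a","","b"] and the second returns ["a","b"]. Note that neither version removes a leading empty field: an input such as ",a" still yields "" as its first element.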

Is there maybe a simpler way to write the function?

Yes, you can use the break and span functions defined in the Prelude:

split :: Char -> String -> [String]
split _ "" = []
split delimiter str =
    let (start, rest) = break (== delimiter) str
        (_, remain) = span (== delimiter) rest
     in start : split delimiter remain

So in this case, your splitInternal is unnecessary.
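
For reference, break (== ',') "ab,cd" evaluates to ("ab", ",cd") and span (== ',') ",,cd" evaluates to (",,", "cd"). One edge case of the break-based split is that an input starting with the delimiter still produces a leading empty string. If that is not wanted, a possible variant is sketched below; splitNonEmpty is a hypothetical name, and the structure follows the way the Prelude's words function is usually defined (skip separators first, then take a field).

```haskell
-- Sketch: never returns empty fields, whether the delimiters are
-- leading, trailing, or consecutive.
splitNonEmpty :: Char -> String -> [String]
splitNonEmpty delimiter str =
  case dropWhile (== delimiter) str of
    "" -> []
    s ->
      let (field, rest) = break (== delimiter) s
       in field : splitNonEmpty delimiter rest
```

With this definition, splitNonEmpty ',' ",a,,b," gives ["a","b"]. Whether dropping every empty field is desirable depends on the input format; for CSV-like data, empty fields are usually meaningful and should be kept.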

Any other feedback is welcome as well

If you are working with textual data, Text from Data.Text (in the text package) is usually a better choice than String. A String is a linked list of Char, whereas Text is a packed representation, so Text is generally more efficient in both time and memory. The Data.Text module has a predefined function splitOn that works almost as you expect:

ghci> import Data.Text (splitOn)
ghci> :seti -XOverloadedStrings
ghci> splitOn "," "123,456,789"
["123","456","789"]
ghci> splitOn "," "123,,,456,789"
["123","","","456","789"]

I say "almost" because splitOn does not collapse consecutive delimiters; it returns an empty string for each empty field. Maybe this is what you want.
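
If empty fields are not wanted, one option is to filter them out after splitting. The following is a minimal sketch that assumes the text package is available (it ships with GHC); splitCollapsing and splitOnChar are hypothetical names used only here.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import qualified Data.Text as T

-- T.splitOn keeps empty fields; dropping them afterwards has the same
-- effect as collapsing consecutive delimiters.
-- Note: T.splitOn raises an error if the delimiter is the empty Text.
splitCollapsing :: Text -> Text -> [Text]
splitCollapsing delimiter = filter (not . T.null) . T.splitOn delimiter

-- For a single-character delimiter, T.split takes a predicate instead.
splitOnChar :: Char -> Text -> [Text]
splitOnChar c = filter (not . T.null) . T.split (== c)
```

For example, splitCollapsing "," "123,,,456,789" evaluates to ["123","456","789"]. The filter also removes empty fields at the start and end of the input, so it is only appropriate when empty fields carry no meaning.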
answered Dec 16, 2020 at 4:36
\$\endgroup\$
