1
\$\begingroup\$

Problem: Given an arbitrary string, split it by a given regex and format the pieces as an R vector expression. E.g., convert "a..,b..,c,d" into 'c("a","b","c","d")' when the given regex is "[.,]+".

Solution in Haskell:

module Lib
    ( toRVec
    ) where

import Data.List (intersperse)
import Data.Maybe (fromMaybe)
import Data.Text ()
import Data.Text (pack, strip, unpack)
import Text.Regex (mkRegex, splitRegex)

(//) :: Maybe a -> a -> a
(//) = flip fromMaybe

stripString = (unpack . strip . pack)

toRVec :: String -> Maybe String -> Maybe Bool -> String
toRVec orig delim' toStrip' =
    let
        delim = delim' // "[ \n]+"
        toStrip = toStrip' // True
        ws = splitRegex (mkRegex delim) $ stripString orig
        ws1 = map (\w -> "\"" ++ w ++ "\"") $ if toStrip then
                map stripString ws
            else
                ws
        ws2 = intersperse "," ws1
        res = foldl (++) "" ws2
    in "c(" ++ res ++ ")"

Any suggestions for improvement are welcome.

200_success
asked Jan 5, 2016 at 22:48
\$\endgroup\$

1 Answer

3
\$\begingroup\$

How I would do it:

import Data.List.Split
import Data.List
foo delim s = "c(\"" ++ intercalate "\",\"" (wordsBy (`elem` delim) s) ++ "\")"
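
A small sketch of how this one-liner behaves may help when comparing it with the regex version. It assumes the split package (for Data.List.Split); the type signature and the name examples are added here for illustration and are not part of the answer. Note that foo treats delim as a set of separator characters, not as a regex, so "[.,]+" from the question corresponds to ".," here.

```haskell
import Data.List (intercalate)
import Data.List.Split (wordsBy)

-- Same definition as above, with an explicit type signature added.
foo :: String -> String -> String
foo delim s = "c(\"" ++ intercalate "\",\"" (wordsBy (`elem` delim) s) ++ "\")"

-- Illustrative inputs (the name `examples` is hypothetical).
examples :: [String]
examples =
  [ foo ".," "a..,b..,c,d"  -- c("a","b","c","d")
  , foo ".," ""             -- c(""), because the quotes are added unconditionally
  ]
```

wordsBy drops the separators and any empty pieces between them, which is why runs such as ".." or "..," collapse into a single break. The empty-input case yields c(""), which in R is a one-element vector holding an empty string, so it may need special handling if an empty vector is wanted.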

On reading yours:

No brackets needed around unpack . strip . pack.

I would replace "if toStrip then map stripString ws else ws" with "(if toStrip then map stripString else id) ws".

Some of your names are unneeded; I would inline ws, ws1 and ws2 into where they are used (once ws is only used once), perhaps even res.

Instead of foldl (++), use concat.
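
Taken together, the points above might lead to something like the following. This is only a sketch of one possible result, not a definitive rewrite: it keeps the original signature and defaults, assumes the same packages as the question (text and regex-compat), and introduces the local names quote and pieces purely for illustration. It uses intercalate, which is concat composed with intersperse, in place of the separate intersperse and fold steps.

```haskell
import Data.List (intercalate)
import Data.Maybe (fromMaybe)
import Data.Text (pack, strip, unpack)
import Text.Regex (mkRegex, splitRegex)

-- Trim leading and trailing whitespace by round-tripping through Text.
stripString :: String -> String
stripString = unpack . strip . pack

toRVec :: String -> Maybe String -> Maybe Bool -> String
toRVec orig delim' toStrip' =
  "c(" ++ intercalate "," (map quote pieces) ++ ")"
  where
    delim   = fromMaybe "[ \n]+" delim'   -- default delimiter regex
    toStrip = fromMaybe True toStrip'     -- strip each piece by default
    quote w = "\"" ++ w ++ "\""
    -- Choose the per-piece transformation once, then apply it to the split.
    pieces  = (if toStrip then map stripString else id)
            $ splitRegex (mkRegex delim) (stripString orig)

-- toRVec "a..,b..,c,d" (Just "[.,]+") Nothing  gives  c("a","b","c","d")
```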

answered Jan 6, 2016 at 5:02
\$\endgroup\$
2
  • 1
    \$\begingroup\$ Moreover: delim and toStrip are options, orig is what is transformed; thus orig should be the last argument. Think Maybe String -> Maybe Bool -> (String -> String). People usually don't provide default options as Maybe arguments; at least I haven't seen it. Consider dropping the lines with // and letting the user plug in the default options if they want, seeing as providing default options isn't even part of the task. \$\endgroup\$ Commented Jan 6, 2016 at 17:47
  • 1
    \$\begingroup\$ If you use Gurkenglas's suggestion about orig being last, you could even provide a default like this: toRVecDefault = toRVec "[ \n]+" True \$\endgroup\$ Commented Jan 6, 2016 at 20:18
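
The two comments above could be combined roughly as follows. This is a sketch under assumptions, not the commenters' own code beyond the toRVecDefault line: the options are taken as plain values rather than Maybe, the input string comes last, and the local names quote and pieces are illustrative. It assumes the same packages as the question (text and regex-compat).

```haskell
import Data.List (intercalate)
import Data.Text (pack, strip, unpack)
import Text.Regex (mkRegex, splitRegex)

-- Trim leading and trailing whitespace by round-tripping through Text.
stripString :: String -> String
stripString = unpack . strip . pack

-- Options first, input last: partially applying the options
-- leaves a plain String -> String transformation.
toRVec :: String -> Bool -> String -> String
toRVec delim toStrip orig =
  "c(" ++ intercalate "," (map quote pieces) ++ ")"
  where
    quote w = "\"" ++ w ++ "\""
    pieces  = (if toStrip then map stripString else id)
            $ splitRegex (mkRegex delim) (stripString orig)

-- The defaults from the question, supplied by partial application.
toRVecDefault :: String -> String
toRVecDefault = toRVec "[ \n]+" True
```

With the input last, a configured converter can be passed around directly, for example map (toRVec "[.,]+" False) over a list of input strings.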
