Problem: Given an arbitrary string, split it by a given regex and format the pieces as an R vector expression. E.g., convert "a..,b..,c,d" into 'c("a","b","c","d")' when the given regex is "[.,]+".
Solution in Haskell:
module Lib
    ( toRVec
    ) where

import Data.List (intersperse)
import Data.Maybe (fromMaybe)
import Data.Text ()
import Data.Text (pack, strip, unpack)
import Text.Regex (mkRegex, splitRegex)

(//) :: Maybe a -> a -> a
(//) = flip fromMaybe

stripString = (unpack . strip . pack)

toRVec :: String -> Maybe String -> Maybe Bool -> String
toRVec orig delim' toStrip' =
    let
        delim = delim' // "[ \n]+"
        toStrip = toStrip' // True
        ws = splitRegex (mkRegex delim) $ stripString orig
        ws1 = map (\w -> "\"" ++ w ++ "\"") $ if toStrip then
                map stripString ws
            else
                ws
        ws2 = intersperse "," ws1
        res = foldl (++) "" ws2
    in "c(" ++ res ++ ")"
Any suggestions for improvement are welcome.
1 Answer
How I would do it:
import Data.List.Split
import Data.List
foo delim s = "c(\"" ++ intercalate "\",\"" (wordsBy (`elem` delim) s) ++ "\")"
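To make the behaviour of the one-liner above concrete, here is a minimal sketch that runs without the split package. Note that it treats delim as a set of delimiter characters rather than a regex, so ".," plays the role of "[.,]+". The names wordsBy' and toRVecChars are hypothetical stand-ins introduced for illustration; wordsBy' is assumed to behave like Data.List.Split.wordsBy (consecutive delimiters collapse, no empty pieces).

```haskell
import Data.List (intercalate)

-- Hypothetical base-only stand-in for Data.List.Split.wordsBy:
-- drops every run of delimiter elements and never yields empty pieces.
wordsBy' :: (a -> Bool) -> [a] -> [[a]]
wordsBy' p s = case dropWhile p s of
    [] -> []
    s' -> let (w, rest) = break p s'
          in w : wordsBy' p rest

-- Same shape as foo above; delims is a list of delimiter characters.
toRVecChars :: String -> String -> String
toRVecChars delims s =
    "c(\"" ++ intercalate "\",\"" (wordsBy' (`elem` delims) s) ++ "\")"
```

With these definitions, toRVecChars ".," "a..,b..,c,d" evaluates to the string c("a","b","c","d") from the problem statement.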
Having read yours:

No brackets are needed around
    unpack . strip . pack

I would replace
    if toStrip then map stripString ws else ws
with
    (if toStrip then map stripString else id) ws

Some of your names are unneeded; I would inline ws, ws1 and ws2 where they are used (once ws is only used once), perhaps even res.

Instead of
    foldl (++) ""
use
    concat
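Taken together, the points above might lead to something like the following. This is only a sketch, assuming the same regex-compat and text packages as the question and keeping the original signature; the module header is omitted. It uses intercalate "," in place of concat . intersperse ",", which is the same thing, and the local name quoted is introduced here for illustration.

```haskell
import Data.List (intercalate)
import Data.Maybe (fromMaybe)
import Data.Text (pack, strip, unpack)
import Text.Regex (mkRegex, splitRegex)

-- Trim leading and trailing whitespace via Data.Text.
stripString :: String -> String
stripString = unpack . strip . pack

toRVec :: String -> Maybe String -> Maybe Bool -> String
toRVec orig delim' toStrip' = "c(" ++ intercalate "," quoted ++ ")"
  where
    delim   = fromMaybe "[ \n]+" delim'
    toStrip = fromMaybe True toStrip'
    -- Split, optionally strip each piece, then wrap each piece in quotes.
    quoted  = map (\w -> "\"" ++ w ++ "\"")
            . (if toStrip then map stripString else id)
            . splitRegex (mkRegex delim)
            $ stripString orig
```

The intermediate names ws, ws1, ws2 and res are gone; whether inlining this far reads better than the original is a matter of taste.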
Moreover: delim and toStrip are options, and orig is what is transformed, so orig should be the last argument. Think
    Maybe String -> Maybe Bool -> (String -> String)
People usually don't provide default options as Maybe arguments; at least I don't see it. Consider dropping the lines with // and letting the user plug in the default options if they want, seeing as providing default options isn't even part of the task.
Gurkenglas, Jan 6, 2016 at 17:47
If you use Gurkenglas's suggestion about orig being last, you could even provide a default like this:
    toRVecDefault = toRVec "[ \n]+" True
Christian Oudard, Jan 6, 2016 at 20:18
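The two comments combined could look roughly like this: options first, input last, plain arguments instead of Maybe, and the defaults supplied by partial application. It is a sketch under the assumption that the regex-compat and text packages from the question are available; the helper names quote and pieces are made up for illustration.

```haskell
import Data.List (intercalate)
import Data.Text (pack, strip, unpack)
import Text.Regex (mkRegex, splitRegex)

-- Trim leading and trailing whitespace via Data.Text.
stripString :: String -> String
stripString = unpack . strip . pack

-- Options come first and the input last, so that partially applying
-- the options yields a String -> String transformation.
toRVec :: String -> Bool -> String -> String
toRVec delim toStrip orig = "c(" ++ intercalate "," (map quote pieces) ++ ")"
  where
    quote w = "\"" ++ w ++ "\""
    pieces  = (if toStrip then map stripString else id)
            . splitRegex (mkRegex delim)
            $ stripString orig

-- The defaults from the question, fixed by partial application.
toRVecDefault :: String -> String
toRVecDefault = toRVec "[ \n]+" True
```

Because the input is the last argument, the configured function composes directly, for example map toRVecDefault over a list of input lines.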