4
\$\begingroup\$

I'm starting to play with F# to do some data parsing, and I wrote a function that groups a list of strings into subgroups. The code looks like this:

let breakBy lines pattern =
    let rec breakByRec lines pattern acc =
        match lines with
        | [] -> acc
        | head::tail ->
            match pattern head with
            | true ->
                let newGroup = [head]
                let newList = newGroup :: acc
                breakByRec tail pattern newList
            | false ->
                let lastGroup = List.head acc
                let newGroup = head :: lastGroup;
                let newList = newGroup :: List.tail acc
                breakByRec tail pattern newList
    breakByRec lines pattern [[]]

However, I feel that it could be improved, especially in the way I handle intermediate values when concatenating the lists and in the data structure used to return the grouped data.

The goal, in the end, is to insert each of these "groups" into the database as a single entity. Is there a more suitable data structure that I should look at?

Below is a sample ready to be used and an example output.

open System.Text.RegularExpressions

let data = [
    "Name=John"; "Age=29"; "City=San Francisco";
    "Name=Jane"; "Age=28"; "City=New York";
    "Name=Mike"; "Age=35"; "City=Miami"
]
// The annotation pins down which Regex.IsMatch overload is meant.
let matchName (line: string) = Regex.IsMatch(line, "^Name=")
let people = breakBy data matchName
printfn "%A" people
(*
The output looks like this:
val people : string list list = [
    ["City=Miami"; "Age=35"; "Name=Mike"];
    ["City=New York"; "Age=28"; "Name=Jane"];
    ["City=San Francisco"; "Age=29"; "Name=John"];
    []
]
*)

Any suggestions are appreciated.

200_success
asked Mar 28, 2016 at 2:53
\$\endgroup\$
1
  • \$\begingroup\$ Maybe this would be better with one of the type providers - the CSV type provider is probably a close match here. \$\endgroup\$ Commented Mar 28, 2016 at 3:56

2 Answers

2
\$\begingroup\$

The fact that a function named breakBy also reverses the order of the data violates the Principle of Least Surprise. In my opinion, it's a bug.

breakBy (Regex "^Name=").IsMatch can be thought of as a transformation to be applied to a list. Therefore, to facilitate currying, the parameters of breakBy should be swapped so that the predicate comes before the data.
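Both points can be sketched together. This is only one possible shape, not the original author's code: the inner helper name loop, the helper isName and the partially applied breakByName are made up for illustration. The predicate comes first, and groups are put back into input order before returning.

```fsharp
open System.Text.RegularExpressions

// Predicate first, data last, so the function can be partially applied.
let breakBy pattern lines =
    let rec loop remaining acc =
        match remaining with
        | [] ->
            // Lines and groups were accumulated back to front; restore input order.
            acc |> List.map List.rev |> List.rev
        | head :: tail ->
            match acc with
            | current :: finished when not (pattern head) ->
                // A non-matching line joins the group that is currently open.
                loop tail ((head :: current) :: finished)
            | _ ->
                // A matching line (or the very first line) opens a new group.
                loop tail ([head] :: acc)
    loop lines []

// Hypothetical helper: an explicit string annotation avoids overload ambiguity.
let isName (line: string) = Regex.IsMatch(line, "^Name=")

// Partial application yields a reusable "split into people" transformation.
let breakByName = breakBy isName
```

With this parameter order the function also fits into a pipeline, for example data |> breakBy isName.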

answered Mar 28, 2016 at 10:03
\$\endgroup\$
2
\$\begingroup\$
match pattern head with
| true -> ...
| false -> ...

I don't see any reason to use pattern matching here; an if expression would work the same and is simpler:

if pattern head then ...
else ...

I think the empty group at the end of the output shouldn't be there; get rid of it.
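As a rough sketch of those two changes applied to the original recursive function (keeping its parameter order and its reversed output, and assuming that a non-matching line seen before any match should simply open the first group):

```fsharp
let breakBy lines pattern =
    let rec breakByRec lines acc =
        match lines, acc with
        | [], _ -> acc
        | head :: tail, lastGroup :: rest ->
            if pattern head then
                // A matching line starts a new group in front of the others.
                breakByRec tail ([head] :: acc)
            else
                // Any other line is added to the most recent group.
                breakByRec tail ((head :: lastGroup) :: rest)
        | head :: tail, [] ->
            // No group exists yet, so the first line always opens one.
            breakByRec tail [[head]]
    // Starting from an empty accumulator avoids the trailing empty group.
    breakByRec lines []
```

Because the accumulator starts out empty, List.head and List.tail are no longer needed, and the extra [] entry disappears from the result.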


Since lines is a list, you can use List.foldBack to simplify the code:

let breakBy lines pattern =
    let processLine line (head, tail) =
        let head' = line::head
        if pattern line then
            ([], head'::tail)
        else
            (head', tail)
    snd <| List.foldBack processLine lines ([], [])

I'm assuming that the input always has to start with a matching line. If that's not necessarily true, the last line can't be just snd.
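If the input may start with non-matching lines, one option is to inspect the leftover first component instead of discarding it. The sketch below assumes that such leading lines should be kept as a group of their own; the name breakByKeepLeading is made up for illustration.

```fsharp
let breakByKeepLeading lines pattern =
    let processLine line (current, groups) =
        let current' = line :: current
        if pattern line then
            // A matching line closes the group collected so far.
            ([], current' :: groups)
        else
            (current', groups)
    match List.foldBack processLine lines ([], []) with
    | [], groups -> groups
    // Lines that precede the first match become a group of their own.
    | leading, groups -> leading :: groups
```

Dropping the leading lines instead would mean returning groups in both branches, which is what snd does.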

This version also preserves the order of the input, as 200_success suggested.

answered Mar 28, 2016 at 11:23
\$\endgroup\$
