A lot of my work is done with .csv extracts (reports) from databases. As I have been programming in Clojure, I've received comments that relying on vector indexes creates dependencies. I understand why, and concur.
I am rewriting one of my programs to take advantage of the fact that each report's first row contains the column headings, so I can fetch the data I want from each row by map key. The new code turns the headings into keywords and uses zipmap to pair them with one row of data at a time. Here is an example.
(def bene-csv-inp (fetch-csv-data "benetrak_roster.csv"))
(def bene-csv-cols (first bene-csv-inp))
(def bene-csv-data (rest bene-csv-inp))
(def zm1 (zipmap
           (map #(keyword %1) bene-csv-cols)
           (first bene-csv-data)))
(zm1 :EmploymentStartDate)
"21-Jun-82"
Does a higher level of abstraction exist, and if so, what is it, that would let my code avoid hard-coding :EmploymentStartDate? If my code has to know these keys, how is that not also a dependency, just like an index?
Personally, I like going after the data with map keys, because it's less confusing and more informative than indexes. However, I believe I still have a dependency.
Thanks.
1 Answer
Well, you're no longer dependent on the order of the columns in the data, but you're now dependent on the column names. If someone adds a column to an extract, even if it's in the middle, you're better off, because your code won't break. If someone deletes or renames a column, you'll still have an issue. If it's more likely that someone will rename a column than delete or move it, you'd be better off sticking with indexes. I do agree that it's usually easier to understand code that refers to (hopefully) meaningful names than raw column numbers. It should make debugging easier too, since if a column is missing you can report "No such column named 'EmploymentStartDate'" rather than "Missing column: 23".
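As a rough sketch of that last point, something like the following would turn every row into a map and report a missing column by name. The helper names rows->maps and get-col and the sample data are made up for illustration; they are not part of your program or of any library, and your fetch-csv-data is assumed to return a sequence of rows with the header row first.

```clojure
;; Hypothetical helper: pairs the header row with each data row.
(defn rows->maps
  "Turns a header row plus data rows into a sequence of maps keyed by column name."
  [[header & rows]]
  (let [ks (map keyword header)]
    (map #(zipmap ks %) rows)))

;; Hypothetical helper: fails loudly, naming the missing column.
(defn get-col
  "Looks up column k in a row map; throws with the column name if it is absent."
  [row k]
  (if (contains? row k)
    (get row k)
    (throw (ex-info (str "No such column named '" (name k) "'")
                    {:column k :available (keys row)}))))

;; Made-up sample standing in for the parsed CSV (header row first).
(def sample-csv
  [["EmployeeName" "EmploymentStartDate"]
   ["Doe, Jane" "21-Jun-82"]])

(def sample-rows (rows->maps sample-csv))

(get-col (first sample-rows) :EmploymentStartDate) ;=> "21-Jun-82"
```

The dependency on column names is still there, but it is confined to the keys you pass to get-col, and a renamed or deleted column fails with a message that says which name was expected.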
Thanks. It is well worth rewriting my small program, because the indexes were beginning to confuse me, and I wrote the code. So, that's a bad sign. If the report blows up due to a column rename, I can deal with that. Again, thanks. – octopusgrabbus, Aug 7, 2012 at 19:43