Is it safe in Haskell to save a data structure to a file using "show", and retrieve it using "read"?
Say I have the following types:
type EndsTup = (Int,Int)
-- (0-based index from start or end, Frequency)
type FreqTup = (Char, [EndsTup], [EndsTup])
-- (Character, Freqs from start, Freqs from end)
type FreqData = [FreqTup]
-- 1 entry in the list for each letter
Which I'm using to store data regarding the frequency of a character's position in a word. If I want to then save this to file, is it safe (as in guaranteed not to corrupt if it's written without error) to convert the structure to a string using show, and then read it using something like:
readData <- readFile filePath
let reconstructed = read readData :: FreqData
I'm just asking because that seems "too easy". Is this how it's typically done?
-
show and read dictate in their type signatures what they do. If you can come up with the right input for either one of them, their behaviour is well defined. The part about the file has nothing to do with what show and read do; if you use a file to store and retrieve the necessary [Char], show and read will generate and operate on it just as their type signatures dictate. – Jimmy Hoffa, Jul 16, 2014 at 20:09
2 Answers
It's perfectly safe to do it this way, especially since, as Doval pointed out, all the instances of Show and Read that you're using are in the standard library, and are thus guaranteed to cooperate with each other without any difficulty.
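A minimal sketch of that round trip, assuming the types from the question. The file name freq.txt, the sample value, and the helper names serialize, deserialize, saveData and loadData are made up for illustration. It uses readMaybe from Text.Read instead of read, so a damaged or hand-edited file yields Nothing rather than a runtime exception.

```haskell
import Text.Read (readMaybe)

type EndsTup  = (Int, Int)
type FreqTup  = (Char, [EndsTup], [EndsTup])
type FreqData = [FreqTup]

-- Serialize with show; every component type has a standard Show instance.
serialize :: FreqData -> String
serialize = show

-- Parse with readMaybe, which returns Nothing on malformed input
-- instead of throwing the way read does.
deserialize :: String -> Maybe FreqData
deserialize = readMaybe

-- Hypothetical sample value, used only for illustration.
sample :: FreqData
sample = [('a', [(0, 12), (1, 7)], [(0, 3)]), ('b', [(0, 5)], [])]

saveData :: FilePath -> FreqData -> IO ()
saveData path = writeFile path . serialize

loadData :: FilePath -> IO (Maybe FreqData)
loadData path = deserialize <$> readFile path

main :: IO ()
main = do
  saveData "freq.txt" sample          -- hypothetical file name
  loaded <- loadData "freq.txt"
  -- Compare what came back with what was written.
  print (loaded == Just sample)
```

One thing to keep in mind: readFile is lazy, so if the program later writes back to the same file, make sure the contents have been fully consumed first.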
However, it's not the generally recommended way to do it, simply because saving to human-readable (and Haskell-readable) text is nowhere near as efficient as a tag-based binary format. For this, you can use the binary package, which already includes instances for all of the datatypes that you're using in the question.
More to the point, though, using the binary package to serialize to and deserialize from disk is even easier than using Show and Read! (Just use the functions encodeFile and decodeFile, or decodeFileOrFail, from the Data.Binary module.) In general, you will have to get used to things in Haskell being much simpler, at least in the common cases, than they typically are in most imperative languages. :)
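A rough sketch of what that could look like, assuming the binary package is available (it ships with GHC) and reusing the types from the question. The file name freq.bin, the sample value, and the roundTrip helper are made up for illustration.

```haskell
import Data.Binary (decode, decodeFileOrFail, encode, encodeFile)

type EndsTup  = (Int, Int)
type FreqTup  = (Char, [EndsTup], [EndsTup])
type FreqData = [FreqTup]

-- Hypothetical sample value, used only for illustration.
sample :: FreqData
sample = [('a', [(0, 12), (1, 7)], [(0, 3)]), ('b', [(0, 5)], [])]

-- In-memory round trip through a lazy ByteString, without touching disk.
roundTrip :: FreqData -> FreqData
roundTrip = decode . encode

main :: IO ()
main = do
  encodeFile "freq.bin" sample        -- hypothetical file name
  result <- decodeFileOrFail "freq.bin"
  case result of
    -- On failure we get the byte offset and an error message.
    Left (offset, err) ->
      putStrLn ("decode failed at byte " ++ show offset ++ ": " ++ err)
    -- Comparing against sample also fixes the decoded type to FreqData.
    Right loaded -> print (loaded == sample)
```

decodeFileOrFail is used here instead of decodeFile so that a truncated or corrupted file shows up as a Left value rather than an exception.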
-
Thanks. Is binary the best encoding? Wouldn't a format with a higher base require less space? (Unless I'm thinking about it incorrectly) – Carcigenicate, Jul 17, 2014 at 16:34
-
@Carcigenicate What do you mean by "base"? Remember that, to the computer, the text is also binary, so using text is just using binary but less efficiently, by definition. – Ptharien's Flame, Jul 17, 2014 at 20:56
-
When you said binary, I was picturing it literally being stored in the file as 1s and 0s, which I assumed would be inefficient. Is it actually stored as a bytestring (which could be expressed in fewer characters/less space)? – Carcigenicate, Jul 17, 2014 at 21:11
-
Yes, it is actually stored directly as a bytestring. :) – Ptharien's Flame, Jul 17, 2014 at 21:17
I haven't tried it, but all signs point to "yes", as long as you use the derived instances.
The Prelude has this to say about Show:
Derived instances of Show have the following properties, which are compatible with derived instances of Read:
- The result of show is a syntactically correct Haskell expression containing only constants, given the fixity declarations in force at the point where the type is declared. It contains only the constructor names defined in the data type, parentheses, and spaces. When labelled constructor fields are used, braces, commas, field names, and equal signs are also used.
- If the constructor is defined to be an infix operator, then showsPrec will produce infix applications of the constructor.
- the representation will be enclosed in parentheses if the precedence of the top-level constructor in x is less than d (associativity is ignored). Thus, if d is 0 then the result is never surrounded in parentheses; if d is 11 it is always surrounded in parentheses, unless it is an atomic expression.
- If the constructor is defined using record syntax, then show will produce the record-syntax form, with the fields given in the same order as the original declaration.
And it has this to say about Read:
Derived instances of Read make the following assumptions, which derived instances of Show obey:
- If the constructor is defined to be an infix operator, then the derived Read instance will parse only infix applications of the constructor (not the prefix form).
- Associativity is not used to reduce the occurrence of parentheses, although precedence may be.
- If the constructor is defined using record syntax, the derived Read will parse only the record-syntax form, and furthermore, the fields must be given in the same order as the original declaration.
- The derived Read instance allows arbitrary Haskell whitespace between tokens of the input string. Extra parentheses are also allowed.
Emphasis mine.
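To make those properties concrete, here is a small sketch with derived instances. The Entry and Chain types and the parseEntry/parseChain helpers are hypothetical, invented only to exercise the record-syntax and infix-constructor rules quoted above.

```haskell
import Text.Read (readMaybe)

-- Hypothetical record type, for illustration only.
data Entry = Entry { letter :: Char, count :: Int }
  deriving (Show, Read, Eq)

-- Hypothetical type with an infix constructor.
infixr 5 :+:
data Chain = End | Int :+: Chain
  deriving (Show, Read, Eq)

entry :: Entry
entry = Entry { letter = 'a', count = 3 }

chain :: Chain
chain = 1 :+: 2 :+: End

parseEntry :: String -> Maybe Entry
parseEntry = readMaybe

parseChain :: String -> Maybe Chain
parseChain = readMaybe

main :: IO ()
main = do
  putStrLn (show entry)               -- Entry {letter = 'a', count = 3}
  -- Whatever derived show produces, derived read accepts.
  print (parseEntry (show entry) == Just entry)
  print (parseChain (show chain) == Just chain)
  -- Extra whitespace and parentheses are accepted.
  print (parseEntry " ( Entry { letter = 'a' , count = 3 } ) ")
  -- The prefix (non-record) form and reordered fields are rejected.
  print (parseEntry "Entry 'a' 3")
  print (parseEntry "Entry {count = 3, letter = 'a'}")
```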
-
The OP is not asking specifically about derived instances. – user39685, Jul 16, 2014 at 18:54
-
@MattFenwick If derived instances aren't involved then it'd depend on how show and read are defined, so the question is impossible to answer in general. Also, the OP's example uses only standard Haskell types, so I assume they all use the derived instances. – Doval, Jul 16, 2014 at 19:25