
I have a large object graph in .NET (F#, as it happens) that I need to persist to disk and then load again periodically for use in a calculation.

The performance of deserializing (which will be performed many times) matters more than that of serializing (which will only be performed once), should that have a bearing on the answer.

Currently, I am using FsPickler with their binary format. This is very convenient and easy to use, but I am trying to get a handle on how much more performance I would get by customizing the serializer/deserializer...

One avenue I am considering is to persist and load from a small relational database (I have sqlite in mind). Should I expect this to be much faster?

Per the request in the comments, here is a slightly simplified version of the object graph that I am working with:


open System.Collections.Generic

type Value =
    | Float of float
    | String of string
    | Bool of bool

[<Struct>]
type Address (i: int, j: int, k: int) =
    member this.I = i
    member this.J = j
    member this.K = k

type Data = {
    Target : Address
    mutable SpecialIndex : int
    mutable Parameters1 : Value []
    mutable Parameters2 : Address []
    Check1 : bool
    Check2 : bool
    Parent : Address option
}

type Persisted = {
    Inputs : Address []
    Outputs : Address []
    Aliases : Dictionary<string, Address>
    Mapping : Dictionary<string, int>
    Masters : Dictionary<Address, Value[]>
    BigCollection : Data []
}
  1. The object that is persisted is an instance of Persisted.

  2. The large size most likely comes from Persisted.BigCollection being on the order of 10 million or more items in the array.
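
For reference, the current approach is roughly the following round-trip with FsPickler's binary serializer (a minimal sketch; the file path, the function names and the MBrace.FsPickler namespace are assumptions here, so adjust them to the package version actually in use):

open System.IO
open MBrace.FsPickler   // older releases expose the same API under Nessos.FsPickler

let serializer = FsPickler.CreateBinarySerializer()

// Serialize once; performance here is not critical.
let save (path: string) (value: Persisted) =
    use stream = File.Create path
    serializer.Serialize(stream, value)

// Deserialize many times; this is the hot path being optimized.
let load (path: string) : Persisted =
    use stream = File.OpenRead path
    serializer.Deserialize<Persisted>(stream)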

asked Nov 4, 2015 at 15:04
  • Have you measured the time it takes and determined that there is a material performance problem? Commented Nov 4, 2015 at 15:07
  • @RobertHarvey yes, I have profiled and the de-serialization time dominates the time taken to perform a calculation. My question above is an attempt to get guidance on where I should look should I choose to optimize and/or whether further material performance improvements should be expected. Commented Nov 4, 2015 at 15:11
  • Can you trade off space for speed? Show us what some of the object graph code looks like. Commented Nov 4, 2015 at 15:14
  • @RobertHarvey yes - would be prepared to. Commented Nov 4, 2015 at 15:16
  • Depending on how much performance improvement you need, you might try a faster serializer like Protocol Buffers. Commented Nov 4, 2015 at 15:32

1 Answer


One avenue I am considering is to persist and load from a small relational database (I have sqlite in mind). Should I expect this to be much faster?

No, you should not expect this. Though it is not completely impossible, in my experience using a relational database for deserializing an object graph is seldom quicker than deserializing from a file. Relational databases only help performance when you can play to their strengths, such as indexing capabilities or managing external data that is too big to be loaded into memory at once.

I am trying to get a handle on how much more performance I would get by customizing a serializer/deserializer.

Whatever serializer/deserializer you use, the upper limit (and often the bottleneck) for performance is the I/O throughput of your disk in bytes per second. So take the expected size in bytes of your serialized graph, divide it by that throughput, and you get a lower limit for the deserialization time. When the time your deserializer needs is close to that limit, the only reasonable way to increase performance is to use a faster disk (such as a modern SSD).
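
As a back-of-the-envelope illustration of that lower bound (the file size and disk throughput below are placeholder figures, not measurements from this question):

// Estimate the I/O-bound lower limit for deserialization time.
let fileSizeBytes      = 200L * 1024L * 1024L   // assumed ~200 MB serialized graph
let diskBytesPerSecond = 500L * 1024L * 1024L   // assumed ~500 MB/s sequential read

let lowerBoundSeconds = float fileSizeBytes / float diskBytesPerSecond
printfn "I/O-bound lower limit: %.2f s" lowerBoundSeconds

// If the measured deserialization time is far above this figure,
// the serializer's CPU work, not the disk, is the bottleneck.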

answered Nov 4, 2015 at 21:42
  • Thanks. Your second point makes a lotta sense! I have an SSD with "alleged" read speed of 728 MB/s. The example file I am testing on is 175 MB in size and is taking 6.5 s to deserialize... Sounds like there is room for improvement. Commented Nov 4, 2015 at 21:52
  • +1 for convincing me this question is answerable if interpreted the right way. I'd suggest editing the off-topic-looking title to match this answer, since this clearly helped the OP, it's about to be closed (as the title, by itself, is obviously not a good question), and I don't feel I understand the question well enough to edit the title myself. Commented Nov 7, 2015 at 13:07
