I have a database of approximately 300K records. For each of these records there are up to 1600 related properties. Each of these properties has a simple key=>value format. This data is very static and nearly all operations will be reads.
Below are the options I'm thinking about to solve this problem.
1) A column for each property
This falls apart quickly as the max number of columns available in postgres is ~1600. So really, this isn't an option.
2) Create an hstore or JSONB column to store the key/value pairs.
This seems to work but I've read there are practical limits to how much data should be stored in hstore or JSONB. There are a lot of key/values and I think both hstore and JSONB will store these in TOAST which could have a negative impact on performance.
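To make option 2 concrete, here is a minimal sketch. The table name record_props, the column names, and the key 'color' are made up for illustration. The last query is one way to see how large the stored values actually get.

```sql
-- Hypothetical table: one row per record, all properties in one jsonb value.
CREATE TABLE record_props (
    record_id bigint PRIMARY KEY,
    props     jsonb NOT NULL
);

-- Read a single property for one record ('color' is a made-up key).
SELECT props ->> 'color' AS color
FROM record_props
WHERE record_id = 42;

-- List the largest stored values. pg_column_size reports the number of
-- bytes used to store the value, after any compression.
SELECT record_id, pg_column_size(props) AS stored_bytes
FROM record_props
ORDER BY stored_bytes DESC
LIMIT 10;
```

PostgreSQL considers a row for TOAST once it grows past roughly 2 kB, so with up to 1600 pairs per record most of these values would probably be stored out of line. Reading a single key then means fetching the whole value first.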
What potential issues/practical limits should I be aware of if using approach #2?
What are some other approaches to solving this problem?
1 Answer
You could create a table to hold the key/value pairs like this:
CREATE TABLE KEY_VALUE (
ID BIGINT, -- THIS COULD BE A FKEY TO YOUR '300K' RECORD TABLE
KEY VARCHAR,
VALUE VARCHAR
);
As long as this table is indexed properly (an index on id/key, I imagine), getting any key/value pairs you are interested in should still be very quick, even if the table holds millions of rows. Granted, whether this is a viable option really depends on what kind of data you expect to store in your key/value pairs (is it all text? numbers? etc.). Perhaps adding a third column to say what type of data the value holds would help.
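A rough sketch of the index and the lookups it is meant to serve, using the table above. The index name, the example id 42, the key 'color', and the value_type column are assumptions for illustration only.

```sql
-- Composite index so lookups by id, or by id and key, avoid a full scan.
-- Unquoted identifiers fold to lower case, so KEY_VALUE is key_value.
CREATE INDEX key_value_id_key_idx ON key_value (id, key);

-- Fetch one property of one record.
SELECT value
FROM key_value
WHERE id = 42 AND key = 'color';

-- Fetch every property of one record.
SELECT key, value
FROM key_value
WHERE id = 42
ORDER BY key;

-- Optional third column naming the data type of each value
-- (hypothetical; e.g. 'text', 'numeric', 'date').
ALTER TABLE key_value ADD COLUMN value_type varchar;
```

If each (id, key) pair occurs only once, the index could be declared as a primary key or unique index instead, which also guards against duplicate keys.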
EDIT
Just to be clear, by the way: I figured this might work given the sheer number of different key->value pairs you said you would have. If it were a small number, I would probably just have a "details" type of table where each value was stored in its own column.
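For comparison, a "details" table for a small, fixed set of properties might look something like this. The table name and every column in it are invented for the example.

```sql
-- Hypothetical details table: one typed column per property.
CREATE TABLE record_details (
    record_id bigint PRIMARY KEY,  -- could reference the main record table
    color     varchar,
    weight_kg numeric,
    released  date
);

-- Each property is read as an ordinary column.
SELECT color, weight_kg
FROM record_details
WHERE record_id = 42;
```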
-
Is this essentially an EAV model? – TehNrd, Jan 22, 2015 at 17:16
-
New acronym for me ... (looks it up) Yes, that is the gist of what I was suggesting. I have no idea what Dezso may have been commenting about (with being brave). I have never used the above model. – Joishi Bodio, Jan 22, 2015 at 20:58
-
Aaron's post summarizes the possible pitfalls very well. With careful design and planning, this may be a viable solution. It usually has relatively high development costs, as far as I can tell, so using some ready-to-use structures like hstore or jsonb can mean faster delivery. YMMV, as nearly always. – András Váczi, Jan 22, 2015 at 21:56
... hstore column and see how this performs. jsonb, if you have typed data (i.e., of non-text type). You can also try EAV, if you are brave.