I tried to search here thoroughly but did not find any answers.
I have a PostgreSQL database, which has two main tables:
- documents
- users
These two tables have different relations. A user can:
- like
- bookmark to read later
- save
... a document.
The question is how should I save these relations?
In my experience with MySQL, the obvious way was to create tables for these many to many relations, containing user_id
and document_id
.
But as we are using PostgreSQL and it has the amazing JSON support, we are thinking maybe the better approach is to have a user_document
table, which contains user_id
, document_id
and a JSON column containing all the relations.
The JSON would be something like this:
{
'follow' : {'date' : 1523517140, 'doesFollow' : 't'},
'bookmark' : {'date' : null, 'doesBookmark' : 'f'},
....
}
I have little to zero experience with PostgreSQL and I don't know about the performance of querying on JSONB columns. And I don't know if this approach makes sense at all in PostgreSQL. But it seems OK and if there is nothing wrong with it, maybe it is preferable to the first, normal, approach.
1 Answer 1
Document types like json
or jsonb
(or xml
or hstore
) are convenient to store, well, documents. Ideal for data with varying keys and rarely updated and not-too-complex filter criteria in queries. Manipulating big documents a lot inside the database would be an anti-pattern.
Structured data that is often written in small increments, and probably searched a lot (like in your case most likely) are much more efficient with a normalized design - regarding storage, performance, concurrent write access and data integrity. Hence the advice in the manual:
JSON data is subject to the same concurrency-control considerations as any other data type when stored in a table. Although storing large documents is practicable, keep in mind that any update acquires a row-level lock on the whole row. Consider limiting JSON documents to a manageable size in order to decrease lock contention among updating transactions. Ideally, JSON documents should each represent an atomic datum that business rules dictate cannot reasonably be further subdivided into smaller datums that could be modified independently.
Implement your n:m relationship with one ore more junction tables. You can enforce data integrity with constraints (PK, FK, UNIQUE, CHECK, ..), none of which is easily possible with document types. I would lean towards a single table unless you have substantially differing requirements for "like", "bookmark" etc.
Example
If you are unsure about the typical layout, here are the basics:
Assuming that each of your relation types can only be used once per user/doc combination. For just a hand full of values (3 in your case) I would use a 1-byte "char"
column as PK of a lookup table to optimize storage and performance in tables and indexes.
CREATE TABLE reltype (
reltype "char" PRIMARY KEY
, relation_type text UNIQUE NOT NULL
);
INSERT INTO reltype(reltype, relation_type) VALUES
('l', 'like')
, ('b', 'bookmark')
, ('s', 'save');
CREATE TABLE user_doc (
user_doc_id int PRIMARY KEY GENERATED ALWAYS AS IDENTITY
, user_id int REFERENCES users ON UPDATE CASCADE ON DELETE CASCADE
, document_id int REFERENCES documents ON UPDATE CASCADE ON DELETE CASCADE
, reltype "char" REFERENCES reltype NOT NULL DEFAULT 'l'
, CONSTRAINT user_document_pkey UNIQUE(user_id, document_id, reltype)
);
You can always export aggregated data as JSON documents. Even have a VIEW
or MATERIALIZED VIEW
to read like a table. But don't manage relationships inside a single big JSON document.
-
the Idea is these relations will become more than these 3 and maybe their requirements will differ, consider a 4th relation
view
. It needs acount
to record how many times the user has viewed it. Or document can be a video. So we will need to save the portion of video user has watched int itsview
relation. So we thought we can save these data as JSONB so we don't have to worry about database changes in future.Hamid Sajjadi– Hamid Sajjadi2018年04月12日 13:27:24 +00:00Commented Apr 12, 2018 at 13:27 -
@hamidsajjadi: The example design is open to more relationship types. You might add varying additional attributes in a
jsonb
column inuser_doc
. Or create multiple distinct junction tables. It depends on the complete situation and exact requirements.Erwin Brandstetter– Erwin Brandstetter2018年04月12日 15:56:40 +00:00Commented Apr 12, 2018 at 15:56
Explore related questions
See similar questions with these tags.
the obvious way was to create tables for these many to many relations, containing user_id and document_id
is the way to go!