I am looking for the main points to consider when choosing between a JSON or JSONB array and the regular SQL postgres array. If you know that you will only be inserting one data type into an array, why would you use the SQL array over the JSON array? Thanks
3 Answers 3
I think this is mainly a matter of personal preference, so I guess your question will be closed soon.
If you deal with JSON values in your application anyway, going with JSON might be easier (but then you can easily convert a native array to a json array using to_jsonb()
).
There are a bit more functions for changing native arrays than there are for JSON arrays. So if you need to remove or find elements by value, native arrays might be easier. If those arrays contain integers, you can have even more functions for manipulating them through the intarray extension
And then of course there is the underlying question: wouldn't a properly normalized model be the best solution anyways, so that you avoid arrays completely?
We are using postgresql arrays and tried jsonb arrays. The data comes from external data source, which heavily uses artificial keys. Basically if you open the table as a human, great part of the table is not readable. Additionally there are many-many bridge tables with 1:n:1 relation, which can be simplified to simple arrays. Taken together analytic queries had >20 joins for some major business objects, without adding data (e.x. you want to analyze the fluctuance of your income compared to economic data). Proper queries with additional external data would be non-manageable and waay to long lasting. Thus we decided to leverage triggers to get rid of most lookup tables and some of the bridge tables, to increase human readability and decrease the amount of lookups and code complexity for analysts. The normal lookup tables are just written into fields, bridge tables are aggregated into arrays first and then put into single fields. Our optimized tables consume more storage space, but they are much faster to query and way easier to understand by analysts.
We also tried it with jsonb and jsonb arrays, but our approach is not final yet and jsonb arrays are more complex in handling. It is also possible to use postgresql arrays to store json/jsonb data, and approach is very similar. But no sufficient tests so far. My suggestion here: as long as you need one dimension, use normal arrays. BR.
As a new PostgresQL user, I searched the internet for answers regarding the same question you're asking. I read so many posts and so many different articles from so many different people. Still none of them gave me a satisfying answer. I was about to give up when finally I found someone who gave me the answer I was looking for.
The right question to ask here would not be 'When' but 'Why' use json instead of standard relational SQL queries.
The short answer is Performance and Flexibility
Now for the long answer
Most of us are used to Relational Database Management Systems (RDBMS) where each data is mostly separated by columns.
Note : PostgreSQL is NOT a NoSQL. It is a classical RDBMS but provide support for JSON/JSONb which in essence gives us best of the both world in a way
Consider this table orders
stored as SQL array in Postgres
id | first_name | last_name | product | qty
1 | Lily | Bush | Diaper | 24
2 | Josh | William | Toy Car | 1
3 | Mary | Clark | Toy Train | 2
In essence, this is okay for small or medium size databases. However, the larger your database, the more reduced the performance will be when querying huge datasets.
Now consider the same data stored as json in Postgres
Creating a new table with JSON data type
CREATE TABLE orders (
id serial NOT NULL PRIMARY KEY,
info json NOT NULL
);
The orders table consists of only two columns:
- The id column is the primary key column that identifies the order.
- The info column stores the data in the form of JSON.
Inserting JSON data
The following statement inserts multiple rows at the same time.
INSERT INTO orders (info)
VALUES('{ "customer": "Lily Bush", "items": {"product": "Diaper","qty": 24}}'),
('{ "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}'),
('{ "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}');
Format will look like this
id | info
1 | { "customer": "Lily Bush", "items": {"product": "Diaper","qty": 24}}
2 | { "customer": "Josh William", "items": {"product": "Toy Car","qty": 1}}
3 | { "customer": "Mary Clark", "items": {"product": "Toy Train","qty": 2}}
Less columns mean significantly more performance.
Working with JSON arrays is also more flexible and expandable.
JSON makes building realtime apps and listening to database updates very easy
Granted your tables will be hard to read for humans but for developers it increases productivity in my opinion. Now if you choose to store data as JSON, it does require you to unlearn a few bad habits you've learned from working with RDBMS. It does require a complete rewiring of your methodology of work. But once you've got the hang of it, I believe you will never go back.
Plus, you can create collections that can easily be used in apps that make heavy use of Javascript and Nodejs.
Example of a JSON collection
The data is fake
var usersjson = {
'type': 'usersCollection',
'users': [
{
'id': '23',
'date_created' :'Mon Jan 04 2021 18:55:33',
'dob' :'2021-02-05',//yyyy-mm-dd
'avatar' : 'avatar.jpg',
'first_name': 'George',
'last_name': 'Larosa',
'countrycode' :'1',
'phone' : '4075357795',
'email' : '[email protected]',
'address' : {
'timezone' : 'Eastern Standard Time',
'door_number': '1462',
'street': 'McDonald Avenue',
'appt': '',
'city': 'Orlando',
'state_province': 'Florida',
'zip_postal_code': '32810',
'country': 'United States',
},
'geometry': {
'type': 'Point',
'coordinates': [-81.670370, 28.853510]
},
},
{
'id': '24',
'date_created' :'Mon Jan 06 2021 18:55:33',
'dob' :'2021-02-05',//yyyy-mm-dd
'avatar' : 'avatar.jpg',
'first_name': 'Elizabeth',
'last_name': 'Newton',
'countrycode' :'1',
'phone' : '3055807405',
'email' : '[email protected]',
'timezone' : 'Eastern Standard Time',
'address' : {
'door_number': '1316',
'street': 'Monroe Avenue',
'appt': '',
'city': 'Fort Myers',
'state_province': 'Florida',
'zip_postal_code': '33901',
'country': 'United States',
},
'geometry': {
'type': 'Point',
'coordinates': [-81.869480, 26.640830]
},
}
]}
Resources
PostgresQL and manipulating JSON DATA
Articles
- Forget SQL vs NoSQL : https://blog.arctype.com/json-in-postgresql/-
17This doesn't answer the question. The question asked about
json(b)
vsarray
types notjson
vs standard relational model.Kaplan Ilya– Kaplan Ilya2022年03月27日 20:24:53 +00:00Commented Mar 27, 2022 at 20:24