10

I have a database that stores a bunch of custom fields using hstore. In order to merge it into another database that doesn't support hstore, I'd like to split the keys into extra columns.

Users can add new custom fields and so I can't rely on knowledge of the keys ahead of time. Which makes the answer at "Attributes from an hstore-column as separate columns in a view?" not applicable to my problem.

Where a record doesn't have a key present in other records, it should get the same column with a null value.

How do I do this?

asked Mar 9, 2015 at 4:46
0

4 Answers 4

7

This can be done, very efficiently, too. Not in a single statement, though, since SQL demands to know the return type at call time. So you need two steps. The solution involves a number of advanced techniques ...

Assuming the same table as @Denver in his answer:

CREATE TABLE hstore_test (
 id serial PRIMARY KEY
, hstore_col hstore
);

Solution 1: Simple SELECT

After I wrote the crosstab solution below it struck me that a simple "brute force" solution is probably faster. Basically, the query @Denver already posted, built dynamically:

Step 1a: Generate query

SELECT format(
 'SELECT id, h->%s
 FROM (SELECT id, hstore_col AS h FROM hstore_test) t;'
 , string_agg(quote_literal(key) || ' AS ' || quote_ident(key), ', h->')
 ) AS sql 
FROM (
 SELECT DISTINCT key
 FROM hstore_test, skeys(hstore_col) key
 ORDER BY 1
 ) sub;

The subquery (SELECT id, hstore_col AS h FROM hstore_test) is just to get in the column alias h for your hstore column.

Step 1b: Execute query

This generates a query of the form:

SELECT id, h->'key1' AS key1, h->'key2' AS key2, h->'key3' AS key3
FROM (SELECT id, hstore_col AS h FROM hstore_test) t;

Result:

 id | key1 | key2 | key3
----+-------+-------+-------
 1 | val11 | val12 | val13
 2 | val21 | val22 |
 3 | | | -- for a row where hstore_col IS NULL

Solution 2: crosstab()

For lots of keys this may perform better. Probably not. You'll have to test. Result is the same as for solution 1.

You need the additional extension tablefunc which provides the crosstab() function. Read this first if you are not familiar:

Step 2a: Generate query

SELECT format(
 $s$SELECT * FROM crosstab(
 $$SELECT h.id, kv.*
 FROM hstore_test h, each(hstore_col) kv
 ORDER BY 1, 2$$
 , $$SELECT unnest(%L::text[])$$
 ) AS t(id int, %s text);
 $s$
 , array_agg(key) -- escapes strings automatically
 , string_agg(quote_ident(key), ' text, ') -- needs escaping!
 ) AS sql 
FROM (
 SELECT DISTINCT key
 FROM hstore_test, skeys(hstore_col) key
 ORDER BY 1
 ) sub;

Note the nested levels of dollar-quoting.

I use this explicit form in the main query instead of the short CROSS JOIN in the auxiliary query to preserve rows with empty or NULL hstore values:

LEFT JOIN LATERAL each(hstore_col) kv ON TRUE

Related:

Step 2b: Execute query

This generates a query of the form:

SELECT * FROM crosstab(
 $$SELECT h.id, kv.*
 FROM hstore_test h
 LEFT JOIN LATERAL each(hstore_col) kv ON TRUE
 ORDER BY 1, 2$$
 , $$SELECT unnest('{key1,key2,key3}'::text[])$$
 ) AS t(id int, key1 text, key2 text, key3 text);

You may want to inspect it for plausibility before running the first time. This should deliver optimized performance.

Notes

answered Dec 6, 2015 at 7:06
1
  • 1
    Thanks again! If anyone's interested, the cost for my use case for solution 1 was Planning time: 0.615 ms, Execution time: 2474.708 ms to generate the query. And Planning time: 0.534 ms, Execution time: 3956.939 ms to run it. And solution 2: Planning time: 5.079 ms, Execution time: 2420.472 ms to generate the query and Planning time: 0.034 ms, Execution time: 4009.264 ms to run it. So not a lot of difference between the two for a table with 1255409 rows and 27 unique keys in the hstore_col. Commented Dec 7, 2015 at 23:46
5

I realize I'm a bit late—and by now you sure have it figured out—but, seeing the comment you left on Denver Timothy's answer, I thought I would leave an answer for everybody else:

select (each(hstore_col)).key from hstore_test;

This will create a row for each key contained in hstore_col, so you won't need to know what the keys are beforehand.

answered Dec 4, 2015 at 21:56
0
3

You'll want to use the -> operator on the column (see here).

Records without the same key in other records will show as NULL.

create table hstore_test (id serial, hstore_col hstore);
insert into hstore_test (hstore_col) values ('key1=>val11, key2=>val12, key3=>val13'), ('key1=>val21, key2=>val22');
select hstore_col->'key1' as key1, hstore_col->'key2' as key2, hstore_col->'key3' as key3 from hstore_test;
┌───────┬───────┬───────┐
│ key1 │ key2 │ key3 │
├───────┼───────┼───────┤
│ val11 │ val12 │ val13 │
│ val21 │ val22 │ NULL │
└───────┴───────┴───────┘
(2 rows)

Here is a similar answer.

answered Mar 9, 2015 at 8:15
0
3

Based on s.m.'s answer it seems you can combine EACH() with both .key and .value to generate columns automatically.

Setup test table like in Ian Timothy's answer:

create table hstore_test (id serial, hstore_col hstore);
insert into hstore_test (hstore_col) values ('key1=>val11, key2=>val12, key3=>val13'), ('key1=>val21, key2=>val22');

Query

SELECT id, (EACH(hstore_col)).key, (EACH(hstore_col)).value FROM hstore_test;

which returns

 id | key | value 
----+------+-------
 1 | key1 | val11
 1 | key2 | val12
 1 | key3 | val13
 2 | key1 | val21
 2 | key2 | val22

I was a bit unsure whether two EACH() inside the same query are guaranteed to produce the same order. I found an example in the hstore documentation using the same mechanism so I guess it's explicitly supported.

answered Apr 21, 2020 at 15:24
1
  • Thanks! This is a good addition. I can't remember why I selected Erwin's answer over s.m.'s but it might have been because of this? It looks much simpler anyway Commented Apr 22, 2020 at 5:08

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.