Reshape array in PostgreSQL

Question 1

I am using a PostgreSQL 10 server, in which I have some tables containing attributes of type double precision[] intended for storing 1D arrays of data, of length ~1000. My code interacting with the database contained a bug that inserted data as an array of length-1 arrays (see my psycopg bug report where I eventually worked out that this is what was happening), but this was permitted by the DBMS:

The current implementation does not enforce the declared number of dimensions either. Arrays of a particular element type are all considered to be of the same type, regardless of size or number of dimensions. So, declaring the array size or number of dimensions in CREATE TABLE is simply documentation; it does not affect run-time behavior.

How can I fix these misshapen arrays within the DBMS (i.e., using SQL rather than updating from external code)? I essentially wish to just flatten each array into a 1D array as originally intended.

A snippet of the array data as returned by psql is:

{{2.20751909662576},{2.20679071024511},{2.20615506273571},{2.2055910715332},{2.
20507756148068},{2.20459435596551},{2.20412336646322},{2.20364958013081},{2.2031
618420891},{2.20265334228114},{2.20212174686242},{2.20156895755458},{2.201000532
12516},{2.20042484573123},{2.19985210697606},{2.19929335742568},{2.1987595765431
},{2.19826098783408},{2.19780662274576},{2.19740415504621},{2.19705997866427},{2
.19677947299938},{2.1965673849541},{2.19642825612809},{2.1963668334181},{2.19638 ...

Question 2

Show us the code that inserted the data and how the data currently looks like e.g. the output of a select statement of that table as formatted text please, no screen shots

Question 3

@a_horse_with_no_name example of data now added. I don't think the inserting code will help as it's using opaque numpy arrays and pandas dataframes that I'm in the process of debugging.

Question 4

Something like this should work:

update the_table 
 set data = x.data
from (
 select t.id, array_agg(d.x order by d.idx) as data
 from the_table t, unnest(data) with ordinality as d(x, idx)
 group by t.id
) x
where x.id = the_table.id;

Sample psql output:

postgres=> create table the_table (id integer primary key, data double precision[]);
CREATE TABLE
postgres=> insert into the_table values
postgres-> (1, '{{2.20751909662576},{2.20679071024511},{2.20615506273571},{2.2055910715332},{2.20507756148068},{2.20459435596551},{2.20412336646322},{2.20364958013081}}'),
postgres-> (2, '{{2.19780662274576},{2.19740415504621},{2.19705997866427},{2.19677947299938},{2.1965673849541},{2.19642825612809},{2.1963668334181}}');
INSERT 0 2
postgres=> select *
postgres-> from the_table;
 id | data
----+----------------------------------------------------------------------------------------------------------------------------------------------------------
 1 | {{2.20751909662576},{2.20679071024511},{2.20615506273571},{2.2055910715332},{2.20507756148068},{2.20459435596551},{2.20412336646322},{2.20364958013081}}
 2 | {{2.19780662274576},{2.19740415504621},{2.19705997866427},{2.19677947299938},{2.1965673849541},{2.19642825612809},{2.1963668334181}}
(2 rows)
postgres=> update the_table
postgres-> set data = x.data
postgres-> from (
postgres(> select t.id, array_agg(d.x order by d.idx) as data
postgres(> from the_table t, unnest(data) with ordinality as d(x, idx)
postgres(> group by t.id
postgres(> ) x
postgres-> where x.id = the_table.id;
UPDATE 2
postgres=> select *
postgres-> from the_table;
 id | data
----+------------------------------------------------------------------------------------------------------------------------------------------
 1 | {2.20751909662576,2.20679071024511,2.20615506273571,2.2055910715332,2.20507756148068,2.20459435596551,2.20412336646322,2.20364958013081}
 2 | {2.19780662274576,2.19740415504621,2.19705997866427,2.19677947299938,2.1965673849541,2.19642825612809,2.1963668334181}
(2 rows)
postgres=>

Question 5

Much nicer than my hacky serialization/deserialization via strings approach, thanks. I'll test it now.

Question 6

Works beautifully. Out of interest, is the use of WITH ORDINALITY essential? Are no guarantees made about the order of unnested array data without it?

Question 7

An order is never guaranteed without an order by and the only way to do an array_agg() with an order by is if the original array index is available.

user1822user1822 · Accepted Answer · 2018-02-16 22:21:33Z

Something like this should work:

update the_table 
 set data = x.data
from (
 select t.id, array_agg(d.x order by d.idx) as data
 from the_table t, unnest(data) with ordinality as d(x, idx)
 group by t.id
) x
where x.id = the_table.id;

Sample psql output:

postgres=> create table the_table (id integer primary key, data double precision[]);
CREATE TABLE
postgres=> insert into the_table values
postgres-> (1, '{{2.20751909662576},{2.20679071024511},{2.20615506273571},{2.2055910715332},{2.20507756148068},{2.20459435596551},{2.20412336646322},{2.20364958013081}}'),
postgres-> (2, '{{2.19780662274576},{2.19740415504621},{2.19705997866427},{2.19677947299938},{2.1965673849541},{2.19642825612809},{2.1963668334181}}');
INSERT 0 2
postgres=> select *
postgres-> from the_table;
 id | data
----+----------------------------------------------------------------------------------------------------------------------------------------------------------
 1 | {{2.20751909662576},{2.20679071024511},{2.20615506273571},{2.2055910715332},{2.20507756148068},{2.20459435596551},{2.20412336646322},{2.20364958013081}}
 2 | {{2.19780662274576},{2.19740415504621},{2.19705997866427},{2.19677947299938},{2.1965673849541},{2.19642825612809},{2.1963668334181}}
(2 rows)
postgres=> update the_table
postgres-> set data = x.data
postgres-> from (
postgres(> select t.id, array_agg(d.x order by d.idx) as data
postgres(> from the_table t, unnest(data) with ordinality as d(x, idx)
postgres(> group by t.id
postgres(> ) x
postgres-> where x.id = the_table.id;
UPDATE 2
postgres=> select *
postgres-> from the_table;
 id | data
----+------------------------------------------------------------------------------------------------------------------------------------------
 1 | {2.20751909662576,2.20679071024511,2.20615506273571,2.2055910715332,2.20507756148068,2.20459435596551,2.20412336646322,2.20364958013081}
 2 | {2.19780662274576,2.19740415504621,2.19705997866427,2.19677947299938,2.1965673849541,2.19642825612809,2.1963668334181}
(2 rows)
postgres=>

Much nicer than my hacky serialization/deserialization via strings approach, thanks. I'll test it now.
Works beautifully. Out of interest, is the use of WITH ORDINALITY essential? Are no guarantees made about the order of unnested array data without it?
An order is never guaranteed without an order by and the only way to do an array_agg() with an order by is if the original array index is available.

Stack Exchange Network

Reshape array in PostgreSQL

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Reshape array in PostgreSQL

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions