I'm looking for a way to eliminate duplicates from a PostgreSQL array while preserving the ordering of its elements. What I currently have are the following functions:
create function array_unique( anyarray )
returns anyarray immutable strict language sql as $$
select array( select distinct unnest( 1ドル ) ); $$;
create function array_unique_sorted( anyarray )
returns anyarray immutable strict language sql as $$
select array( select distinct unnest( 1ドル ) order by 1 ); $$;
/* ### TAINT there ought to be a simpler, declarative solution */
create function array_unique_stable( text[] )
returns text[] immutable strict parallel safe language plpgsql as $$
declare
R text[] = '{}';
¶element text;
begin
foreach ¶element in array 1ドル loop
if not array[ ¶element ] && R then
R := R || array[ ¶element ];
end if;
end loop;
return R; end; $$;
In the above, array_unique
takes an array of any type and returns a copy
with all duplicates removed; their relative ordering is arbitrary.
array_unique_sorted
is like array_unique
, but the elements are sorted
relative to each other; this is sometimes useful as all arrays with the
same set of of distinct elements will compare equal after being
normalized by this function.
array_unique_stable
already does what I'm looking for: given an array
(which in this exampe must be a text[]
array), it scans elements from
left to right; whenever it encounters a previously unseen element, it adds
that one to the result. Thus, only the first occurrence of each value
is kept.
However, the implementation has some drawbacks: First, there seems to be no
way to write it so it accepts the pseudo-type anyarray
.
Second, while
the first two functions are written in SQL, they may presumably be inlined,
array_unique_stable
is written in PL/pgSQL, and so it cannot be inlined.
Third, it bugs me that I couldn't come up with a solution in pure SQL...
-
1If you need to do something like that, using an array was probably the wrong choice.user1822– user18222018年07月06日 05:36:59 +00:00Commented Jul 6, 2018 at 5:36
-
1I do not think so. Using arrays, sorting things, keeping elements distinct are different and very frequent requirements; any subset of the three and all three together should be achievable in straightforward ways, IMHOJohn Frazer– John Frazer2018年07月06日 06:10:00 +00:00Commented Jul 6, 2018 at 6:10
1 Answer 1
This can indeed be done using pure SQL:
create function array_unique_stable(p_input anyarray)
returns anyarray immutable strict parallel safe
language sql
as
$$
select array_agg(t order by x)
from (
select distinct on (t) t,x
from unnest(p_input) with ordinality as p(t,x)
order by t,x
) t2;
$$
The unnest(p_input) with ordinality
will return the original index of the element in the array which is then used to aggregate them back in the outer query.
select array_unique_stable(array['a','x','x','b']) as text_array,
array_unique_stable(array[10,1,1,5,8,8]) as int_array
returns
text_array | int_array
-----------+-----------
{a,x,b} | {10,1,5,8}
and
-
2Thanks indeed! I realize I was soooo close before I threw my code away and rewrote it in PL/pgSQL... FWIW there's an essentially identical answer over at stackoverflow.com/a/42399297/7568091 which I have since found; in other news, there are lots of incorrect answers to similar questions here on DBA and on over on the general SE, so I guess it's good to have another correct answer here.John Frazer– John Frazer2018年07月06日 06:07:01 +00:00Commented Jul 6, 2018 at 6:07
Explore related questions
See similar questions with these tags.