
This may be more of a SQL fundamentals question but I'm sure somebody on here has encountered this before.

I have a database of geospatial data (around 4 million points) using PostgreSQL/PostGIS.

I would like to add a value to each row indicating how many of the other points in the table are within a specified range (5 km) of that row's point.

SELECT count(geom) 
FROM geom_points
WHERE ST_DistanceSphere(geom, ST_SetSRID(ST_MakePoint(long, lat),4326)) < 5000

The query above works for a single row, but I would like to run it for all rows and store the result on each row as a count.

I'm not really worried about how long it will take, as it is a query I intend to run once per year.

How do I achieve this?

geozelot
asked Dec 3, 2021 at 12:07

3 Answers


Add a column and update with a self-join:

ALTER TABLE <table>
 ADD COLUMN <neighbor_count> INT
;
UPDATE <table> AS t
 SET <neighbor_count> = (
 SELECT COUNT(s.*) - 1
 FROM <table> AS s
 WHERE ST_DWithin(t.geom, s.geom, <distance>)
 )
;

Here

  • ST_DWithin is the better choice, as it implements an index lookup natively
  • COUNT(s.*) - 1 subtracts the current point from the count; this is cheaper than excluding by <id>
  • <distance> refers to your distance in units of the given CRS; see below

This procedure is efficient only when it can use a spatial index. If you are using a geographic reference system, the proximity search also requires either a suitable projection or the GEOGRAPHY type in order to use meter-based units.
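To make the unit issue concrete, here is a minimal sketch comparing the same ST_DWithin call on GEOMETRY in EPSG:4326, where the distance is read as degrees, and on GEOGRAPHY, where it is read as meters. The two sample points are made up for illustration.

```sql
-- Two hypothetical points on the equator, 1 degree of longitude (roughly 111 km) apart.
SELECT
    ST_DWithin(a, b, 5000)                       AS within_as_geometry,   -- 5000 degrees: true
    ST_DWithin(a::geography, b::geography, 5000) AS within_as_geography   -- 5000 meters: false
FROM (
    SELECT ST_SetSRID(ST_MakePoint(0, 0), 4326) AS a,
           ST_SetSRID(ST_MakePoint(1, 0), 4326) AS b
) AS pts;
```

On plain lon/lat geometries a threshold of 5000 matches every pair of points, which is why the cast (or a projection) matters here.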

Since your data seems to be referenced in EPSG:4326, I'd suggest adding a functional index on a CAST to GEOGRAPHY to get results within your lifetime:

CREATE INDEX ON <table>
 USING GIST( (geom::GEOGRAPHY) ) -- double parens!
;
VACUUM ANALYZE <table>;

and run

UPDATE <table> AS t
 SET <neighbor_count> = (
 SELECT COUNT(s.*) - 1
 FROM <table> AS s
 WHERE ST_DWithin(t.geom::GEOGRAPHY, s.geom::GEOGRAPHY, 5000)
 )
;
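As a quick sanity check before running this on 4 million rows, the same UPDATE can be tried on a tiny throwaway table. This is only a sketch: the table name demo_points and its three rows are hypothetical, and it assumes the PostGIS extension is installed.

```sql
-- Hypothetical demo table (name and rows are illustrative only); requires PostGIS.
CREATE TEMP TABLE demo_points (
    id             int PRIMARY KEY,
    geom           geometry(Point, 4326),
    neighbor_count int
);

INSERT INTO demo_points (id, geom) VALUES
    (1, ST_SetSRID(ST_MakePoint(0.00, 0.00), 4326)),
    (2, ST_SetSRID(ST_MakePoint(0.01, 0.00), 4326)),  -- about 1.1 km from point 1
    (3, ST_SetSRID(ST_MakePoint(1.00, 0.00), 4326));  -- about 111 km from point 1

-- Same correlated subquery as above, with the 5000 m threshold on geography.
UPDATE demo_points AS t
SET neighbor_count = (
    SELECT COUNT(s.*) - 1          -- subtract the point itself
    FROM demo_points AS s
    WHERE ST_DWithin(t.geom::geography, s.geom::geography, 5000)
);

SELECT id, neighbor_count FROM demo_points ORDER BY id;
-- Points 1 and 2 each get 1, point 3 gets 0.
```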
answered Dec 3, 2021 at 13:18

You can make use of a lateral join. The trick is to use the same table twice: once for the source, and once for counting nearby points.

Be aware that ST_DistanceSphere doesn't use indexes. Instead, you can cast your points to geography, create an index on the geographies, and use ST_DWithin.

create index geogidx on geom_points USING gist((geom::geography));
select a.*, sub.cnt
from geom_points a,
lateral (select count(*) as cnt 
 from geom_points b
 where st_dwithin(a.geom::geography, b.geom::geography,5000)
 and a.geo_id <> b.geo_id -- Optionally prevent a point from counting itself
 ) sub;
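One way to see the difference between the two predicates is to compare query plans for a single probe point. This sketch assumes the geom_points table and the geogidx index from above; the probe coordinates are arbitrary, and the exact plan depends on your PostgreSQL/PostGIS versions and table statistics (on a very small table the planner may choose a sequential scan either way).

```sql
-- Distance computed per row: there is no indexable spatial condition here,
-- so expect a sequential scan over geom_points.
EXPLAIN
SELECT count(*)
FROM geom_points
WHERE ST_DistanceSphere(geom, ST_SetSRID(ST_MakePoint(0, 0), 4326)) < 5000;

-- Same search via ST_DWithin on geography: if the functional index is picked up,
-- the plan should mention geogidx (index scan or bitmap index scan).
EXPLAIN
SELECT count(*)
FROM geom_points
WHERE ST_DWithin(geom::geography,
                 ST_SetSRID(ST_MakePoint(0, 0), 4326)::geography,
                 5000);
```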
answered Dec 3, 2021 at 13:09

You would typically do this with a MATERIALIZED VIEW and a self join. Because the computation is slow, you run it once into the materialized view and then query the view for your results. First, storing lat and long as separate columns on the row is a bad idea: the values don't make any sense without an SRID, and they are more difficult to index and query. So let's fix that.

Sample data (where you're at)

CREATE TABLE foo AS
SELECT 1 AS id, 1::float AS lat, 2::float AS long;

Changing to a geography point on the table:

BEGIN;
 ALTER TABLE foo ADD COLUMN geog geography;
 UPDATE foo SET geog = ST_MakePoint(long,lat)::geography;
 ALTER TABLE foo DROP COLUMN lat, DROP COLUMN long;
 CREATE INDEX ON foo USING gist ( geog );
COMMIT;

Now, to get what you want, use a query with ST_DWithin. Note that this will use the index, and that the self join also matches each point with itself, so the count includes the point itself.

SELECT f1.*, count(*) FROM foo AS f1
INNER JOIN foo AS f2 ON ST_DWithin(f1.geog, f2.geog, 5000)
GROUP BY f1.id, f1.geog;

But that may not be fast enough to run in production. To solve that, you can create a MATERIALIZED VIEW:

CREATE MATERIALIZED VIEW foo_count AS
 SELECT f1.*, count(*) FROM foo AS f1
 INNER JOIN foo AS f2 ON ST_DWithin(f1.geog, f2.geog, 5000)
 GROUP BY f1.id, f1.geog;

Now, once a year, you just need to run:

REFRESH MATERIALIZED VIEW foo_count;

The advantage being you don't have to remember how you did it, and it creates a discrete layer separating your data from your generated view.
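Reading the counts back is then an ordinary SELECT. A minimal sketch, assuming the foo_count view exactly as defined above (so its columns are id, geog and count); the neighbor_count alias is just an illustrative name:

```sql
-- Assumes the foo_count materialized view defined above (columns: id, geog, count).
-- The self join pairs every point with itself, so subtract 1
-- to count only the other points within 5000 m.
SELECT id, "count" - 1 AS neighbor_count
FROM foo_count
ORDER BY id;
```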

answered Dec 22, 2021 at 17:52
