I have a table of subway stops in NYC, and I am trying to select the points belonging to the ten closest subway stops to a given point. However, each subway 'stop' has multiple points associated with it, because each stop has several entrances.
The table looks like this:
ogc_fid | line | name | objectid | url | wkb_geometry |
---|---|---|---|---|---|
1 | 2-5 | Birchall Ave & Sagamore St at NW corner | 1734.0 | http://web.mta.info/nyct/service/ | POINT (-73.86835600032798 |
2 | 2-5 | Birchall Ave & Sagamore St at NE corner | 1735.0 | http://web.mta.info/nyct/service/ | POINT (-73.86821300022677 |
3 | 2-5 | Morris Park Ave & 180th St at NW corner | 1736.0 | http://web.mta.info/nyct/service/ | POINT (-73.87349900050798 |
4 | 2-5 | Morris Park Ave & 180th St at NW corner | 1737.0 | http://web.mta.info/nyct/service/ | POINT (-73.8728919997833 |
5 | 2-5 | Boston Rd & 178th St at SW corner | 1738.0 | http://web.mta.info/nyct/service/ | POINT (-73.87962300013866 |
6 | 2-5 | Boston Rd & E Tremont Ave at NW corner | 1739.0 | http://web.mta.info/nyct/service/ | POINT (-73.88000500027815 |
If I split the 'name' field on "at" and take the first slice, I can group all rows that belong to the same 'stop', but how do I then return all rows associated with the 10 closest groups (stops)?
For instance, when I use this query:
select wkb_geometry
from osm.subwaypoints s
where s.line like '%C%'
order by ST_Distance(ST_SetSRID( ST_Point( -73.96833020663865,40.68396650310555), 4326),s.wkb_geometry ) LIMIT 10
I get the nearest 10 entrances, but they only account for 2 actual stops.
1 Answer
The first thing to do is to give each 'stop' a unique name. As you mentioned, we can do this by splitting the name field on ' at ', something like:
ALTER TABLE osm.subwaypoints
ADD COLUMN station_na VARCHAR;
-- Keep only the part of the name before ' at ' (the intersection, without the corner)
UPDATE osm.subwaypoints
SET station_na = split_part(name, ' at ', 1);
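Before updating the real table, it can help to preview what splitting on ' at ' produces. The following is a minimal sketch using split_part on three names copied from the sample rows in the question; the inline VALUES list is only for illustration and is not part of the original answer.

```sql
-- Preview the station name derived from a few sample entrance names.
SELECT name,
       split_part(name, ' at ', 1) AS station_na
FROM (VALUES
        ('Birchall Ave & Sagamore St at NW corner'),
        ('Birchall Ave & Sagamore St at NE corner'),
        ('Boston Rd & 178th St at SW corner')
     ) AS t(name);
-- The first two rows both give 'Birchall Ave & Sagamore St',
-- the third gives 'Boston Rd & 178th St'.
```

If a name contains no ' at ', split_part returns the whole name unchanged, so such an entrance simply forms its own group.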
Then we go ahead and run the following query:
DROP TABLE IF EXISTS osm.subwaypoints_nearten;
CREATE TABLE osm.subwaypoints_nearten AS
SELECT a.*
FROM osm.subwaypoints a
INNER JOIN
(
    -- The 10 stations (not entrances) nearest to the point
    SELECT s.station_na
    FROM osm.subwaypoints s
    WHERE s.line LIKE '%C%'
    GROUP BY s.station_na
    ORDER BY ST_Distance(ST_SetSRID(ST_Point(-73.96833020663865, 40.68396650310555), 4326), ST_Collect(s.wkb_geometry))
    LIMIT 10
) AS b
ON a.station_na = b.station_na
WHERE a.line LIKE '%C%';
Here a subquery first gets the list of 'station_na' values for the 10 stops closest to our point. GROUP BY station_na does the work of keeping each stop's entrances together, and ST_Collect merges them into a single geometry, so the distance is measured to the nearest entrance of each stop. The outer query then selects the details of all points whose station name is in that list.
[Figure: Closest 10 '%C%' line stations]
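To check which stations the subquery picks, it can be run on its own with the distance and entrance count exposed. This is only a sketch: it assumes the geometry column is wkb_geometry (as in the question's table) stored with SRID 4326, in which case ST_Distance returns planar degrees rather than metres. The aliases entrances and dist_degrees are made up for illustration.

```sql
-- One row per station: number of entrances and distance to the nearest one.
SELECT s.station_na,
       count(*) AS entrances,
       ST_Distance(
           ST_SetSRID(ST_Point(-73.96833020663865, 40.68396650310555), 4326),
           ST_Collect(s.wkb_geometry)
       ) AS dist_degrees
FROM osm.subwaypoints s
WHERE s.line LIKE '%C%'
GROUP BY s.station_na
ORDER BY dist_degrees
LIMIT 10;
```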
Your query has a WHERE filter on line, so I've included it above. The same query without the line filter would be:
DROP TABLE IF EXISTS osm.subwaypoints_nearten_notc;
CREATE TABLE osm.subwaypoints_nearten_notc AS
SELECT a.*
FROM osm.subwaypoints a
INNER JOIN
(
    -- The 10 nearest stations on any line
    SELECT s.station_na
    FROM osm.subwaypoints s
    GROUP BY s.station_na
    ORDER BY ST_Distance(ST_SetSRID(ST_Point(-73.96833020663865, 40.68396650310555), 4326), ST_Collect(s.wkb_geometry))
    LIMIT 10
) AS b
ON a.station_na = b.station_na;
[Figure: Closest 10 stations of any line]
- Excellent! I wasn't aware of the ST_Collect method for aggregating geometries on a group by. Thanks! – jbogart, May 12, 2021 at 14:15
- Great, glad it worked! I should have mentioned 'ST_Collect' in the description. – Cushen, May 13, 2021 at 0:15
- For better performance, you could create a table of the stop point groups with a spatial index on the geometry column, and then use the KNN <-> operator. – dr_jts, May 13, 2021 at 21:15
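The KNN suggestion in the last comment could be sketched roughly as follows. This is not from the original thread: the table name osm.subway_stations and the index name are hypothetical, the station_na column is assumed to exist as created in the answer, and the line filter is left out for brevity.

```sql
-- One row per station, with all of its entrances collected into one geometry.
DROP TABLE IF EXISTS osm.subway_stations;
CREATE TABLE osm.subway_stations AS
SELECT station_na,
       ST_Collect(wkb_geometry) AS geom
FROM osm.subwaypoints
GROUP BY station_na;

-- Spatial (GiST) index so the <-> ordering can use an index scan.
CREATE INDEX subway_stations_geom_idx
    ON osm.subway_stations USING GIST (geom);

-- All entrances of the 10 nearest stations, ordered via the KNN operator.
SELECT p.*
FROM osm.subwaypoints p
INNER JOIN
(
    SELECT station_na
    FROM osm.subway_stations
    ORDER BY geom <-> ST_SetSRID(ST_Point(-73.96833020663865, 40.68396650310555), 4326)
    LIMIT 10
) AS n
ON p.station_na = n.station_na;
```

The idea is that the grouping is done once when the station table is built, rather than on every lookup, and the index-assisted ORDER BY ... LIMIT avoids computing the distance to every station.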