I'm trying to reverse geocode some points using an OSM import into PostGIS. I imported the OSM data (for Denmark so all addresses should be avaialable) into PostGIS using osm2pgsql enabling "addr:street" and "addr:housenumber". All the addresses should now be in the planet_osm_point table
.
My points table that I want to reverse geocode is created using
update tower set the_geom = st_transform(st_setsrid(st_makepoint(xcoord,
ycoord),4326),4326)
When trying to use the following code I get some very weird results where as far as I can see the addresses I end up with are either random but still within the general area or maybe displaced by up to a few km:
update tower set nearest_number = t."addr:housenumber", nearest_street = t."addr:street"
from
(
select "addr:housenumber", "addr:street", pk from planet_osm_point, tower
where st_dwithin(st_transform(planet_osm_point.way, 4326), tower.the_geom, 0.01)
order by st_transform(planet_osm_point.way,4326) <-> tower.the_geom ASC
) t
where tower.pk=t.pk;
In the attached image I have labeled the points with the result of the reverse geocoding, and as an example I have highligthed a point which has gotten an address that is located longer to the north, somewhat beyond the boundary. enter image description here Anyone have any ideas as to what I'm doing wrong?
-
why not use postgis.net/docs/Reverse_Geocode.html ?Mapperz– Mapperz ♦2015年11月09日 16:18:50 +00:00Commented Nov 9, 2015 at 16:18
-
Isn't Tiger Geocoder only for USA?JonasPedersen– JonasPedersen2015年11月09日 16:41:37 +00:00Commented Nov 9, 2015 at 16:41
-
Not if you have standardised your addresses postgis.net/docs/Address_Standardizer.htmlMapperz– Mapperz ♦2015年11月09日 17:34:39 +00:00Commented Nov 9, 2015 at 17:34
-
That's the first time I've heard about that and I'll definitely look into that. But I would also really like to know what I've done wrong in the example aboveJonasPedersen– JonasPedersen2015年11月09日 17:54:27 +00:00Commented Nov 9, 2015 at 17:54
2 Answers 2
It's easy to get unexpected effects running an UPDATE
that involves a join. Deep in the PostgreSQL docs for UPDATE, you can find the following warning:
When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the from_list, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.
Since your original query is potentially generating many street matches for each point, you're updating the point with a random one of those matches. The fact that you're ordering the subquery doesn't help, unfortunately.
You can fix the problem by putting a DISTINCT
in your FROM
clause to make sure that you only get one match or, perhaps more directly, putting a LIMIT 1
after your ORDER BY
.
A nice way to avoid the problem (that isn't much help here) is to pull your join out of the FROM
and turn it into an expression after SET
, ie:
UPDATE tower
SET nearest_id = (SELECT id FROM planet_osm_point.way
WHERE ST_DWithin(tower.geom, way.geom, 0.01)
ORDER BY ST_Distance(tower.geom, way.geom) ASC
LIMIT 1);
This method has the advantage that your query will fail if your subquery returns more than one row. But if you need to fetch more than one column from your subquery, I don't think there's a way to take advantage of this.
I'm not really sure why this works, and I would love for someone to provide a real answer with the reasoning behind, but using the following code seems to work:
update tower set nearest_number = t."addr:housenumber", nearest_street = t."addr:street"
from
(
select distinct st_transform(planet_osm_point.way,4326) <-> tower.the_geom, "addr:housenumber", "addr:street", pk from planet_osm_point, tower
where st_dwithin(st_transform(planet_osm_point.way, 4326), tower.the_geom, 0.01)
order by st_transform(planet_osm_point.way,4326) <-> tower.the_geom ASC
) t
where tower.pk=t.pk;
EDIT
The statement below makes a lot more sense as it uses the select distinct on
expression, on the column pk
, which is a serialised unique integer. So for each unique pk
value I get the nearest point from planet_osm_point
.
Inspired by bostongis.
update tower set nearest_number = t."addr:housenumber", nearest_street = t."addr:street"
from
(
select distinct on(tower.pk) tower.pk, "addr:housenumber", "addr:street"from planet_osm_point, tower
where st_dwithin(st_transform(planet_osm_point.way, 4326), tower.the_geom, 0.01)
order by st_transform(planet_osm_point.way,4326) <-> tower.the_geom ASC
) t
where tower.pk=t.pk;
Explore related questions
See similar questions with these tags.