6

I'm trying to reverse geocode some points using an OSM import into PostGIS. I imported the OSM data (for Denmark so all addresses should be avaialable) into PostGIS using osm2pgsql enabling "addr:street" and "addr:housenumber". All the addresses should now be in the planet_osm_point table. My points table that I want to reverse geocode is created using

update tower set the_geom = st_transform(st_setsrid(st_makepoint(xcoord, ycoord),4326),4326)

When trying to use the following code I get some very weird results where as far as I can see the addresses I end up with are either random but still within the general area or maybe displaced by up to a few km:

update tower set nearest_number = t."addr:housenumber", nearest_street = t."addr:street" from ( select "addr:housenumber", "addr:street", pk from planet_osm_point, tower where st_dwithin(st_transform(planet_osm_point.way, 4326), tower.the_geom, 0.01) order by st_transform(planet_osm_point.way,4326) <-> tower.the_geom ASC ) t where tower.pk=t.pk;

In the attached image I have labeled the points with the result of the reverse geocoding, and as an example I have highligthed a point which has gotten an address that is located longer to the north, somewhat beyond the boundary. enter image description here Anyone have any ideas as to what I'm doing wrong?

asked Nov 9, 2015 at 16:05
4
  • why not use postgis.net/docs/Reverse_Geocode.html ? Commented Nov 9, 2015 at 16:18
  • Isn't Tiger Geocoder only for USA? Commented Nov 9, 2015 at 16:41
  • Not if you have standardised your addresses postgis.net/docs/Address_Standardizer.html Commented Nov 9, 2015 at 17:34
  • That's the first time I've heard about that and I'll definitely look into that. But I would also really like to know what I've done wrong in the example above Commented Nov 9, 2015 at 17:54

2 Answers 2

3

It's easy to get unexpected effects running an UPDATE that involves a join. Deep in the PostgreSQL docs for UPDATE, you can find the following warning:

When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the from_list, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.

Since your original query is potentially generating many street matches for each point, you're updating the point with a random one of those matches. The fact that you're ordering the subquery doesn't help, unfortunately.

You can fix the problem by putting a DISTINCT in your FROM clause to make sure that you only get one match or, perhaps more directly, putting a LIMIT 1 after your ORDER BY.

A nice way to avoid the problem (that isn't much help here) is to pull your join out of the FROM and turn it into an expression after SET, ie:

UPDATE tower
 SET nearest_id = (SELECT id FROM planet_osm_point.way
 WHERE ST_DWithin(tower.geom, way.geom, 0.01)
 ORDER BY ST_Distance(tower.geom, way.geom) ASC
 LIMIT 1);

This method has the advantage that your query will fail if your subquery returns more than one row. But if you need to fetch more than one column from your subquery, I don't think there's a way to take advantage of this.

answered Nov 12, 2015 at 11:55
2

I'm not really sure why this works, and I would love for someone to provide a real answer with the reasoning behind, but using the following code seems to work:

update tower set nearest_number = t."addr:housenumber", nearest_street = t."addr:street" from ( select distinct st_transform(planet_osm_point.way,4326) <-> tower.the_geom, "addr:housenumber", "addr:street", pk from planet_osm_point, tower where st_dwithin(st_transform(planet_osm_point.way, 4326), tower.the_geom, 0.01) order by st_transform(planet_osm_point.way,4326) <-> tower.the_geom ASC ) t where tower.pk=t.pk;

EDIT

The statement below makes a lot more sense as it uses the select distinct onexpression, on the column pk, which is a serialised unique integer. So for each unique pkvalue I get the nearest point from planet_osm_point. Inspired by bostongis.

update tower set nearest_number = t."addr:housenumber", nearest_street = t."addr:street" from ( select distinct on(tower.pk) tower.pk, "addr:housenumber", "addr:street"from planet_osm_point, tower where st_dwithin(st_transform(planet_osm_point.way, 4326), tower.the_geom, 0.01) order by st_transform(planet_osm_point.way,4326) <-> tower.the_geom ASC ) t where tower.pk=t.pk;

answered Nov 10, 2015 at 10:02

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.