5

I'd like to merge linestrings which touch, but do not cross. The linestrings share the same SRID and come from a PostgreSQL 9.2/PostGIS 1.5 database. Within the table I query are several line strings, not just those I want to merge, but also others.

For example, consider lines from the OpenStreetMap database. I start with a temporary table that has all linestring which are tagged as being part of the power grid, together with the respective voltages. However, not all parts of the grid that belong together are also represented as a single (multi-) linestring, they are within the database the way they were inserted. So, I need to find out which lines touch, do not cross, and share the same voltages, and finally merge them. The result will be another, shorter list containing merged lines and those which did not need to get merged with others.

I've come up with an algorithm in C++ (Qt), which works, but is slow:

for (QHash<qlonglong, MapItem *>::iterator current = gridItems.begin();
 current != gridItems.end(); current++) {
 for (QHash<qlonglong, MapItem *>::iterator other = current;
 other != gridItems.end(); other++) {
 if (other.value() == current.value()) {
 continue;
 }
 const OGRGeometry *firstGeom = current.value()->geometry();
 const OGRGeometry *itGeom = other.value()->geometry();
 if (current.value()->tag("voltages") == other.value()->tag("voltages")
 && firstGeom->Touches(itGeom)
 && !firstGeom->Crosses(itGeom)) {
 // Merge:
 MapItem *mergedItem = new MapItem(firstGeom->Union(itGeom));
 mergedItem->tags(current.value()->tags());
 *current = mergedItem;
 *other = mergedItem;
 }
 }
}

Basically, it does the following: Given a list of pointers to geometries, it creates two iterators within two loops. The second iterator always starts as a copy of the first; then both travel until the end of the list. If the geometries the two pointers point to touch, but do not cross (and also share some tags), they are united.

To illustrate what it does, I've drawn a very simple picture featuring five line strings which could be a possible result of my example query outlined above. In this picture, only the three grey ones are merged:

LineMerger: The grey line strings are merged, the blue ones not.

I now want to improve this approach by (a) possibly improving the algorithm and (b) putting the strain on the database server, not the slower client machine.

I've two options here, but I do not know which one is the best or which actually works:

  1. Create a SQL query which does this. However, I do not know how. (Self-join? Aggregates?) If someone could suggest a query that does this, I'd be absolutely happy.
  2. Create a PostgreSQL C extension that implements my algorithm. If so, I'm happy for any suggestions of how to improve the algorithm or improve it by enabling parallel processing.
asked Oct 11, 2013 at 15:37

3 Answers 3

3
answered Oct 11, 2013 at 18:49
2
  • My similar solution had a lot more temp tables and procedures ( but i had to do it in MS SQL ) Commented Oct 11, 2013 at 19:10
  • Thanks! Took me some time to wrap my head around it, but that was definitively the way to go! Commented Nov 4, 2013 at 18:28
0

See ST_LineMerge. Example:

SELECT ST_AsText(ST_LineMerge(
ST_GeomFromText('MULTILINESTRING((-29 -27,-30 -29.7,-36 -31,-45 -33),(-45 -33,-46 -32))')
 )
);
st_astext
--------------------------------------------------------------------------------------------------
LINESTRING(-29 -27,-30 -29.7,-36 -31,-45 -33,-46 -32)
(1 row)
answered Oct 11, 2013 at 16:44
1
  • Thanks; however, ST_LineMerge merges lines I already know need merging. But I've got a collection of lines which I first have to examine for possible merges. I.e., the if clause of my algorithm. Sorry if that wasn't clear; I'll edit my original post to make it clearer. Commented Oct 11, 2013 at 17:13
0

SELECT ST_Linemerge(ST_Collect(geom)) as geom . voltage FROM table as x Group by voltage , that would try merge all lines with same voltage , or if it fails it would create or if it fails it would create one multilinestring per voltage.

Another way would be creating simple topology using PgRouting Grass example and writing something similar to this parent-child recursive query. Then use source and target column data to solve longest lines . This assumes that you want join only lines from start -> start , start -> end or end-> end . (Pgrouting may have some useful functions already done)

parent-child query is pretty simple, limit initial data to one voltage, start "walk" to tree. if call ends to situation where it would split to two paths , store existing line and start walk two new lines. I have done similar implementation to solve still connected "sinks" in network when one or more connection edges are taken out.

Performance in recursive SQL is not that good, but probably it fast enough. After you have longest lines and network only from line start -> end points, you can start to solve intersecting and touching problems ( do you need to split lines and start to create connectivity between them...)

If SQL sound too hard and you have money to spend, i recommend FME (ww.safe.com) It's nice toolbox to own when you do GIS ( file type conversion and data manipulation )

answered Oct 11, 2013 at 19:07

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.