Inaccurate output (missing features) while reading a shapefile into networkx

Question 1

I am doing some work with networkx, which involves the conversion of a point and polyline shapefile into a graph with nodes and links. The documentation about the read_shp() method states that it:

Generates a networkx DiGraph from shapefiles. Point geometries are translated into nodes, lines into edges. Coordinate tuples are used as keys. Attributes are preserved, line geometries are simplified into start and end coordinates. Accepts a single shapefile or directory of many shapefiles.

My first observation is that the method does not read all the features in the shapefiles. For example, my nodes shapefile contains 31760 points, but the resulting node list has only 4991 entries. I observed the same behaviour for the lines shapefile. The short snippet below illustrates this.

import networkx as nx
G = nx.read_shp("Sample_nodes.shp", simplify=False)
print(len(G.nodes()))
# output: 4991 instead of 31760

Is there something I am missing?

Question 2

I think you'd be better off reporting your issue here: github.com/networkx/networkx/issues. Also see if you can find any useful info on the github page, if you haven't done so already: github.com/networkx/networkx

Question 3

It is not complicated.

nx.read_shp uses ogr to read the shapefile (see nx_shp.py, lines 78-88). The script then adds each point to the graph, which stores its nodes in a dictionary keyed by coordinates (lines 87-88, net.add_node((g.GetPoint_2D(0)), attributes)).

The first part of the script, using ogr only:

 from osgeo import ogr

 shp = ogr.Open("network_pts.shp")
 for lyr in shp:
     # Field names defined in the layer schema
     fields = [x.GetName() for x in lyr.schema]
     for f in lyr:
         # Attribute values of the current feature, in schema order
         flddata = [f.GetField(f.GetFieldIndex(x)) for x in fields]
         g = f.geometry()
         attributes = dict(zip(fields, flddata))
         attributes["ShpName"] = lyr.GetName()
         # Note: Using layer level geometry type
         print("{} {}".format(g.GetPoint_2D(0), attributes))
 (204097.29746070135, 89662.23095525998) {'ShpName': 'network_pts', 'type': 'one'}
 (204168.65175332528, 89745.26602176542) {'ShpName': 'network_pts', 'type': 'two'}
 (204110.75574365177, 89765.58041112455) {'ShpName': 'network_pts', 'type': 'three'}
 (204220.19951632406, 89794.7823458283) {'ShpName': 'network_pts', 'type': 'three'}
 (204097.29746070135, 89662.23095525998) {'ShpName': 'network_pts', 'type': 'one-bis'}

The shapefile contains 5 features: "one" and "one-bis" share the same coordinates, and two other features share the same type ("three").

With read_shp:

 G = nx.read_shp("network_pts.shp")
 G.number_of_nodes()
 4

Why?

print(G.node)
{(204097.29746070135, 89662.23095525998): {'ShpName': 'network_pts', 'type': 'one-bis'}, (204110.75574365177, 89765.58041112455): {'ShpName': 'network_pts', 'type': 'three'}, (204220.19951632406, 89794.7823458283): {'ShpName': 'network_pts', 'type': 'three'}, (204168.65175332528, 89745.26602176542): {'ShpName': 'network_pts', 'type': 'two'}}

And for the points with the same coordinates:

print(G.node[(204097.29746070135, 89662.23095525998)])
{'ShpName': 'network_pts', 'type': 'one-bis'}

Only one point was retained, the last one, because inserting into a dictionary with a key that already exists overwrites the earlier value:

net = nx.DiGraph()
net.node[(204097.29746070135, 89662.23095525998)] = {'ShpName': 'network_pts', 'type': 'one'}
net.node[(204097.29746070135, 89662.23095525998)] = {'ShpName': 'network_pts', 'type': 'one-bis'}
print(net.node)
{(204097.29746070135, 89662.23095525998): {'ShpName': 'network_pts', 'type': 'one-bis'}}
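
To estimate in advance how many nodes a coordinate-keyed graph can hold, it may help to count how many features share each coordinate. The following is a minimal sketch using only the standard library. The `features` list is copied by hand from the ogr output above, and `count_shared_coords` is a hypothetical helper name, not part of networkx or ogr.

```python
from collections import Counter

# (coordinate, attributes) pairs copied from the ogr output above.
features = [
    ((204097.29746070135, 89662.23095525998), {"type": "one"}),
    ((204168.65175332528, 89745.26602176542), {"type": "two"}),
    ((204110.75574365177, 89765.58041112455), {"type": "three"}),
    ((204220.19951632406, 89794.7823458283), {"type": "three"}),
    ((204097.29746070135, 89662.23095525998), {"type": "one-bis"}),
]


def count_shared_coords(features):
    """Hypothetical helper: return {coordinate: count} for coordinates
    that are used by more than one feature."""
    counts = Counter(coord for coord, _ in features)
    return {coord: n for coord, n in counts.items() if n > 1}


shared = count_shared_coords(features)
unique_coords = {coord for coord, _ in features}

print(len(features))       # number of features read from the file
print(len(unique_coords))  # number of distinct coordinate keys
print(shared)              # coordinates that collapse into a single node
```

For the sample data this reports 5 features, 4 distinct coordinates, and one coordinate shared by two features, which matches the 4 nodes returned by read_shp.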

Question 4

So it removes duplicate features, ensuring that only unique ones are used to build the network.
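
If the attributes of the overwritten features matter, one possible workaround is to group them by coordinate before building the graph, so that each node can carry a list of records instead of only the last one. This is only a sketch under assumptions: `group_by_coord` is a hypothetical helper, the sample records are copied from the answer above, and read_shp itself does not behave this way.

```python
from collections import defaultdict

# Sample records copied from the answer above; two share one coordinate.
records = [
    ((204097.29746070135, 89662.23095525998), {"type": "one"}),
    ((204168.65175332528, 89745.26602176542), {"type": "two"}),
    ((204097.29746070135, 89662.23095525998), {"type": "one-bis"}),
]


def group_by_coord(records):
    """Hypothetical helper: collect every attribute dict under its coordinate."""
    grouped = defaultdict(list)
    for coord, attributes in records:
        grouped[coord].append(attributes)
    return dict(grouped)


nodes = group_by_coord(records)
for coord, attribute_list in nodes.items():
    # One line per future node, listing the types of all features found there
    print(coord, [a["type"] for a in attribute_list])
```

Each key could then be added to the graph once, for example with net.add_node(coord, features=attribute_list), where `features` is an arbitrary attribute name chosen for this illustration.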

Question 5

But does this not affect routing on that network, since different routes may use a particular node more than once?

Question 6

In graph theory, a node is unique but can have many edges.
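
The idea that one node can serve many edges, and therefore many routes, can be illustrated without any library. The sketch below builds a plain adjacency dictionary from a few made-up edges. The node labels A to D and the helpers `build_adjacency` and `is_valid_route` are assumptions for illustration only.

```python
from collections import defaultdict

# Made-up undirected edges; node "B" takes part in all three.
edges = [("A", "B"), ("B", "C"), ("B", "D")]


def build_adjacency(edges):
    """Hypothetical helper: map each node to the set of its neighbours."""
    adjacency = defaultdict(set)
    for start, end in edges:
        adjacency[start].add(end)
        adjacency[end].add(start)
    return adjacency


def is_valid_route(route, adjacency):
    """Check that every consecutive pair of nodes in the route is an edge."""
    return all(b in adjacency[a] for a, b in zip(route, route[1:]))


adjacency = build_adjacency(edges)
routes = [["A", "B", "C"], ["A", "B", "D"]]

print(len(adjacency))          # "B" is stored once, like every other node
print(sorted(adjacency["B"]))  # yet it is connected to three neighbours
print(all(is_valid_route(r, adjacency) for r in routes))
```

Both routes pass through "B" even though "B" exists only once as a key, so storing each coordinate as a single node does not by itself prevent several routes from using it.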

gene · Accepted Answer · 2016-09-19 08:50:04Z
