Correcting addresses with Python or Model Builder

Question 1

I am trying to write a script in Python (or a model in Model Builder in ArcGIS) for correcting addresses.

I have a point shapefile with the addresses (it has X and Y coordinates, fields for "number" and "street name", and a unique locality code) and a line shapefile with the street names and the same unique locality code. We assume that there are no errors in the streets shapefile.

Examples of errors I have to find:

in the same locality I can't have two points with the same number on the same street
no major leaps between numbers (for example the number 32 between 12 and 16)
only even numbers on one side of the street and only odd numbers on the other side
isolated numbers on the same street (e.g. 2-3 km away from the next point on the same street)
etc.

I'll have to divide it into many sub-problems and try to fix them separately.

Do you have any suggestions about how I should proceed?

Comment

Welcome to the site ryan. Could you please refine this into a single answerable question? Remember, you can always ask more questions if you cannot find the answer in the archives. As it stands, this question is likely to be closed as too broad. You can find more details on how to ask GIS SE questions here: gis.stackexchange.com/help

Answer

First, I'm skipping "only even numbers on a side of the street and only odd numbers on the other side" because the code for it might get a bit complicated, and because I'm not sure how to accomplish it. For the rest, let's break these down one at a time. Python is my method of choice.

in the same locality I can't have two points with the same number on the same street

For this you will want a SearchCursor and a collection of the records already seen. The cursor I use (arcpy.da.SearchCursor) requires ArcGIS 10.1 or later. Older versions of ArcGIS have a different cursor type (arcpy.SearchCursor).

Try this:

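from __future__ import print_function  # lets print() behave the same in Python 2 and 3
import arcpy

shpfile = r"shape\file\full\path"
LocalityFldName = "locality"
NumberFldName = "num"
StreetFldName = "street"

seen = set()
flds = [LocalityFldName, NumberFldName, StreetFldName]
# The with-block releases the cursor and its locks when the loop finishes
with arcpy.da.SearchCursor(shpfile, flds) as cursor:
    for row in cursor:
        # One key per (locality, number, street) combination;
        # a tuple avoids type errors when the locality code is numeric
        key = (row[0], row[1], row[2])
        if key not in seen:
            seen.add(key)
        else:
            print("Found double record")
            print(row[0], row[1], row[2])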
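
To try the duplicate logic without ArcGIS, here is a minimal sketch that runs on plain tuples. The function name find_duplicates and the sample rows are made up for illustration. In practice the rows would come from the cursor above.

```python
from collections import Counter


def find_duplicates(rows):
    """Return the (locality, number, street) keys that occur more than once."""
    counts = Counter((loc, num, street) for loc, num, street in rows)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical sample rows: (locality code, house number, street name)
sample_rows = [
    ("L01", 12, "Main St"),
    ("L01", 14, "Main St"),
    ("L01", 12, "Main St"),  # same locality, number and street as the first row
    ("L02", 12, "Main St"),  # same number and street, but a different locality
]

duplicates = find_duplicates(sample_rows)
print(duplicates)
```

Counting every key first, rather than printing on the second sighting, reports each duplicated address once no matter how many copies exist.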

no major leaps between numbers (for example the number 32 between 12 and 16)

There is probably no perfect way to do this, given how varied the spatial relationships between features can be. Here is one method that gets closer, and that may or may not work for you. In the code below, I first build a list of street names from the street name field using a cursor. I then iterate through this list and create two feature layers per street. In the first, I select one feature at a time with select by attribute. In the second, I select that feature's neighbors with select by location. If exactly three features are selected (the feature itself plus two neighbors), I check whether both neighboring numbers are smaller than the number of the feature being checked. The three-feature test will not catch every case, so adjust this logic to see what works best for your data.

from __future__ import print_function  # lets print() behave the same in Python 2 and 3
import arcpy

shpfile = r"shape\file\full\path"
LocalityFldName = "locality"
NumberFldName = "num"
StreetFldName = "street"

arcpy.env.overwriteOutput = True
flds = [LocalityFldName, NumberFldName, StreetFldName, "OID@"]

# Get list of unique street names with a search cursor
streetsli = []
with arcpy.da.SearchCursor(shpfile, [StreetFldName]) as cursor:
    for row in cursor:
        if row[0] not in streetsli:
            streetsli.append(row[0])

for street in streetsli:
    # Restrict both layers to the current street (apostrophes in the name are doubled for SQL)
    sql = '"' + StreetFldName + "\" = '" + street.replace("'", "''") + "'"
    # Feature layer to iterate through features
    arcpy.MakeFeatureLayer_management(shpfile, "lyr", sql)
    # Feature layer for neighbor selection
    arcpy.MakeFeatureLayer_management(shpfile, "neighborlyr", sql)
    OIDFld = arcpy.Describe("lyr").OIDFieldName
    # Read the rows up front so changing the selection below cannot disturb the iteration
    rows = [row for row in arcpy.da.SearchCursor("lyr", flds)]
    for row in rows:
        # Select single feature
        sql = '"' + OIDFld + '" = ' + str(row[3])
        arcpy.SelectLayerByAttribute_management("lyr", "NEW_SELECTION", sql)
        # Select neighbors (for point data, also pass a search_distance as the fourth argument)
        arcpy.SelectLayerByLocation_management("neighborlyr", "INTERSECT", "lyr")
        # Check if three features are selected (selection feature plus two neighbors)
        if int(arcpy.GetCount_management("neighborlyr").getOutput(0)) == 3:
            # Get the two neighboring street numbers, skipping the feature itself by OID
            with arcpy.da.SearchCursor("neighborlyr", [NumberFldName, "OID@"]) as ncursor:
                vals = [nrow[0] for nrow in ncursor if nrow[1] != row[3]]
            # Check if the street number is greater than both neighboring street numbers
            if len(vals) == 2 and row[1] > vals[0] and row[1] > vals[1]:
                print("Something strange with", row[0], row[1], street)
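
The comparison itself can also be tested on its own. The sketch below is not the method above: it assumes you already have the house numbers of one street sorted by their position along the street, which the code above does not produce. The function name find_leaps and the tolerance parameter are hypothetical.

```python
def find_leaps(numbers, tolerance=4):
    """Flag numbers that fall well outside the range spanned by their two neighbors.

    numbers   -- house numbers of one street, ordered along the street (assumed)
    tolerance -- how far outside the neighbors' range a number may fall (assumed)
    Returns a list of (index, number) pairs.
    """
    flagged = []
    # The first and last numbers have only one neighbor, so they are skipped
    for i in range(1, len(numbers) - 1):
        low = min(numbers[i - 1], numbers[i + 1])
        high = max(numbers[i - 1], numbers[i + 1])
        if numbers[i] < low - tolerance or numbers[i] > high + tolerance:
            flagged.append((i, numbers[i]))
    return flagged


# The example from the question: 32 sitting between 12 and 16
leaps = find_leaps([10, 12, 32, 16, 18], tolerance=4)
print(leaps)
```

Unlike the "greater than both neighbors" test, this version also flags a number that is much smaller than both neighbors, and the tolerance keeps small local irregularities from being reported.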

isolated numbers on the same street (ex: 2-3 km away from the next point on the same street)

Look into accomplishing this with SelectLayerByLocation_management or Near_analysis or PointDistance_analysis.
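
As a rough illustration of the isolation test, here is a minimal sketch in plain Python. It assumes the points of one street are available as (number, x, y) tuples in a projected coordinate system with metre units. The function name find_isolated, the 2000 m threshold and the sample coordinates are made up. Inside ArcGIS, the tools named above would do the distance work instead.

```python
import math


def find_isolated(points, max_gap=2000.0):
    """Return the numbers whose nearest same-street point is farther than max_gap.

    points  -- list of (number, x, y) for one street, coordinates in metres (assumed)
    max_gap -- distance in metres beyond which a point counts as isolated (assumed)
    """
    isolated = []
    for i, (num, x, y) in enumerate(points):
        # Distances from this point to every other point on the same street
        others = [
            math.hypot(x - ox, y - oy)
            for j, (_, ox, oy) in enumerate(points)
            if j != i
        ]
        if others and min(others) > max_gap:
            isolated.append(num)
    return isolated


# Hypothetical street: three points 30 m apart and one about 2.5 km away
street_points = [(2, 0.0, 0.0), (4, 30.0, 0.0), (6, 60.0, 0.0), (8, 2600.0, 0.0)]
isolated_numbers = find_isolated(street_points)
print(isolated_numbers)
```

This compares every pair of points on a street, which is fine for a few hundred addresses per street but slow for very long streets.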

I hope this helps!

Emil Brundage · Accepted Answer · 2014-12-28 04:30:27Z

