Referring to this topic: Randomly subsetting % of polygons by class/attributes using ArcPy... I am trying to randomly subset a percentage of polygons based on an attribute that they all share. For example, I want to randomly select 90% of all polygons in a feature class that have a value of 90 in a given attribute. I also want to randomly select 72% of all polygons in that feature class with a value of 72 in the attribute, and so on.
I have used this code before to randomly subset based on a single percentage alone, but I don't personally have the Python chops to take it to the next level...
import random
import arcpy

def SelectRandomByPercent(layer, percent):
    # layer is the layer name in the TOC
    # percent is the percent as a whole number (0-100)
    if percent > 100:
        print("percent is greater than 100")
        return
    if percent < 0:
        print("percent is less than zero")
        return
    fc = arcpy.Describe(layer).catalogPath
    featureCount = float(arcpy.GetCount_management(fc).getOutput(0))
    count = int(featureCount * float(percent) / float(100))
    if not count:
        # Nothing to sample, so leave the layer with no selection
        arcpy.SelectLayerByAttribute_management(layer, "CLEAR_SELECTION")
        return
    oids = [oid for oid, in arcpy.da.SearchCursor(fc, "OID@")]
    oidFldName = arcpy.Describe(layer).OIDFieldName
    path = arcpy.Describe(layer).path
    delimOidFld = arcpy.AddFieldDelimiters(path, oidFldName)
    randOids = random.sample(oids, count)
    oidsStr = ", ".join(map(str, randOids))
    sql = "{0} IN ({1})".format(delimOidFld, oidsStr)
    # Apply the clause so the sampled features end up selected
    arcpy.SelectLayerByAttribute_management(layer, "NEW_SELECTION", sql)
So the idea is that each census block has a different rate of homeownership. Each of these parcel features has an attribute that names the homeownership rate for its census tract. I want to randomly select parcel features according to the various homeownership rates across the whole feature class.
Basically, I want the random percentage to be the same as the attribute value, and it should run for every value that occurs in "Overall Homeownership Rate".
1 Answer
I would try something like this:
from collections import Counter
import random, math
import arcpy

lyr = r'A_layer'
fieldname = 'category'
# List every OID and its category as tuples: [(1, 'category1'), (2, 'categoryX'), ...]
data = [row for row in arcpy.da.SearchCursor(lyr, ['OID@', fieldname])]
# Count each category. c maps each value found in fieldname to its feature count
c = Counter([f[1] for f in data])

def selectthem(category, percentage):
    # c['house'] will fetch the "house" count; 100.0 avoids integer division in Python 2
    n_to_select = int(math.ceil(c[category] * percentage / 100.0))
    oids = [f[0] for f in data if f[1] == category]
    sample = random.sample(oids, n_to_select)  # Randomly pick the requested share of the OIDs
    # Join the OIDs by hand: a one-element tuple would format as "(5,)", which is not valid SQL
    oid_field = arcpy.AddFieldDelimiters(lyr, arcpy.Describe(lyr).OIDFieldName)
    sql = "{0} IN ({1})".format(oid_field, ", ".join(map(str, sample)))
    return sql
#Lets say I want to select 10 % of a category called "house"
selectthem("house", 10)
This will return a SQL clause you can use in Select Layer By Attribute, Make Feature Layer, etc.
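One detail worth checking when the sampled OIDs are turned into SQL: formatting a Python tuple directly yields `(5,)` for a one-element sample and `()` for an empty one, and neither is valid SQL. Below is a minimal sketch of a helper that joins the OIDs explicitly. The name `build_oid_where_clause` and the `OBJECTID` field name are hypothetical, and returning `None` for an empty sample is an assumption about how the caller wants to handle that case.

```python
def build_oid_where_clause(delimited_oid_field, oids):
    """Return 'FIELD IN (1, 2, 3)', or None when there is nothing to select."""
    oids = list(oids)
    if not oids:
        # Assumption: the caller clears the selection instead of running a query
        return None
    # Join the values by hand so a single OID becomes "(5)" rather than "(5,)"
    return "{0} IN ({1})".format(delimited_oid_field, ", ".join(str(o) for o in oids))

print(build_oid_where_clause("OBJECTID", [5]))         # OBJECTID IN (5)
print(build_oid_where_clause("OBJECTID", [3, 8, 13]))  # OBJECTID IN (3, 8, 13)
```

In ArcPy the first argument would typically be the result of `arcpy.AddFieldDelimiters` on the layer's OID field, as in the code above.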
- Hey Bera, I think this is nearly there. One difference though: I need to use the value in the field of interest as the percentage. For example, my adaptation of your code above, to select B25003_calc_pctOwnE% of a category called "B25003_calc_pctOwnE": n_to_select = math.ceil(c['B25003_calc_pctOwnE']*(B25003_calc_pctOwnE/100)) #c['B25003_calc_pctOwnE'] will fetch the "B25003_calc_pctOwnE" count... – Maxmarie Wilmoth, Oct 11, 2021 at 18:18
- Does this script allow me to enter a field name as the percentage, or integers only? – Maxmarie Wilmoth, Oct 12, 2021 at 14:04
- Only numbers. I still don't understand what you want to do. Can you edit your question and add an example with a screenshot of the attribute table? – Bera, Oct 12, 2021 at 16:31
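If the goal is for each feature's own attribute value to act as the sampling percentage (90% of the features whose rate is 90, 72% of those whose rate is 72, and so on), one possible approach is to group the OIDs by that value and sample each group separately. The sketch below is plain Python, so it can be checked without ArcGIS. The function name `sample_oids_by_own_rate`, the optional `seed` argument, the skipping of null or out-of-range values, and rounding up with `math.ceil` (as in the answer) are all assumptions, not something confirmed in the thread.

```python
import math
import random
from collections import defaultdict

def sample_oids_by_own_rate(rows, seed=None):
    """rows: iterable of (oid, rate) pairs, rate being a percentage from 0 to 100.

    Returns a list of OIDs in which each distinct rate value contributes
    roughly `rate` percent of the features that carry it.
    """
    rng = random.Random(seed)  # the seed only matters if a repeatable draw is wanted
    groups = defaultdict(list)
    for oid, rate in rows:
        # Assumption: nulls and values outside 0-100 are left out entirely
        if rate is None or rate < 0 or rate > 100:
            continue
        groups[rate].append(oid)

    selected = []
    for rate, oids in groups.items():
        # Round up, then cap at the group size as a safety net
        n = min(int(math.ceil(len(oids) * rate / 100.0)), len(oids))
        selected.extend(rng.sample(oids, n))
    return selected

# Made-up rows standing in for the output of
# arcpy.da.SearchCursor(layer, ["OID@", "B25003_calc_pctOwnE"])
rows = [(i, 90) for i in range(100)] + [(100 + i, 72) for i in range(50)]
picked = sample_oids_by_own_rate(rows, seed=1)
print(len(picked))  # 126: 90 of the first group plus 36 of the second
```

The resulting list could then be joined into an `IN (...)` clause on the delimited OID field and passed to `arcpy.SelectLayerByAttribute_management`. If the rate field holds floats with many distinct values, the groups may become very small, so rounding the rate before grouping might be preferable.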