I have a field titled GEOGRAPHY, which contains hundreds of records holding strings that all share a similar format, for example:
"Witless Bay (1001559) T 00000 ( 6.5%)"
"Laniel (2485905) NO 00909 ( 6.7%)"
"Contrecoeur (2459035) V 00000 ( 4.9%)"
However, to join this table with my boundary layer, I need the records in this field to contain only the 7-digit number in the parentheses, i.e. 1001559 for the Witless Bay record.
I cannot figure out how to do this in the Field Calculator using regex. I have looked at ESRI's examples, but they only show substitutions. Using LEFT() or RIGHT() doesn't work either, as the string lengths vary.
I have tried something to the effect of the following, without success; the resulting error is shown further down:
Expression Type: Python 3
Expression:
GEOGRAPHY = update_name(!GEOGRAPHY!)
Code Block:
import re
def update_name(geo_name):
    return re.search(r"""\((.*?)\)""", geo_name)
I do not understand why this wouldn't work. The expression \((.*?)\) is valid. It gives me the following error:
The field is not nullable. [GEOGRAPHY]
Failed to execute (CalculateField).
I understand that it gives that error because it can't return a NULL value, but should this not be returning the 7 digit number?
1 Answer
Open up a Python console and run the following line:
re.search(r"""\((.*?)\)""", "Witless Bay (1001559) T 00000 ( 6.5%)")
You will see it returns a match object (<_sre.SRE_Match object at 0x15AB2E60>; the exact representation varies by Python version), as @mikewatt commented.
There is no way ArcGIS can insert this object into a field. You have to use the match object's group method to get a string out of it.
Following the preceding example, the match exposes two groups: group 0 is the entire match, parentheses included, and group 1 is the text captured inside them. Group 1 is (I assume) the one you are interested in:
match = re.search(r"""\((.*?)\)""", "Witless Bay (1001559) T 00000 ( 6.5%)")
print(match.group(0)) # '(1001559)'
print(match.group(1)) # '1001559'
print(match.group(2)) # IndexError: no such group
Furthermore, I would add some logic to handle the cases where re.search does not find anything and returns None.
For example, re.search(r"""\((.*?)\)""", "Witless Bay") returns None, and re.search(r"""\((.*?)\)""", "Witless Bay").group(0) raises the following error:
AttributeError: 'NoneType' object has no attribute 'group'
Your code block could look something like this:
import re
def update_name(geo_name):
    match = re.search(r"""\((.*?)\)""", geo_name)
    if match:
        return match.group(1)
    else:
        return '-99999' # an arbitrary value
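Note that the pattern \((.*?)\) captures whatever sits in the first pair of parentheses, so a record whose first parenthesized text is not the ID (for instance the percentage) would return the wrong value. A stricter variant could be sketched as follows, assuming the ID is always exactly seven digits as in the question's samples. The names extract_id and ID_PATTERN and the fallback parameter are hypothetical, chosen only for illustration, and the None check is an added precaution rather than something the question requires:

```python
import re

# Assumption: the ID is always exactly seven digits wrapped in parentheses.
ID_PATTERN = re.compile(r"\((\d{7})\)")

def extract_id(geo_name, fallback="-99999"):
    """Return the 7-digit ID found in geo_name, or fallback if there is none."""
    if geo_name is None:
        # Guard against empty (NULL) field values, which cannot be searched.
        return fallback
    match = ID_PATTERN.search(geo_name)
    # group(1) is the captured digits without the surrounding parentheses.
    return match.group(1) if match else fallback

# Quick check against the sample records from the question.
samples = [
    "Witless Bay (1001559) T 00000 ( 6.5%)",
    "Laniel (2485905) NO 00909 ( 6.7%)",
    "Contrecoeur (2459035) V 00000 ( 4.9%)",
]
for sample in samples:
    print(extract_id(sample))
```

In the Field Calculator the function would go in the Code Block and be called the same way as update_name, e.g. extract_id(!GEOGRAPHY!). The fallback is still an arbitrary value; pick one that cannot collide with a real ID in the boundary layer.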