Query where one field includes the text string in another field

Question 1

I am trying to query for mailing address records that match physical address records, but the two fields sometimes store the data in different formats. For example, a property's mailing address may be shown as '123 NW Elm Street', while its physical address is shown as '123 Elm St'. Even a query that only tells me whether the address contains the same street name would help, for instance whether 'Elm' appears anywhere in the address. There is a field that holds just the street name, with no number, prefix, or suffix.

I am trying to write a definition query that uses wildcards to select nearly matching records, something like this:

"StreetAddress" LIKE "%StreetName%"

I have tried numerous variations with no luck, including '%'+"StreetName"+'%'.

Is there a "FIELD_A" INCLUDES "FIELD_B" query or a "FIELD_A" CONTAINS "FIELD_B" command?

Question 2

Welcome to GIS SE! As a new user, be sure to take the Tour to learn about our focused Q&A format. What GIS software are you using?

Question 3

Please also edit the question to specify the data source format, since the functions available are defined by the software in use and the capabilities of the data store.

Question 4

You need a nested query.

StreetAddress IN (SELECT StreetName FROM <whatever your layer is named>)

For example: StreetAddress IN (SELECT StreetName FROM House_Addresses), where House_Addresses is the layer name.
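For comparison, here is a small sketch of how the IN subquery behaves next to a row-wise wildcard test. It uses Python's built-in sqlite3 module as a stand-in data store, and the sample rows are invented for illustration. IN compares whole values, so it only matches an address that is exactly equal to some street name. The second query concatenates wildcards around each row's own StreetName with the || operator. Whether a definition query accepts || (or an equivalent) depends on the data source, so treat that part as an assumption to check against your software.

```python
import sqlite3

# In-memory database standing in for the layer's table (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE House_Addresses (StreetAddress TEXT, StreetName TEXT)")
conn.executemany(
    "INSERT INTO House_Addresses VALUES (?, ?)",
    [
        ("123 NW Elm Street", "Elm"),
        ("45 Oak Ave", "Maple"),
        ("Elm", "Elm"),
    ],
)

# IN compares whole values: a row matches only if its StreetAddress
# is exactly equal to some StreetName in the table.
exact = conn.execute(
    "SELECT StreetAddress FROM House_Addresses "
    "WHERE StreetAddress IN (SELECT StreetName FROM House_Addresses) "
    "ORDER BY StreetAddress"
).fetchall()

# LIKE with concatenated wildcards: a row matches if its own StreetName
# occurs anywhere inside its StreetAddress.
contains = conn.execute(
    "SELECT StreetAddress FROM House_Addresses "
    "WHERE StreetAddress LIKE '%' || StreetName || '%' "
    "ORDER BY StreetAddress"
).fetchall()

print(exact)
print(contains)
```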

Also, if you have not heard of fuzzy tables/relationships, you should look into them. They may provide a solution.

Here's the tool for Excel: https://www.microsoft.com/en-au/download/details.aspx?id=15011

Python: https://pypi.python.org/pypi/fuzzywuzzy

Video: https://www.youtube.com/watch?v=3v-qxcjZbyo

Question 5

Assuming you're using ArcGIS, you can make the comparison in the Field Calculator (Python parser) to populate a flag field, and then select on that field:

1 if !StreetName! in !StreetAddress! else 0
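A slightly more defensive version of the same substring test could go in the Field Calculator's code block. This is only a minimal sketch: the function name contains_street is hypothetical, and it assumes that null field values arrive as None and that letter case should be ignored.

```python
def contains_street(street_name, street_address):
    """Return 1 if the street name occurs anywhere in the address, else 0.

    Hypothetical helper; assumes null field values are passed in as None.
    """
    if street_name is None or street_address is None:
        return 0
    name = street_name.strip().lower()
    if not name:
        # An empty name is a substring of every string, so treat it as no match.
        return 0
    return 1 if name in street_address.lower() else 0
```

With the Python parser, the expression would then be contains_street(!StreetName!, !StreetAddress!). Note that a plain substring test also matches 'Elm' inside 'Elmwood', so the flagged rows may still need a manual check.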

Question 6

In addition to what @phloem suggests, you can use Python's difflib module, with either get_close_matches or SequenceMatcher.ratio. The former returns the best-matching entries from an iterable of candidates (say, a whole column), while the latter scores a single pair of strings (say, the values of two columns in one row). For example:

difflib.get_close_matches('123 NW Elm Street', ['123 Elm St'], 1, 0.7)[0] returns '123 Elm St', whereas

int(difflib.SequenceMatcher(None, '123 NW Elm Street', '123 Elm St').ratio()*100) compares the source with the target and yields a matching score of 74.
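Applied row by row, the two difflib calls might look like the sketch below. The helper name similarity_score, the sample rows, and the threshold of 70 are all assumptions chosen for illustration; a suitable cutoff would need tuning against real data.

```python
import difflib

def similarity_score(a, b):
    """Return a 0-100 similarity score for two strings (hypothetical helper)."""
    return int(difflib.SequenceMatcher(None, a, b).ratio() * 100)

# Invented sample rows: (mailing address, physical address).
rows = [
    ("123 NW Elm Street", "123 Elm St"),
    ("77 Oak Ave", "900 Pine Rd"),
]

THRESHOLD = 70  # assumed cutoff, not a recommended value

# Pairwise comparison: flag each row whose two addresses are similar enough.
flags = [similarity_score(mailing, physical) >= THRESHOLD
         for mailing, physical in rows]

# Column-wide comparison: best candidate for one mailing address
# among all physical addresses, using the same 0.7 cutoff as above.
physical_column = [physical for _, physical in rows]
best = difflib.get_close_matches("123 NW Elm Street", physical_column, n=1, cutoff=0.7)

print(flags)
print(best)
```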

Accepted Answer (the nested-query answer above) · jbalk · 2016-12-15 00:01:18Z

