Replacing column of comma separated strings of IDs with values stored in another field based on those IDs using QGIS

Question 1

I have a layer with three columns:

geoid: ID of each polygon (string)
totpop: Total population of each polygon (integer)
neigh6: Comma separated list of the IDs of polygons that neighbor that polygon (string)

So for example, instead of a column with these geoids: 49047940201, 49013940600, 49013940500

I want a column with population for each of those GEOIDs listed: 5039, 6893, 1094

I want to create a new field that has the population of the tract instead of the ID of that tract, eventually to sum up and identify the average population of the neighboring tracts.

I was trying to figure it out with the array_foreach() function, but got stuck. The expression below doesn't work (it results in all null values), which I half expected, since it doesn't account for the specific GEOID.

array_foreach(
 string_to_array("neigh6", ','),
 @element="TOTPOP")

Answer 1

For this I would recommend using a Virtual Layer (Database --> DB Manager --> Virtual Layers --> Project layers). There you can write a SQL statement that joins your data on itself and uses aggregate functions (QGIS supports the SQLite SQL syntax in virtual layers). The ON part of the LEFT JOIN uses a LIKE between the neigh6 of t1 and the GEOID of t2. A comma is added before and after each value so that an ID only matches a complete list entry, never a part of a longer ID.
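To see why the commas are wrapped around both sides, here is a minimal Python sketch of the same substring test outside QGIS. The IDs and the two helper functions are made up for illustration (they are not from the question); the only point is that a bare substring match can hit a longer ID that merely contains the shorter one.

```python
def bare_match(neigh: str, geoid: str) -> bool:
    # Plain substring test, comparable to: neigh6 LIKE '%' || GEOID || '%'
    return geoid in neigh


def delimited_match(neigh: str, geoid: str) -> bool:
    # Wrap both sides in commas, comparable to:
    # ',' || neigh6 || ',' LIKE '%,' || GEOID || ',%'
    return f",{geoid}," in f",{neigh},"


# Hypothetical IDs: "101" is a prefix of "1012" but is not itself in the list.
neigh6 = "1012,205,330"

false_positive = bare_match(neigh6, "101")         # substring of "1012"
exact_only = delimited_match(neigh6, "101")        # no complete entry "101"
real_hit = delimited_match(neigh6, "205")          # entry in the middle
first_item_hit = delimited_match(neigh6, "1012")   # entry at the start
```

The leading and trailing comma added to the list are what let the first and last entries match the same pattern as the ones in the middle.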

The query would look like this (a small modification from here). You will need to replace sampleDataAgg with the name of your current layer:

SELECT 
 t1.*,
 GROUP_CONCAT(t2.TOTPOP, ',') as "GROUPED NEIGH6 TOTPOP",
 AVG(t2.TOTPOP) as "AVERAGE NEIGH6 TOTPOP"
FROM 
 sampleDataAgg as t1 LEFT JOIN sampleDataAgg as t2 
ON ',' || t1.neigh6 || ',' LIKE '%,' || t2.GEOID || ',%'
GROUP BY t1.GEOID;

Then you can load the virtual layer using Load as new layer. If you want to keep working with the data later, it is recommended to export the virtual layer.

EDIT: After looking at your sample a second time, I think you will need to slightly modify the ON part of the LEFT JOIN, like this:

ON ', ' || t1.neigh6 || ',' LIKE '%, ' || t2.GEOID || ',%'

So it also takes into account that there is an extra space after the comma.
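As a rough check outside QGIS, the join can be tried against an in-memory SQLite database from Python. This is only a sketch under assumptions: the three neighbor rows use the IDs and populations quoted in the question, while the tract 49000000001, its population, and the empty neigh6 values are hypothetical. A WHERE clause is added to look at that one tract only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sampleDataAgg (GEOID TEXT, TOTPOP INTEGER, neigh6 TEXT)"
)
rows = [
    # IDs and populations as quoted in the question; neigh6 left empty here.
    ("49047940201", 5039, ""),
    ("49013940600", 6893, ""),
    ("49013940500", 1094, ""),
    # Hypothetical tract whose neighbor list uses ', ' as the separator.
    ("49000000001", 100, "49047940201, 49013940600, 49013940500"),
]
conn.executemany("INSERT INTO sampleDataAgg VALUES (?, ?, ?)", rows)

QUERY = """
SELECT t1.GEOID,
       GROUP_CONCAT(t2.TOTPOP, ',') AS grouped,
       AVG(t2.TOTPOP) AS average
FROM sampleDataAgg AS t1
LEFT JOIN sampleDataAgg AS t2
  ON {on_clause}
WHERE t1.GEOID = '49000000001'
GROUP BY t1.GEOID
"""

# The two ON clauses from the answer: comma only, and comma plus space.
comma_only = "',' || t1.neigh6 || ',' LIKE '%,' || t2.GEOID || ',%'"
comma_space = "', ' || t1.neigh6 || ',' LIKE '%, ' || t2.GEOID || ',%'"

result_comma_only = conn.execute(QUERY.format(on_clause=comma_only)).fetchone()
result_comma_space = conn.execute(QUERY.format(on_clause=comma_space)).fetchone()
```

With this sample data the comma-only pattern finds just the first ID in the list, because every later ID is preceded by a space. The comma-and-space pattern finds all three neighbors. The order of values inside GROUP_CONCAT is not guaranteed, so it should not be relied on.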

Answer 2

Your expression is a good start, but it needs a few more steps. The explanation is given as comments in the expression:

array_to_string(                      -- turn the result into a string
  array_foreach(                      -- do the following for each GEOID in the neigh6 field
    string_to_array("neigh6", ','),   -- step 1: turn the string of IDs into an array of IDs
    attribute(                        -- step 4: see below
      get_feature(@layer, 'GEOID',    -- step 3: get the feature with this GEOID
        to_int(@element)              -- step 2: turn each ID of the array into an integer. DO NOT DO THIS IF YOUR IDS ARE STRINGS! If so, simply use @element here.
      ),
      'TOTPOP'                        -- step 4: read the attribute "TOTPOP" from the feature
    )
  )
)

Be aware of the data types! If you use array_to_string() or string_to_array(), all values will always be of type string. If they are integers, doubles or other types, you need to cast them to that type first.

Just for completeness: your GEOIDs must of course be unique, otherwise the expression will pick the first result.
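The same four steps can be sketched in plain Python (not PyQGIS) to make the data flow easier to follow. The dictionary stands in for the get_feature() lookup and uses the values quoted in the question; the function name and the decision to skip IDs that are not found are assumptions made for this illustration.

```python
from statistics import mean
from typing import List

# GEOID -> TOTPOP lookup. Dictionary keys are unique, which mirrors the
# requirement that GEOIDs be unique. Keys stay strings, as in the question.
totpop_by_geoid = {
    "49047940201": 5039,
    "49013940600": 6893,
    "49013940500": 1094,
}


def neighbor_populations(neigh6: str) -> List[int]:
    # Step 1: split the string into IDs; strip() drops the space after each comma.
    ids = [part.strip() for part in neigh6.split(",") if part.strip()]
    # Steps 2 to 4: look up each ID and read its population.
    # Assumption of this sketch: IDs without a matching tract are skipped.
    return [totpop_by_geoid[i] for i in ids if i in totpop_by_geoid]


pops = neighbor_populations("49047940201, 49013940600, 49013940500")

# Joining to text turns every value into a string, as array_to_string() does.
as_text = ",".join(str(p) for p in pops)

# The average has to be computed on the numbers, not on the joined string.
average = mean(pops)
```

Keeping the numeric list around until after the average is computed avoids having to cast the values back from strings.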

Accepted answer (the Virtual Layer approach) by Bernd Loigge · 2022-07-23 21:46:47Z
