I have a layer with three columns:

  1. geoid: ID of each polygon (string)
  2. totpop: Total population of each polygon (integer)
  3. neigh6: Comma-separated list of the IDs of the polygons that neighbor that polygon (string)

[Screenshot of the attribute table]

So for example, instead of a column with these geoids: 49047940201, 49013940600, 49013940500

I want a column with population for each of those GEOIDs listed: 5039, 6893, 1094

I want to create a new field that has the population of the tract instead of the ID of that tract, eventually to sum up and identify the average population of the neighboring tracts.

I was trying to figure it out with the array_foreach() function, but got stuck. The expression below doesn't work (it results in all null values), but I wasn't expecting it to, since it doesn't account for the specific GEOID.

array_foreach(
 string_to_array("neigh6", ','),
 @element="TOTPOP")

[Screenshot of the attempt]

PolyGeo
asked Jul 23, 2022 at 14:22

2 Answers

For this I would recommend using a Virtual Layer (Database --> DB Manager --> Virtual Layers --> Project layers). You can then write a SQL statement that joins the layer to itself and applies aggregate functions (QGIS supports the SQLite SQL syntax in virtual layers). The ON clause of the LEFT JOIN uses LIKE to match the neigh6 value in t1 against the GEOID in t2. A comma is added before and after each value so that every ID produces a unique match.

The query would look like this (a small modification from here). You will need to replace sampleDataAgg with the name of your current layer:

SELECT 
 t1.*,
 GROUP_CONCAT(t2.TOTPOP, ',') as "GROUPED NEIGH6 TOTPOP",
 AVG(t2.TOTPOP) as "AVERAGE NEIGH6 TOTPOP"
FROM 
 sampleDataAgg as t1 LEFT JOIN sampleDataAgg as t2 
ON ',' || t1.neigh6 || ',' LIKE '%,' || t2.GEOID || ',%'
GROUP BY t1.GEOID;

Then you can load the result with Load as new layer. If you want to keep working with the data later, it is recommended to export the virtual layer.

EDIT: After looking at your sample a second time, I think you will need to slightly modify the ON clause of the LEFT JOIN, like this:

ON ', ' || t1.neigh6 || ',' LIKE '%, ' || t2.GEOID || ',%'

That way it also takes into account that there is an extra space after each comma.
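
To try the join logic outside QGIS, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a plain table named tracts (a made-up name) in place of the layer, and a neigh6 list without spaces after the commas. The three neighbor IDs and populations are the ones from the question; the tract 49047940100 and the remaining neigh6 values are invented for illustration.

```python
import sqlite3

# Hypothetical sample rows: (GEOID, TOTPOP, neigh6).
# Tract "49047940100" and every neigh6 list are made up; only the three
# ID/population pairs 5039, 6893 and 1094 come from the question.
rows = [
    ("49047940100", 2000, "49047940201,49013940600,49013940500"),
    ("49047940201", 5039, "49047940100"),
    ("49013940600", 6893, "49047940100,49013940500"),
    ("49013940500", 1094, "49047940100,49013940600"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracts (GEOID TEXT, TOTPOP INTEGER, neigh6 TEXT)")
conn.executemany("INSERT INTO tracts VALUES (?, ?, ?)", rows)

# Same self-join as in the answer: wrapping both sides in commas means an ID
# only matches a complete list element, never a fragment of a longer ID.
query = """
SELECT t1.GEOID,
       GROUP_CONCAT(t2.TOTPOP, ',') AS neigh_pops,
       AVG(t2.TOTPOP) AS neigh_avg
FROM tracts AS t1
LEFT JOIN tracts AS t2
  ON ',' || t1.neigh6 || ',' LIKE '%,' || t2.GEOID || ',%'
GROUP BY t1.GEOID
"""
result = {geoid: (pops, avg) for geoid, pops, avg in conn.execute(query)}
conn.close()

for geoid, (pops, avg) in result.items():
    print(geoid, pops, avg)
```

The order of the values inside GROUP_CONCAT is not guaranteed by SQLite, so the population list should not be assumed to line up position by position with the IDs in neigh6.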

answered Jul 23, 2022 at 21:46

Your expression is a good start, but it needs a few more steps. The explanation is in the comments of the expression:

array_to_string( -- turn the result into a string
  array_foreach( -- do the steps below for each GEOID in the neigh6 field
    string_to_array("neigh6",','), -- step 1: turn the string of ids into an array of ids
    attribute( -- step 4; see below
      get_feature(@layer,'GEOID', -- step 3: get the feature with this GEOID
        to_int(@element) -- step 2: turn each id of the array into an integer. DO NOT DO THIS IF YOUR IDS ARE STRINGS! If so, simply use @element here.
      )
    ,'TOTPOP') -- step 4: read the attribute "TOTPOP" from the feature
  )
)

Be aware of the datatypes! If you use array_to_string() or string_to_array() all values will always be of type string! If they are integers, doubles or other types, you need to cast them to that type first!

Just for completeness: your GEOIDs must of course be unique, otherwise the expression will pick the first matching feature.
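
To make the order of the steps and the datatype issue easier to follow, here is a rough plain-Python analogue of the expression. It is only a sketch and not QGIS code: features, pop_by_geoid and neighbour_summary are hypothetical names, the tract 49047940100 is invented, and the GEOIDs are assumed to be strings, as stated in the question.

```python
from statistics import mean

# Hypothetical attribute table as a list of dicts. Tract "49047940100" is made
# up; the three neighbour IDs and populations come from the question.
features = [
    {"GEOID": "49047940100", "TOTPOP": 2000,
     "neigh6": "49047940201, 49013940600, 49013940500"},
    {"GEOID": "49047940201", "TOTPOP": 5039, "neigh6": "49047940100"},
    {"GEOID": "49013940600", "TOTPOP": 6893, "neigh6": "49047940100"},
    {"GEOID": "49013940500", "TOTPOP": 1094, "neigh6": "49047940100"},
]

# Lookup by GEOID. The keys are strings, so the split IDs must stay strings.
# Note: with duplicate GEOIDs a dict keeps the last entry, not the first.
pop_by_geoid = {f["GEOID"]: f["TOTPOP"] for f in features}


def neighbour_summary(neigh6):
    """Return (comma-joined neighbour populations, their average)."""
    # Step 1: split the string into IDs and strip the space after each comma.
    ids = [part.strip() for part in neigh6.split(",") if part.strip()]
    # Steps 2-4: look up each ID and read its population; skip unknown IDs.
    pops = [pop_by_geoid[i] for i in ids if i in pop_by_geoid]
    # Joining needs strings, averaging needs numbers: cast only for the join.
    joined = ",".join(str(p) for p in pops)
    average = mean(pops) if pops else None
    return joined, average


joined, average = neighbour_summary(features[0]["neigh6"])
print(joined, average)
```

Keeping the populations as numbers until the final join is what allows the average to be computed directly; once they have been turned into a string they would have to be cast back first.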

answered Jul 24, 2022 at 13:07