2

Base: csv file with content structured like this:

"Role-ID","Role name","userid","name","name2","name3","email"

Content example:

"Role-ID","Role name","userid","name","name2","name3","email"
"312506","public-readonly","098750","doe_j","John","Doe","[email protected]"
"312507","public-readwrite","022968","smith_m","Michael","Smith","[email protected]"
"312509","private-sysadmin","086847","bloggs_j","Joe","Bloggs","[email protected]"
"312508","private-readwrite","086847",bloggs_j","Joe","Bloggs","[email protected]"
"312506","public-readonly","013658","schmoe_h","Hannah","Schmoe","[email protected]"
"312511","private-analytics","086847",bloggs_j","Joe","Bloggs","[email protected]"

The list contains multiple roles for every user. I like to create a new table with with unique user mails in the first column and the number of user roles for each mail address written in a second column. All of this should be done in the model builder since I want to create a repeatable process.

I imported the CSV file into QGIS (all string values) and achieved calculating the sum of roles with the field calculator using count(), but I struggle with getting rid of the duplicates. When I use Delete duplicates by attribute and use the value "email", the QGIS model just tells me the following:

Field "email" not found in INPUT layer, skipping Error encountered while running Delete duplicates by attribute: No input fields found

Alternatively, I tried the Aggregate tool, but I also couldn't get through the error code I received:

Parser error in expression "count(, "email")": syntax error, unexpected COMMA

I tried to delete the "," in the delimiter of the Dissolve tool, but that had no effect.

What am I missing? I guess it has something to do with my input.

J.R
18.1k2 gold badges25 silver badges57 bronze badges
asked Jan 17 at 14:13
1
  • 1
    This seem better done using a virtual layer with SQL query (those are dynamic and will auto-update ), or by using the "Execute SQL" processing tool in your model Commented Jan 17 at 14:39

2 Answers 2

2

You can use function Aggregate for this. Use field email as Group by expression. Delete the fields that you don't want to retain and/or add additional fields.

To count, use the expression count (@id, group_by:=email) as Source expression and give the newly created field a name (here: Email_count).

For Role name, use concatenate as for Aggregate function, define the desired delimiter (here /) and name the field (here: Role_list).

Be sure to define field types and length accordingly. You can see the settings, the input and the result in the screenshot:

enter image description here

J.R
18.1k2 gold badges25 silver badges57 bronze badges
answered Jan 20 at 11:39
1

By using the following SQL query (either in a virtual layer query or with the Execute SQL processing tool):

SELECT
 "email" ,
 Count("Role name") AS "Email count",
 group_concat("Role name", ' / ') AS "Role list"
FROM
 "Exemple data"
GROUP BY
 "email"
ORDER BY
 "email"

You should get what you want (I added the group_concat("Role name", ' / ') AS "Role list" part to also get a slash separated list of role for each email in a "Role list column).

With your exemple data the result are :

enter image description here

answered Jan 20 at 9:55

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.