I have an attribute table that shows how many buildings were built in each decade. The columns hold the percentages for each 10-year step (1939, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010) for every polygon.
This is what it looks like:
Now I need to find out which of the columns holds the highest percentage for each polygon. In the end, I'm supposed to have a choropleth map that shows the main decade each polygon was built in. Does that make any sense?
I've already tried working with the styles, but the categorized settings only let you choose one column and I need nine. Any tips?
Another method is to add a new column, "Year Constructed," using a conditional expression like this:
Case
when "P1939" = max("P1939", "P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010") then 1939
when "P1940" = max("P1939", "P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010") then 1940
when "P1950" = max("P1939", "P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010") then 1950
when "P1960" = max("P1939", "P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010") then 1960
when "P1970" = max("P1939", "P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010") then 1970
when "P1980" = max("P1939", "P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010") then 1980
when "P1990" = max("P1939", "P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010") then 1990
when "P2000" = max("P1939", "P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010") then 2000
when "P2010" = max("P1939", "P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010") then 2010
else 0
end
Apply a categorized style using the new "Year Constructed" column.
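Typing nine nearly identical "when" lines by hand invites copy-and-paste slips. As a rough sketch, the expression text could be generated with a few lines of plain Python and then pasted into the field calculator. The field names are the ones used above; the helper name build_case_expression is made up for this illustration and is not part of QGIS.

```python
# Field names as used in the answer above; adjust them to match your layer.
FIELDS = ["P1939", "P1940", "P1950", "P1960", "P1970",
          "P1980", "P1990", "P2000", "P2010"]


def build_case_expression(fields):
    """Return the text of a CASE expression that yields the year of the largest column.

    Hypothetical helper: it only builds a string, it does not talk to QGIS.
    """
    quoted = ", ".join(f'"{name}"' for name in fields)
    lines = ["CASE"]
    for name in fields:
        year = int(name[1:])  # "P1950" -> 1950
        lines.append(f'when "{name}" = max({quoted}) then {year}')
    lines.append("else 0")
    lines.append("END")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_case_expression(FIELDS))
```

Because the "when" branches are checked in order, a polygon whose maximum is shared by two decades gets the earlier one.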
Here's an alternative method using rule-based styling.
1. Set up a graduated style with 9 classes and your desired color ramp. Choose the color ramp carefully, because you won't be able to change it later. It doesn't matter what field or expression you use for "column", because you're going to change that in step 3. The entire point of this step is to establish the color gradient.
2. Change the style from "graduated" to "rule-based." The 9 graduated classes will be automatically converted to rules.
3. Change the label of the first rule to "Constructed in 1939", and change the filter expression to this:
"P1939" = max("P1939", "P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010")
4. Repeat for all the rules, changing "P1939" to "P1940," "P1950," and so on.
Note: There's no way to apply a color ramp to an existing set of rules. If you want to change the symbol colors later, you have to change the color for each rule, one at a time.
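One thing to watch with one filter per column: a polygon whose highest percentage is shared by two columns satisfies more than one filter. The sketch below is plain Python, not QGIS code, and only illustrates that logic. The example rows are invented, and columns_at_max is a hypothetical helper name.

```python
FIELDS = ["P1939", "P1940", "P1950", "P1960", "P1970",
          "P1980", "P1990", "P2000", "P2010"]


def columns_at_max(row, fields=FIELDS):
    """Return every field whose value equals the row maximum.

    More than one name in the result means the row would match
    more than one of the per-column filter expressions.
    """
    top = max(row[name] for name in fields)
    return [name for name in fields if row[name] == top]


# Made-up example rows, not data from the question.
clear_winner = dict(zip(FIELDS, [0, 10, 20, 40, 10, 10, 5, 5, 0]))
tied = dict(zip(FIELDS, [0, 30, 30, 10, 10, 10, 5, 5, 0]))

print(columns_at_max(clear_winner))  # ['P1960']
print(columns_at_max(tied))          # ['P1940', 'P1950']
```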
Thanks a lot! This was an easy and quick solution to my problem! – Darleen, Jul 15, 2019 at 7:49
This is possible, but the expression needed to compute it is a bit convoluted. I will explain below.
Your task is to find the column name which has the highest value for every feature. If the highest value is in the P1940 column, your answer is 1940; if the highest value is in P1980, your answer is 1980; and so on.
You can do this using array functions in QGIS. We are using the array_sort() function here, which is available in QGIS 3.6 onwards only.
Here's the algorithm:
1. Create an array of all the values from the relevant columns.
array("P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010")
This will get us an array like [0, 10, 20, 40, 10, 0, 0, 0]
2. Sort the array and find the last value. This will be the highest value.
array_last(array_sort(array("P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010")))
This will be the value 40.
3. Look up the index of this highest value in the original array.
array_find(array("P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010"), array_last(array_sort(array("P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010"))))
This will be 3. Index counting starts from 0 and the highest value is in the 4th place, so we get 3.
4. Use that index to look up the year in a list of years in the same order as the original array. Both arrays must have the same length and order, otherwise the years are shifted. If your table also has a P1939 column, add "P1939" to the field array and 1939 to the year array.
array_get(array(1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010), array_find(array("P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010"), array_last(array_sort(array("P1940", "P1950", "P1960", "P1970", "P1980", "P1990", "P2000", "P2010")))))
This will be the value 1970.
The expression looks complex because the same sub-expressions are repeated several times instead of being stored in variables.
In your case, just create a new virtual field with the expression in Step 4 and you should see the correct values for each row that you can use for styling.
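For readers who find the nested expression hard to follow, here is the same sort, find and look-up logic as a small Python sketch. It is only an illustration of the algorithm, not something to paste into QGIS. It assumes one value per entry in the year list, and main_decade is a made-up function name.

```python
YEARS = [1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010]


def main_decade(values, years=YEARS):
    """Return the year whose column holds the highest value."""
    if len(values) != len(years):
        # Mismatched lists would silently shift the years.
        raise ValueError("values and years must have the same length")
    highest = sorted(values)[-1]   # like array_last(array_sort(...))
    index = values.index(highest)  # like array_find(...): first matching position
    return years[index]            # like array_get(...)


print(main_decade([0, 10, 20, 40, 10, 0, 0, 0]))  # 1970
```

The length check is there because a value list that is shorter than the year list still returns a result, just the wrong year for some rows.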
Hi, you could use a tool like Miller (https://github.com/johnkerl/miller). Starting from this example input file
p1940,p1950,p1960,p1970
436,490,446,195
526,320,963,780
220,888,705,831
and running
mlr --csv merge-fields -a max -r "^[a-z]" -o value -k then put '
for (key, value in $*) {
if (value == $value_max && key != "value_max") {
$fieldName=key;
$valueField=gsub(key,"p","")
}
}
' input.csv
The output contains, for each row, the maximum value (value_max), the name of the field that holds it (fieldName), and the year taken from that field name (valueField):
p1940,p1950,p1960,p1970,value_max,fieldName,valueField
436,490,446,195,490,p1950,1950
526,320,963,780,963,p1960,1960
220,888,705,831,888,p1950,1950
This is the pretty-printed version:
+-------+-------+-------+-------+-----------+-----------+------------+
| p1940 | p1950 | p1960 | p1970 | value_max | fieldName | valueField |
+-------+-------+-------+-------+-----------+-----------+------------+
| 436 | 490 | 446 | 195 | 490 | p1950 | 1950 |
| 526 | 320 | 963 | 780 | 963 | p1960 | 1960 |
| 220 | 888 | 705 | 831 | 888 | p1950 | 1950 |
+-------+-------+-------+-------+-----------+-----------+------------+
Could it be useful to your goal?
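If Miller is not available, roughly the same result can be sketched with Python's standard csv module. This is an assumption-laden illustration rather than a drop-in replacement: it reuses the sample data and output column names from above, treats every column as numeric, and add_max_columns is a hypothetical helper name.

```python
import csv
import io

# Same sample input as above.
SAMPLE = """p1940,p1950,p1960,p1970
436,490,446,195
526,320,963,780
220,888,705,831
"""


def add_max_columns(csv_text):
    """Return the rows with value_max, fieldName and valueField added."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Compare numerically; on ties the first column wins.
        best = max(row, key=lambda name: float(row[name]))
        out = dict(row)
        out["value_max"] = row[best]
        out["fieldName"] = best
        out["valueField"] = best.replace("p", "")  # "p1950" -> "1950"
        rows.append(out)
    return rows


for r in add_max_columns(SAMPLE):
    print(r["value_max"], r["fieldName"], r["valueField"])
```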
max-function in the field calculator (which can also be used as a base for classification)?