In QGIS 3.18, I have a vector layer of 1000+ features with 100+ attributes, many of which are strings of concatenated elements (substrings) delimited by commas. Some of those elements need to be replaced by other elements - entire elements as delimited, not parts of elements.
What I have done in the past to get the job done is to use replace()
expressions with 20+ clauses in them. Besides being long and unwieldy, this method was extremely impractical due to the fact that some of the elements are contained as substrings within other elements: for example, one element might be '51.1'
, another might be '51.11'
or '51.111'
, but these need to be treated and replaced differently. With the replace
method, this got very messy very fast and required a lot of cleanup work.
I am looking to do this with an array
function, which recognizes delimited elements. I have tried several attempts, such as: array_to_string(regexp_replace(string_to_array("Field_A",','),'51.1','String1'))
which the Field Calculator recognizes as a valid expression but which returns no results, and
array_to_string(
array_foreach(
string_to_array("Field_A",','),
regexp_replace(@element,'51.1','String1')
)
)
which returns results but seems not to be identifying the delimited elements, as I have the same problem as with replace
. Example: 51.11
is replaced as String11
.
I need to be able to list many substring replacements in one expression, if possible.
As an example of the sort of thing I want to do, see the last line of the table below:
Note: I cannot use array_replace
as I am unable to upgrade to QGIS 3.20 on this device at the moment, and this function is unavailable on 3.18.
4 Answers 4
You can achieve this combining arrays with maps (key:value
pairs, see help Maps Functions) and the following expression, where in line 4 you list the elements you want to replace, using the pattern 'value_old','value_new'
.
Explanation: The expression creates an array of all inputs in fieldA
(line 6) and for each element (line 5) looks in the map (the one from line 4, created as a variable @map
: line 2/3) at what position a key with this values from the array occurs in the map and returns the corresponding value (function map_get, line 7):
array_to_string (
with_variable(
'map',
map('51.1','String1','51.11','String2','51.111','String3','51.13','String4','51.15','String5'),
array_foreach (
string_to_array (fieldA),
map_get (@map,@element)
)
)
)
-
1A bunch of great and helpful answers on this thread -- thanks to all. I accepted this one because of its simplicity and because it fit really well with my existing workflow.pete– pete2021年11月13日 21:44:19 +00:00Commented Nov 13, 2021 at 21:44
-
1Brilliant! Excellent use case of
map()
andmap_get()
!!!Kazuhito– Kazuhito2021年11月14日 10:19:23 +00:00Commented Nov 14, 2021 at 10:19 -
1Thanks @Kazuhito - I feel honored by your comment. Indeed, combining maps and arrays offers a lot of (still underused, I have the impression) possibilities.Babel– Babel2021年11月14日 10:21:52 +00:00Commented Nov 14, 2021 at 10:21
-
1Very elegant indeed. Thanks for that.fastest– fastest2021年11月14日 11:06:12 +00:00Commented Nov 14, 2021 at 11:06
This seems to me a good task to be solved with Python and Function Editor. I have tried to build a minimum code fragment that uses the data provided in your example:
Start from a memory layer that has 2 attributes
"StringSource"
and"StringDest"
and six features that have the"StringSource"
attribute set as in your example:
2.Using the function editor, I define the function str_convert()
as follows:
from qgis.core import *
from qgis.gui import *
def map_str(str_in=None):
lst_str_map = ["51.1", "51.11", "51.111", "51.13", "51.15"]
if str_in not in lst_str_map:
raise ValueError("Invalid source for string.")
return f"String{lst_str_map.index(str_in):02d}"
@qgsfunction(args='auto', group='Custom', usesgeometry=False)
def str_convert(feature, parent):
lst_in = [el for el in feature["StringSource"].split(",") if el]
str_out = ",".join([map_str(el) for el in lst_in])
return str_out
The function shows up under the "Custom" menu and the expression editor shows the result of the function evaluated for a given feature.
Finally, using the field calculator, apply the
str_convert()
function to compute the content of the"StringDest"
attribute for each feature.After clicking Ok, the
"StringDest"
attribute of the feature is correctly filled.
The nice thing is that the code to convert the strings is fully encapsulated into the map_str()
function and it can be changed rather easily.
You can use PyQGIS (I don't have all numbers in my sample data, so the numbering isn't correct):
lyr = QgsProject.instance().mapLayersByName('New scratch layer')[0]
sourcefield = 'data1'
targetfield = 'data2'
allvals = [f[sourcefield] for f in lyr.getFeatures()] #List all values
allvals = [f.split(',') for f in allvals] #Split them into individual elements
allvals = list(set([item for sublist in allvals for item in sublist])) #Unpack all sublist and remove duplicate elements
allvals.sort(key=lambda x: float(x)) #Sort
replacedict = {x:'String{}'.format(e) for e, x in enumerate(allvals, 1)} #Create a dictionary with element number as key, and enumerate number as value
#Update the field
ix = lyr.fields().indexFromName(targetfield)
attrmap = {}
for f in lyr.getFeatures():
newval = ','.join([replacedict[x] for x in f[sourcefield].split(',')])
attrmap[f.id()] = {ix: newval}
lyr.dataProvider().changeAttributeValues(attrmap)
You can use a regex expression to find the text to be replaced. It should be '(^|,)15.1(,|$)'
The first (^|,)
is a capture group, which means to find either the beginning of the string (^
) or a ,
. Then there is your text of interest and at last another capture group (,|$)
which finds either a ,
or the end of the string ($
)
With this expression, it will find 15.1
, 0,15.1
, 15.1,0
, 0,15.1,0
but NOT 15.11
When replacing the found string, you must include the two search groups in the replaced text, using \1円
and \2円
For a single text to be replaced, the expression becomes:
regexp_replace(myColumn,'(^|,)15.1(,|$)','\1円string1\2円'),'(^|,)
To have multiple replacements, you can either nest calls to regexp_replace
, such as
regexp_replace(
regexp_replace(
myColumn, '(^|,)15.1(,|$)', '\1円string1\2円'
), '(^|,)15.2(,|$)', '\1円string2\2円'
)
Or, if you have many, you can create a function that makes X replacement one after another. It would be cleaner and easier to maintain, as in @fastest answer.
Explore related questions
See similar questions with these tags.
'51.1'
to'^51.1$'
in your second expression, perhaps?regexp_replace
terms, I can't get the expression to work. (I have a lot of terms to replace, so I need them all in one expression if possible.)Existing data
in your screenshot? And how should the output look like?