I have a database value 'INS. Company Cancelled', where multiple values are separated by '.'. I'm trying to replace 'INS' with 'INSEXP'; my expected result is 'INSEXP. Company Cancelled'.

I've tried the two queries below to update the field, but my output is 'INSEXP. Company Cancelled. Company Cancelled':

update my_table
SET column_name = (select replace(column_name, 'INS', 'INSEXP')
 from my_table WHERE seq_num = 123)
WHERE seq_num = 123;

update my_table
set column_name = replace(column_name, 'INS', 'INSEXP')
WHERE seq_num = 123;

Could someone please tell me what I'm doing wrong here?

Vérace
asked Jan 31, 2021 at 8:42
  • Your statements look ok to me (the first is going to be slower as you’re running an additional subquery per row when you already have all the information needed). The second statement is what I would write. I suggest querying the row before and after your update to prove that this update statement is not responsible for doubling up the 'Company Cancelled' bit. Commented Jan 31, 2021 at 11:16
  • I extensively revised my answer - you might want to take a look? I believe that I've covered all the bases - if not, let me know! Commented Jan 31, 2021 at 18:37

1 Answer

You want to do something like this (see fiddle here):

CREATE TABLE test 
(
 t_id INTEGER PRIMARY KEY,
 x VARCHAR (50),
 seq_num INTEGER NOT NULL
);

INSERT INTO test VALUES (1, 'INS. Company Cancelled - 127', 127);
INSERT INTO test VALUES (2, 'INS. Company Cancelled - 128', 128);
INSERT INTO test VALUES (3, 'INS. Company Cancelled - 126_1', 126);
INSERT INTO test VALUES (4, 'INS. Company Cancelled - 126_2', 126);

The record below has 126 as the seq_num and the fragment INS. isn't at the beginning of the target string:

-- Note: 126 as seq_num!
--
-- BUT, also note that the `INS.` isn't at the beginning of the string!
--
INSERT INTO test 
VALUES 
(5, 'stuff... ___INS. Company Cancelled - 126', 126); 

Then:

SELECT REPLACE(x, 'INS. ', 'INSEXP. ') AS "Check:" FROM test;

Result:

Check:
INSEXP. Company Cancelled - 127
INSEXP. Company Cancelled - 128
INSEXP. Company Cancelled - 126_1
INSEXP. Company Cancelled - 126_2
stuff... ___INSEXP. Company Cancelled - 126

The problem here is that plain REPLACE changes the substring INS. wherever it occurs in the string (including row 5, where it is in the middle), whereas we only want it done where INS. occurs at the beginning of the target string!

So, now we try REGEXP_REPLACE:

SELECT 
 REGEXP_REPLACE(x, '(^INS)(\. )', '\1EXP\2') AS "Check:" 
FROM test 
WHERE seq_num = 126;

Result:

Check:
INSEXP. Company Cancelled - 126_1
INSEXP. Company Cancelled - 126_2
stuff... ___INS. Company Cancelled - 126

So, we can see that REGEXP_REPLACE changes the first two INS. substrings, but not the last one - where it doesn't occur at the beginning of the string!
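Before running any UPDATE, it can help to preview exactly which rows would change and what they would become. The following is a minimal sketch against the test table above; the column aliases before_x and after_x are just illustrative names, and the REGEXP_LIKE filter is an addition of mine so that rows which don't start with INS. are left out of the preview altogether:

```sql
-- Preview only: nothing is modified by this query.
SELECT t.t_id,
       t.x                                           AS before_x,
       REGEXP_REPLACE(t.x, '(^INS)(\. )', '\1EXP\2') AS after_x
  FROM test t
 WHERE t.seq_num = 126
   -- Keep only rows where 'INS. ' sits at the very start of the string.
   AND REGEXP_LIKE(t.x, '^INS\. ');
```

With the sample data above, this should list rows 3 and 4 only; row 5 is filtered out because its INS. is not at the start of the string.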

So, now for the UPDATE:

UPDATE test t SET t.x = REGEXP_REPLACE(t.x, '(^INS)(\. )', '\1EXP\2') 
WHERE t.seq_num = 126;

And then (to check - always check!):

SELECT t.* FROM test t ORDER BY t.t_id;

Then:

T_ID  X                                         SEQ_NUM
1     INS. Company Cancelled - 127              127
2     INS. Company Cancelled - 128              128
3     INSEXP. Company Cancelled - 126_1         126
4     INSEXP. Company Cancelled - 126_2         126
5     stuff... ___INS. Company Cancelled - 126  126

Et voilà! We have replaced the correct substring from the correct part of the target string - all due to the power of regular expressions!

You should also look into the REGEXP_REPLACE() functionality (the oracle-base site is excellent for all things Oracle!). I've only recently started mastering regular expressions and it's amazing what they let you do in a one-liner that would otherwise take tons of code.

Explanation of the regex:

Now, here we specify the beginning of the string using the ^ anchor, so the replacement only takes place at the start of the string x - this specificity can be very helpful (indeed critical, as we can see in this case) when substituting text.

The ( and ) brackets are for "capturing groups" of text - so we can refer to <start_of_string> followed directly by INS followed by . (dot is also a special character - it's a single character wildcard, so it's escaped by \).

So, the group (^INS) is represented by the placeholder \1 in the replacement string, and the group (\. ) by \2 - remember the backslash is a special character in regexps! The (\. ) group matches a . (full stop or period) followed by a space - without the backslash, the . would be a wildcard for one character. The replacement \1EXP\2 therefore puts EXP between the two captured groups.
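To see why the escaping matters, here is a small sketch that runs the same replacement with and without the backslash. The input literal 'INSX Company Cancelled' and the column aliases are made up purely for illustration:

```sql
-- Illustrative literals only; no table is touched.
SELECT
  -- Escaped dot: only a literal '.' can follow INS, so nothing matches
  -- and the input comes back unchanged.
  REGEXP_REPLACE('INSX Company Cancelled', '(^INS)(\. )', '\1EXP\2') AS escaped_dot,
  -- Unescaped dot: '.' matches any single character (here the 'X'),
  -- so the string is rewritten to 'INSEXPX Company Cancelled'.
  REGEXP_REPLACE('INSX Company Cancelled', '(^INS)(. )',  '\1EXP\2') AS unescaped_dot
FROM dual;
```

The second column shows the kind of accidental match that the escaped version avoids.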

Caveat: There will be a performance penalty to pay here - naturally enough - more complex functionality requires more processing, but I would strongly urge any database programmer/DBA &c. to master regular expressions - there are many tutorials out there.
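If the regex overhead turns out to matter, the same start-of-string replacement can be written with plain string functions. This is only a sketch, and it assumes the prefix to replace is always exactly 'INS. ' (three letters, a full stop and a space); whether it is actually faster on your data is something to measure rather than assume:

```sql
-- Non-regex alternative: LIKE anchors the match at the start of the string,
-- and SUBSTR(t.x, 4) keeps everything after the first three characters ('INS').
UPDATE test t
   SET t.x = 'INSEXP' || SUBSTR(t.x, 4)
 WHERE t.seq_num = 126
   AND t.x LIKE 'INS. %';
```

Because the LIKE condition is in the WHERE clause, rows that don't start with INS. are not updated at all, rather than being rewritten with an unchanged value.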

As an aside (and this is more for me really), in exploring this issue, I came across this page (again, from Tim Hall's fabulous oracle-base site) - it explains 3 different ways of updating from queries!

The one I liked the best is the MERGE (3rd method) functionality (fiddle):

MERGE INTO test tt
 USING test st
 ON (tt.t_id = st.t_id AND tt.seq_num = 126)
WHEN MATCHED THEN
 UPDATE SET tt.x = REGEXP_REPLACE(st.x, '(^INS)(\. )', '\1EXP\2');

The end result is the same!

In this snippet, tt = target_table & st = source_table - of course, in this case, they're both the same and the simpler code above works - but if you're going to be UPDATEing from table_1 to table_2, you'll need mechanisms like this! A fiddle for this code is to be found here.

My second favourite was what Oracle calls an Inline View (but most of us normal folks call a sub-query) - it looks like this (fiddle here):

UPDATE
 (SELECT target_table.t_id, target_table.x, target_table.seq_num,
 REGEXP_REPLACE(source_table.x, '(^INS)(\. )', '\1EXP\2') y 
 FROM test target_table, test source_table
 WHERE target_table.t_id = source_table.t_id
 AND target_table.seq_num = 126) ilv
 SET ilv.x = ilv.y;

And finally, there is what the page (and Oracle?) calls the sub-query method - it uses the EXISTS predicate thus (fiddle):

UPDATE test t1
 SET (t1.x) = 
 (
 SELECT REGEXP_REPLACE(t2.x, '(^INS)(\. )', '\1EXP\2')
 FROM test t2 
 WHERE t1.seq_num = 126 AND t1.t_id = t2.t_id
 )
WHERE EXISTS (SELECT t.t_id, t.seq_num FROM test t WHERE t.t_id = t1.t_id 
 AND t.seq_num = 126);

Interestingly, there is also a performance analysis after each one, which suggests to me that none is the best in all cases and therefore we should test our queries for performance (with realistic datasets) before implementing them in production (don't we always? :-) )!

I'm giving a +1 to the question because I learned a lot from answering it - I hope you did too!

answered Jan 31, 2021 at 9:45
  • Thank you very much for the detailed answer, I have learnt something new from it. After debugging properly, I found that there is a pre-insert/update trigger which was adding the extra string. Commented Feb 2, 2021 at 5:07
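
Since the root cause here turned out to be a trigger, a quick look in the data dictionary is worth doing whenever an UPDATE appears to write something other than what the statement says. A minimal sketch, assuming the table is in your own schema and is called MY_TABLE as in the question (Oracle stores unquoted names in upper case):

```sql
-- List the triggers defined on the table, with when they fire and
-- whether they are currently enabled.
SELECT trigger_name,
       trigger_type,
       triggering_event,
       status
  FROM user_triggers
 WHERE table_name = 'MY_TABLE';
```

If the table belongs to another schema, the ALL_TRIGGERS view with a filter on TABLE_OWNER would be the place to look instead.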
