Why does casting this result of REGEXP_SUBSTR() to a DECIMAL fail?

SELECT
 REGEXP_SUBSTR('Cost (-$14.18)', '(?<=Cost [(]-[$])[0-9.]+') AS _extracted,
 CAST(REGEXP_SUBSTR('Cost (-$14.18)', '(?<=Cost [(]-[$])[0-9.]+') AS DECIMAL(8,2)) AS cost_1,
 CAST((SELECT _extracted) AS DECIMAL(8,2)) AS cost_2,
 CAST((SELECT _extracted) * 1 AS DECIMAL(8,2)) AS cost_3,
 CAST('14.18' AS DECIMAL(8,2)) AS cost_4;
+------------+--------+--------+--------+--------+
| _extracted | cost_1 | cost_2 | cost_3 | cost_4 |
+------------+--------+--------+--------+--------+
| 14.18      |  14.00 |  14.00 |  14.18 |  14.18 |
+------------+--------+--------+--------+--------+

Casting a plain string, as in cost_4, works. Multiplying the REGEXP_SUBSTR() result by 1 before casting, as in cost_3, also works. But casting the result directly, as in cost_1 and cost_2, drops the fractional part instead of producing the correct fixed-point version of _extracted.

Oddly, in my application, using the backreference as I've done for cost_2 actually produces the correct result. I was unable to reproduce that elsewhere, but thought it worth mentioning.

asked May 19, 2021 at 8:41

2 Answers

This is a long-standing issue in MySQL; people have been reporting this very behavior as a bug since 2011. I have found that the problem depends almost entirely on the collation used within the REGEXP_SUBSTR() function.

For instance, if you cast the result of REGEXP_SUBSTR() as a CHAR(100), your decimals remain intact:

mysql> SELECT CAST(CAST(REGEXP_SUBSTR('Cost (-$14.18)', '[0-9.]+') AS CHAR(100)) AS DECIMAL(8,2)) AS result;
result
------
14.18

The result returned by REGEXP_SUBSTR() used a UTF-16 character set before MySQL 8.0.17. Versions after this supposedly use the character set configured by the client (see bug #94203, reported by Rick James), but that does not appear to be accurate. My SQL client is configured to use UTF-8 everywhere, yet running your initial query in it produces exactly the results you shared in the question.

However, if I CONVERT( ... USING 'UTF8'):

SELECT CAST(CONVERT(REGEXP_SUBSTR('Cost (-$14.18)', '[0-9.]+') USING 'UTF8') AS DECIMAL(8,2)) AS result;
result
------
14.18

Surprise, surprise. A correct number.
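
To see which character set and collation REGEXP_SUBSTR() actually returns on a given server, the built-in CHARSET() and COLLATION() functions can be applied to its result. This is only a diagnostic sketch; the output depends on the server version and the connection settings, so none is shown here.

```sql
-- Diagnostic sketch: inspect the character set and collation of the value
-- returned by REGEXP_SUBSTR(), alongside the server version and the
-- connection character set. Output varies by server and client settings.
SELECT
  VERSION()                                             AS server_version,
  @@character_set_connection                            AS connection_charset,
  CHARSET(REGEXP_SUBSTR('Cost (-$14.18)', '[0-9.]+'))   AS result_charset,
  COLLATION(REGEXP_SUBSTR('Cost (-$14.18)', '[0-9.]+')) AS result_collation;
```

If result_charset differs from connection_charset, that would be consistent with the collation explanation given here, though it does not prove it.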

Generally in this situation I do the very same thing that you did for cost_3; I multiply the returned value by 1, then cast it to the desired type. You can save a step by casting as FLOAT, but this sometimes has precision implications.

It is not a great answer, but it is one that can be used across multiple versions of MySQL.
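
As a rough sketch, the two workarounds could be applied to a column like this. The table and column names (ledger, description) are hypothetical, and utf8mb4 is an assumed target character set rather than the one used in the example above.

```sql
-- Hypothetical table and column names (ledger, description), for illustration only.
SELECT
  description,
  -- Workaround 1: force a numeric context with * 1, then cast to fixed point.
  CAST(REGEXP_SUBSTR(description, '[0-9.]+') * 1 AS DECIMAL(8,2)) AS cost_numeric,
  -- Workaround 2: transcode the result before casting.
  -- utf8mb4 is an assumption; the example above used 'UTF8'.
  CAST(CONVERT(REGEXP_SUBSTR(description, '[0-9.]+') USING utf8mb4) AS DECIMAL(8,2)) AS cost_converted
FROM ledger;
```

The * 1 form passes through a floating-point value before the cast, so for values with many significant digits the CONVERT form may be the safer of the two. Rows where the pattern does not match yield NULL in both columns.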

answered May 19, 2021 at 9:27
  • That's about as thorough an explanation as I could have hoped for. Presumably this also sheds light on why my backreference may have worked in some cases but still scratching my head about that one. And yes, I was encountering float precision issues which is exactly what led me to this point. Thanks! Commented May 19, 2021 at 9:41

Not CAST. Use

FORMAT(expression, 2) -- for displaying with 2 decimal places
ROUND(expression, 2) -- for further computation
answered May 19, 2021 at 19:45
  • but, why does this fix the issue? Commented May 19, 2021 at 20:07
  • FORMAT(x, 2) returns float, not a fixed point number. This leads to precision issues and results like 18.18 + 9.69 = 27.869999999999997 which is why I'm using CAST() to begin with. Commented May 19, 2021 at 20:33
  • @billynoah - I understood your question to be about display, not storage or further computation. Commented May 20, 2021 at 5:45
  • In that case I've already accomplished the display aspect in step one where REGEXP_SUBSTR() is used to extract the number from the string. You can infer that decimal computation is required by the fact that I'm casting as a decimal. Commented May 20, 2021 at 11:39
