
I couldn't find a solution for extracting a specific sequence. Context: the Italian cadastre, where the plot number is embedded in an HTML table stored in the "description" field of a KML file. I want to extract the plot number.

Part of the code is:

...<tr>
<td>SVILUPPO</td>
<td>25</td>
</tr>
<tr bgcolor="#D4E4F3">
<td>NUMERO</td>
<td>156</td>
</tr>
<tr>
<td>LIVELLO</td>...

In this example, I want to get 156. There are other categories (commune, cadastral sheet, etc.) with other numbers, so I need to identify the right number by anchoring on "NUMERO".

Tried:

regexp_substr( "description" , 'NUMERO</td> \n\n <td>(\\d+)<' )

or

regexp_substr( "description" , 'NUMERO \\D+ (\\d+) <' )

With:

NUMERO as the start of the sequence, \D+ as any run of non-digit characters, (\d+) to capture the run of digits, and < to close the sequence.

Both expressions are accepted as valid, but they return NULL. I can't see why. Any help is much welcome.

asked Oct 16, 2023 at 10:17

1 Answer


Assuming the code you are trying to match is exactly as displayed in the question, you have errors in both of your regexes.

For the first one, a few changes are needed:

  • double-escape (\\) every backslash, not just the ones in \\d
  • remove the spaces, since the regex treats each one as a literal character that must be present in the text
  • I also needed to add \\r in front of each \\n; \r\n is the Windows newline sequence, so this may or may not apply to you depending on the operating system that produced the file

regexp_substr("description", 'NUMERO</td>\\r\\n\\r\\n<td>(\\d+)<')


Your second attempt worked once I removed the spaces, and it is probably more robust:

regexp_substr("description", 'NUMERO\\D+(\\d+)<')

The following similar expression also worked for me:

regexp_substr("description" , 'NUMERO[^\\d]+(\\d+)<')
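To see why the spaces matter, here is a minimal sketch in Python rather than QGIS. It assumes Python's re module handles these particular constructs (\D, \d, literal spaces) the same way as the engine behind regexp_substr. The DESCRIPTION string and the first_group helper are made up for illustration, and the \r\n line endings are an assumption about the data. Note that a Python raw string needs only a single backslash where the QGIS expression string needs \\.

```python
import re

# Hypothetical sample mirroring the table fragment in the question,
# assuming Windows-style \r\n line endings.
DESCRIPTION = (
    "<tr>\r\n<td>SVILUPPO</td>\r\n<td>25</td>\r\n</tr>\r\n"
    '<tr bgcolor="#D4E4F3">\r\n<td>NUMERO</td>\r\n<td>156</td>\r\n</tr>\r\n'
    "<tr>\r\n<td>LIVELLO</td>"
)


def first_group(pattern, text):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


# The spaces are literal: the text would need "NUMERO " followed by a space
# before the digits and another before "<", which it does not contain.
with_spaces = first_group(r"NUMERO \D+ (\d+) <", DESCRIPTION)

# Without spaces, \D+ consumes "</td>", the line break and "<td>".
without_spaces = first_group(r"NUMERO\D+(\d+)<", DESCRIPTION)

# [^\d] is just another way of writing \D.
negated_class = first_group(r"NUMERO[^\d]+(\d+)<", DESCRIPTION)

print(with_spaces, without_spaces, negated_class)
```

The spaced pattern finds nothing, which corresponds to the NULL seen in QGIS, while the other two capture the plot number.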

answered Oct 16, 2023 at 12:34
  • The newlines can be handled more easily with the whitespace character class, \\s, since it matches both carriage returns and line feeds. The pattern 'NUMERO</td>\\r\\n\\r\\n<td>(\\d+)<' can be simplified to 'NUMERO</td>\\s*<td>(\\d+)<'. Commented Nov 24, 2023 at 17:28
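A quick way to check the \s* suggestion is to run it against different line-ending styles. This is a rough sketch in Python, not QGIS, assuming the two regex engines agree on \s and \d. The plot_number helper and the sample strings are hypothetical, and the raw string uses a single backslash where the QGIS expression needs \\.

```python
import re

# Same pattern as the simplified QGIS expression, written as a Python raw string.
PATTERN = re.compile(r"NUMERO</td>\s*<td>(\d+)<")


def plot_number(description):
    """Return the digits in the cell following NUMERO, or None if absent."""
    match = PATTERN.search(description)
    return match.group(1) if match else None


# Hypothetical fragments differing only in the whitespace between the cells.
samples = {
    "unix": "<td>NUMERO</td>\n<td>156</td>",
    "windows_blank_line": "<td>NUMERO</td>\r\n\r\n<td>156</td>",
    "no_break": "<td>NUMERO</td><td>156</td>",
}

results = {name: plot_number(text) for name, text in samples.items()}
print(results)

# A row for a different category is ignored because the label does not match.
other_row = plot_number("<td>SVILUPPO</td>\n<td>25</td>")
print(other_row)
```

Because \s* also matches zero characters, the same pattern covers files with no line break between the two cells, so it does not depend on which operating system wrote the KML.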
