2

Have a look at the following XML data:

<data>
 <test color="red">Red text</test>
 <test color="green">green</test>
</data>

Let's say I have several xml-documents with this structure in my database:

CREATE TABLE xmldata (
 id bigserial not null,
 documents xml)

Now I want to select all possible colors:

SELECT id, xpath('//test', xml) FROM xmldata;

But this returns a table with the id of each document and a text-array of the test-nodes. Furthermore, documents without any "test" node exist in the result as well - with an empty array {}

What I really want is a table like this:

| id | node |
| 1 | <test color="red">Red text</test> |
| 1 | <test color="green">green</test> |

What is the syntax I have to use?

I heard that xpath_table may be the function to use - but this function is marked as deprecated...

(The returned table has to have one line for each occurence of the node I searched for. The node itself maybe an xml-snippet, text or something else - isn't really important)

asked Nov 9, 2016 at 9:50

2 Answers 2

3

There is nothing wrong with using xpath_table().

Even though some of the functions in the xml2 extension have been replaced with in-core functions, xpath_table is not one of them (I think this will happen for Postgres 10).

Until then, xpath_table() seems to be your only option:

SELECT * 
FROM xpath_table('id',
 'documents',
 'xmldata',
 '//test',
 'true') AS t(doc_id integer, data text);

Returns:

doc_id | data 
-------+---------
 1 | Red text
 1 | green 

In order for this to work you first need to create the extension (as a superuser):

create extension xml2;

Otherwise the function isn't available.

answered Nov 9, 2016 at 10:22
1

You have yet another alternative, using regexp_matches:

 WITH s0 AS
 (
 -- Your original query
 SELECT 
 id, xpath('//test', documents) AS x
 FROM 
 xmldata
 )
 , s1 AS
 (
 -- We unnest the array (convert it to rows)
 SELECT
 id, unnest(x) AS xml_node
 FROM
 s0
 )
 SELECT
 id, 
 xml_node, 
 (regexp_matches(xml_node::text, '<test[^>]*>(.*)<\/test>'))[1] AS data
 FROM
 s1 ;

 id | xml_node | data
-: | :-------------------------------- | :------- 1 | <test color="red">Red text</test> | Red text 1 | <test color="green">green</test> | green

... and you can have everything in just one SELECT

 -- Compacted version
 SELECT 
 id, (regexp_matches(unnest(xpath('//test', documents))::text, '<test[^>]*>(.*)<\/test>'))[1] AS xml_node
 FROM 
 xmldata

 id | xml_node
 -: | :-------
 1 | Red text
 1 | green

dbfiddle here

answered Jun 4, 2017 at 9:42

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.