adjecency list model vs nested sets

Question 1

I've stumbled upon an article describing nested sets model for storing hierarchical data and was quite intrigued by the supposed performance boost.

after implementing this in our database, i've found that the adjecency list model performs far better. though the article I've read and a couple of SA posts (one, two) stated that any tree traversal is supposed to benefit, my tests demonstrate the opposite.

Am I missing something?

there are roughly 1.1 Million entries in the following table:

SQL> show table tax_nodes;
TAX_ID INTEGER Not Null
PARENT_TAX_ID INTEGER Nullable
RANK VARCHAR(50) Nullable
DIVISION_ID INTEGER Nullable
TREE_RIGHT INTEGER Nullable
TREE_LEFT INTEGER Nullable
CONSTRAINT INTEG_202:
 Primary key (TAX_ID)

Adjecency List Query to fetch parents

with recursive lineage as 
(select tax_id, parent_tax_id 
 from tax_nodes 
 where tax_id = 365612
 union ALL 
 select tax_id, parent_tax_id 
 from tax_nodes t 
 join lineage l on l.parent_tax_id = t.tax_id
) 
select l.tax_id
from lineage l

Nested Sets Query to fetch parents

select
 p2.tax_id from tax_nodes as p1,
 tax_nodes as p2
where
 p1.tree_left between p2.tree_left and p2.tree_right
 and p1.tax_id = 365612

Result

365612 336487 10260 10241 10240 35237 10239 1 //1 is the root

Why is the Adjecency List performing better though most resources suggest otherwise?

I'm running Firebird 2.5 superserver by the way.

UPDATE

On @MarkRotteveel's hint, I took a look at the execution plans and I suspect the NATURAL being the culprit here. Question now is, how do I get rid of it?

adjecency list execution plan

PLAN (LINEAGE TAX_NODES INDEX (RDB$PRIMARY108))
PLAN (LINEAGE T INDEX (RDB$PRIMARY108))

nested sets execution plan

PLAN JOIN (P1 INDEX (RDB$PRIMARY108), P2 NATURAL)

Question 2

What are the execution plans for both queries?

Question 3

Many thanks, @MarkRotteveel, I've updated the question.

Question 4

Could you post the CREATE TABLE as well (including extra indexes if any)?

Question 5

@MarkRotteveel: yes, I've updated the question again

Question 6

The execution plan of your adjacency list example shows that it is able to use the primary key index (RDB$PRIMARY108) for both parts of the UNION. This is very efficient, mainly because you start with a single value, and then recursively look up its descendents (which is probably just a small set of the entire set of records).

For the nested set execution plan it is not able to use an index, because it has to scan all records for the values of p2.tree_left and p2.tree_right. It can only use the primary key index for looking up p1.tax_id = 365612.

You might be able to improve execution if you add a descending index for tax_nodes.tree_left and an ascending index for tax_nodes.tree_right. You may need to swap ascending/descending because I usually do it wrong initially, even if I account for usually doing it wrong. However you may also need to change the query condition so the optimizer picks the indexes for tree_left and tree_right:

p2.tree_left <= p1.tree_left AND p2.tree_right >= p1.tree_left

Question 7

I've added the indeces and the plan looks better already. I will benchmark it and report back. using a between b and c and b<=a and a>=c produced the same plan, though: PLAN JOIN (P1 INDEX (RDB$PRIMARY175), P2 INDEX (IDX_TAX, UQ_TAX2))

Question 8

@lordvlad If BETWEEN produces the same plan, then the Firebird optimizer is probably better at transforming the condition than I expected ;)

Question 9

benchmarks say that the two querys are now more or less equally performant. Thanks!

Question 10

ookay. so now I've added an index to the column PARENT_TAX_ID and now the adjecency-model is again 4-10 times faster, though the execution plans did not change. can I now safely assume that firebird's recursive CTEs are better than nested sets?

Question 11

@lordvlad I don't think you can conclusively say that, it probably depends on the distribution of data etc. However if adding an index improves the query time, I'd expect that index to show up in the plan.

Mark Rotteveel Mark Rotteveel 1,2367 silver badges17 bronze badges · Accepted Answer · 2014-07-10 08:20:30Z

The execution plan of your adjacency list example shows that it is able to use the primary key index (RDB$PRIMARY108) for both parts of the UNION. This is very efficient, mainly because you start with a single value, and then recursively look up its descendents (which is probably just a small set of the entire set of records).

For the nested set execution plan it is not able to use an index, because it has to scan all records for the values of p2.tree_left and p2.tree_right. It can only use the primary key index for looking up p1.tax_id = 365612.

You might be able to improve execution if you add a descending index for tax_nodes.tree_left and an ascending index for tax_nodes.tree_right. You may need to swap ascending/descending because I usually do it wrong initially, even if I account for usually doing it wrong. However you may also need to change the query condition so the optimizer picks the indexes for tree_left and tree_right:

p2.tree_left <= p1.tree_left AND p2.tree_right >= p1.tree_left

I've added the indeces and the plan looks better already. I will benchmark it and report back. using a between b and c and b<=a and a>=c produced the same plan, though: PLAN JOIN (P1 INDEX (RDB$PRIMARY175), P2 INDEX (IDX_TAX, UQ_TAX2))
@lordvlad If BETWEEN produces the same plan, then the Firebird optimizer is probably better at transforming the condition than I expected ;)
benchmarks say that the two querys are now more or less equally performant. Thanks!
ookay. so now I've added an index to the column PARENT_TAX_ID and now the adjecency-model is again 4-10 times faster, though the execution plans did not change. can I now safely assume that firebird's recursive CTEs are better than nested sets?
@lordvlad I don't think you can conclusively say that, it probably depends on the distribution of data etc. However if adding an index improves the query time, I'd expect that index to show up in the plan.

Stack Exchange Network

adjecency list model vs nested sets

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

adjecency list model vs nested sets

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions