Modification to GEQO (Genetic Query Optimization) of PostgreSQL

Question 1

I need to implement a functionality that is in line with the GEQO functionality of PostgreSQL. I understand that GEQO approach is to encode query plans as integer strings and GEQO generates these possible join sequences at random. Source : http://www.postgresql.org/docs/9.3/static/geqo-pg-intro.html

My question: how to modify the GEQO function if I definitively know the right join sequence, so that I don't have to search different join sequences. For ex., if I knew that the best way of joining the 4 relations is 4-1-3-2, I needn't have to check for other permutations.

There aren't any good materials on how GEQO is implemented in PostgreSQL. PostgreSQL only gives the overall view of the GEQO functionality but doesn't explain much.

Or could I achieve this functionality in the standard_join_search() itself without using GEQO?

Question 2

It sounds like you want to implement query hints. That's all well and good, but you shouldn't expect to get the change accepted in PostgreSQL core because the project community isn't what you'd call a big fan of query hints. If you're serious about this, you'll need to read quite a bit of the query planner code and you'll need to figure out how to pass your hints from the parser down through the rewriter and into the planner. I don't see a quick and simple answer here. What you want to eventually do is force a particular path choice in the planner/optimizer.

Question 3

Ah, yes they are skeptical about query hints. I have done the reading of the planner code and it seemed GEQO would be a way to minimize the changes to the existing core.

Question 4

Is that what you are trying to achieve, to implement query hints to force join ordering? If so, look into whether anyone else has already implemented it. You should also consider why you need it, why the planner is making the wrong choices in the first place. Consider producing a self-contained test case and reporting to pgsql-performance.

Question 5

There is pg_hint_plan: en.sourceforge.jp/projects/pghintplan , but i didn't use it. One dba told me that it was working on 9.2. There is also article in russian about it habrahabr.ru/post/169751

Question 6

One way you can do this without the need to mess around with GEKO is by using CTE.

CTE are optimisation barriers, hence you could wrap the joins inside CTEs in the order you want and PG will be forced to do so.

For example if we want to force the DB to first join t1 with t2 and only then with t4 we could run something like:

explain 
with j1 as (select *,t1.c4 as t1c4 from t1 join t2 on (t1.c2=t2.id))
 ,j2 as (select * from j1 join t4 on (t1c4=t4.id))
select * from j2;

This will result in:

 QUERY PLAN 
-------------------------------------------------------------------------------
CTE Scan on j2 (cost=51485.00..67785.00 rows=815000 width=64)
CTE j1
 -> Hash Join (cost=3473.00..14521.00 rows=815000 width=40)
 Hash Cond: (t2.id = t1.c2)
 -> Seq Scan on t2 (cost=0.00..26.30 rows=1630 width=20)
 -> Hash (cost=1637.00..1637.00 rows=100000 width=20)
 -> Seq Scan on t1 (cost=0.00..1637.00 rows=100000 width=20)
CTE j2
 -> Hash Join (cost=289.00..36964.00 rows=815000 width=64)
 Hash Cond: (j1.t1c4 = t4.id)
 -> CTE Scan on j1 (cost=0.00..16300.00 rows=815000 width=44)
 -> Hash (cost=164.00..164.00 rows=10000 width=20)
 -> Seq Scan on t4 (cost=0.00..164.00 rows=10000 width=20)
(13 rows)

This is just an example, you can change it around as needed - in any case PG can't change the order between the different CTEs.

Hope it helps :)

cohenjo cohenjo 1,1987 silver badges6 bronze badges · Answer 1 · 2016-09-01 08:12:25Z

One way you can do this without the need to mess around with GEKO is by using CTE.

CTE are optimisation barriers, hence you could wrap the joins inside CTEs in the order you want and PG will be forced to do so.

For example if we want to force the DB to first join t1 with t2 and only then with t4 we could run something like:

explain 
with j1 as (select *,t1.c4 as t1c4 from t1 join t2 on (t1.c2=t2.id))
 ,j2 as (select * from j1 join t4 on (t1c4=t4.id))
select * from j2;

This will result in:

 QUERY PLAN 
-------------------------------------------------------------------------------
CTE Scan on j2 (cost=51485.00..67785.00 rows=815000 width=64)
CTE j1
 -> Hash Join (cost=3473.00..14521.00 rows=815000 width=40)
 Hash Cond: (t2.id = t1.c2)
 -> Seq Scan on t2 (cost=0.00..26.30 rows=1630 width=20)
 -> Hash (cost=1637.00..1637.00 rows=100000 width=20)
 -> Seq Scan on t1 (cost=0.00..1637.00 rows=100000 width=20)
CTE j2
 -> Hash Join (cost=289.00..36964.00 rows=815000 width=64)
 Hash Cond: (j1.t1c4 = t4.id)
 -> CTE Scan on j1 (cost=0.00..16300.00 rows=815000 width=44)
 -> Hash (cost=164.00..164.00 rows=10000 width=20)
 -> Seq Scan on t4 (cost=0.00..164.00 rows=10000 width=20)
(13 rows)

This is just an example, you can change it around as needed - in any case PG can't change the order between the different CTEs.

Hope it helps :)

Stack Exchange Network

Modification to GEQO (Genetic Query Optimization) of PostgreSQL

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Modification to GEQO (Genetic Query Optimization) of PostgreSQL

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions