Incorrect row width in the plan and possibly an incorrect plan #108

Closed

Labels: help wanted, question

NikitinNikolay opened on Aug 2, 2017

Hello!

I found a bug in how the width of returned rows is calculated, and possibly a second one in how the plan is built.
I see both on Red Hat Server 6.7 with PostgreSQL 9.6.3 and pg_pathman 1.4.2.
The test setup is created with the following commands:

-- Requires the pg_pathman and pgcrypto (for gen_random_bytes) extensions.
drop table if exists public.tab_parent cascade;
drop table if exists public.tab_child cascade;
create table public.tab_parent 
(
 parent_id numeric not null,
 parent_data varchar(100),
 partition_id numeric not null,
 primary key (parent_id)
);
create table public.tab_child 
(
 child_id numeric not null,
 parent_id numeric not null,
 child_data varchar(100),
 partition_id numeric not null,
 primary key (child_id)
);
create index ix_tab_child_parent on public.tab_child(parent_id);
select public.create_range_partitions(c.oid, 'partition_id', 1, 1, 3)
from pg_class c
inner join pg_namespace n on c.relnamespace = n.oid and n.nspname = 'public'
where c.relname in ('tab_parent', 'tab_child');
insert into public.tab_parent (parent_id, parent_data, partition_id)
select a, gen_random_bytes(25), trunc(random() * 3) + 1
from generate_series(1, 1000000) a;
insert into public.tab_child (child_id, parent_id, child_data, partition_id)
select parent_id * 20 + a, parent_id, gen_random_bytes(25), partition_id
from public.tab_parent, generate_series(1, 20) a
where random() < 0.1;
analyze public.tab_parent;
analyze public.tab_parent_1;
analyze public.tab_parent_2;
analyze public.tab_parent_3;
analyze public.tab_child;
analyze public.tab_child_1;
analyze public.tab_child_2;
analyze public.tab_child_3;
explain (analyse, verbose, buffers)
select *
from public.tab_parent p
inner join public.tab_child c on c.partition_id = p.partition_id and c.parent_id = p.parent_id 
where p.partition_id in (2, 3)
limit 100;

Query plan output:

"Limit (cost=5.20..29.40 rows=100 width=596) (actual time=0.100..0.259 rows=100 loops=1)"
" Output: p.parent_id, p.parent_data, p.partition_id, c.child_id, c.parent_id, c.child_data, c.partition_id"
" Buffers: shared hit=20"
" -> Merge Join (cost=5.20..161291.40 rows=666439 width=596) (actual time=0.099..0.254 rows=100 loops=1)"
" Output: p.parent_id, p.parent_data, p.partition_id, c.child_id, c.parent_id, c.child_data, c.partition_id"
" Merge Cond: (p.parent_id = c.parent_id)"
" Join Filter: (p.partition_id = c.partition_id)"
" Buffers: shared hit=20"
" -> Merge Append (cost=0.85..28614.80 rows=666738 width=84) (actual time=0.048..0.070 rows=52 loops=1)"
" Sort Key: p.parent_id"
" Buffers: shared hit=8"
" -> Index Scan using tab_parent_2_pkey on public.tab_parent_2 p (cost=0.42..14289.93 rows=332925 width=84) (actual time=0.039..0.045 rows=22 loops=1)"
" Output: p.parent_id, p.parent_data, p.partition_id"
" Filter: (p.partition_id = ANY ('{2,3}'::numeric[]))"
" Buffers: shared hit=4"
" -> Index Scan using tab_parent_3_pkey on public.tab_parent_3 p_1 (cost=0.42..14324.87 rows=333813 width=84) (actual time=0.008..0.016 rows=31 loops=1)"
" Output: p_1.parent_id, p_1.parent_data, p_1.partition_id"
" Filter: (p_1.partition_id = ANY ('{2,3}'::numeric[]))"
" Buffers: shared hit=4"
" -> Materialize (cost=1.30..101023.04 rows=1999317 width=90) (actual time=0.048..0.129 rows=142 loops=1)"
" Output: c.child_id, c.parent_id, c.child_data, c.partition_id"
" Buffers: shared hit=12"
" -> Merge Append (cost=1.30..96024.75 rows=1999317 width=90) (actual time=0.047..0.114 rows=142 loops=1)"
" Sort Key: c.parent_id"
" Buffers: shared hit=12"
" -> Index Scan using tab_child_1_parent_id_idx on public.tab_child_1 c (cost=0.42..31697.12 rows=666997 width=90) (actual time=0.024..0.031 rows=43 loops=1)"
" Output: c.child_id, c.parent_id, c.child_data, c.partition_id"
" Buffers: shared hit=4"
" -> Index Scan using tab_child_2_parent_id_idx on public.tab_child_2 c_1 (cost=0.42..32334.17 rows=665736 width=90) (actual time=0.011..0.017 rows=46 loops=1)"
" Output: c_1.child_id, c_1.parent_id, c_1.child_data, c_1.partition_id"
" Buffers: shared hit=4"
" -> Index Scan using tab_child_3_parent_id_idx on public.tab_child_3 c_2 (cost=0.42..31993.44 rows=666584 width=90) (actual time=0.010..0.019 rows=55 loops=1)"
" Output: c_2.child_id, c_2.parent_id, c_2.child_data, c_2.partition_id"
" Buffers: shared hit=4"
"Planning time: 1.448 ms"
"Execution time: 0.300 ms"

First bug:
The plan clearly shows width=84 for Merge Append and width=90 for Materialize, yet the Merge Join step that combines them reports width=596. Without pg_pathman, the width of the resulting row equals the sum of the row widths of the preceding steps.
With pg_pathman, the summands appear to be the theoretical record sizes of the parent tables tab_parent and tab_child instead.
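
To make the mismatch concrete, here is a rough sketch of the arithmetic. The observed widths come from the plan above. The `default_width` helper is hypothetical: it is modeled on my understanding of PostgreSQL's type-based fallback estimate (`get_typavgwidth`) for columns without statistics, and it assumes a UTF8 database (up to 4 bytes per character plus a 4-byte header for `varchar`). Nothing in the plan confirms that this code path is actually taken, so treat it as a plausibility check rather than a diagnosis.

```python
# Row widths taken from the EXPLAIN output above.
merge_append_width = 84    # Merge Append over the tab_parent partitions
materialize_width = 90     # Materialize over the tab_child partitions
reported_join_width = 596  # Merge Join

# Expected: the join outputs every column of both inputs.
expected_join_width = merge_append_width + materialize_width


def default_width(max_bytes=None):
    """Assumed fallback width for a variable-length column without statistics.

    Hypothetical helper, modeled on PostgreSQL's get_typavgwidth().
    """
    if max_bytes is None:            # no declared limit, e.g. plain numeric
        return 32
    if max_bytes <= 32:
        return max_bytes
    capped = min(max_bytes, 1000)    # very wide columns are capped
    return 32 + (capped - 32) // 2


# Assumption: UTF8, so varchar(100) may take 100 * 4 bytes plus a 4-byte header.
varchar_100 = default_width(100 * 4 + 4)
numeric_col = default_width()

# tab_parent: parent_id, parent_data, partition_id
parent_default = numeric_col + varchar_100 + numeric_col
# tab_child: child_id, parent_id, child_data, partition_id
child_default = numeric_col + numeric_col + varchar_100 + numeric_col

print(expected_join_width)             # 174
print(parent_default + child_default)  # 596
```

Under these assumptions the default widths of the two parent tables add up to exactly the reported 596, which is consistent with the guess that the join width is taken from the parents rather than from the analyzed partitions.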

Second possible bug:
For some reason tab_child is joined to the first table through Merge Append, Materialize and Merge Join steps and all of its partitions are scanned. I would expect your RuntimeAppend step to be a better fit here, but it is not chosen.
Unfortunately, I don't know how to trace plan selection in PostgreSQL, so I can't determine why this particular plan is chosen.

Best regards, Nikolay Nikitin.
