In 9.4b2, postgres_fdw doesn't know how to "push down" aggregate queries on remote tables, e.g.
> explain verbose select max(col1) from remote_tables.table1;
QUERY PLAN
---------------------------------------------------------------------------------------------
Aggregate (cost=605587.30..605587.31 rows=1 width=4)
Output: max(col1)
-> Foreign Scan on remote_tables.table1 (cost=100.00..565653.20 rows=15973640 width=4)
Output: col1, col2, col3
Remote SQL: SELECT col1 FROM public.table1
It would obviously be much more efficient to send SELECT max(col1) FROM public.table1 to the remote server and just pull the one row back.
Is there a way to perform this optimization manually? I would be satisfied with something as low-level as (hypothetically speaking)
EXECUTE 'SELECT max(col1) FROM public.table1' ON remote RETURNING (col1 INTEGER);
although of course a higher-level construct would be preferred.
I'm aware that I could do something like this with dblink, but that would involve rewriting a large body of code that already uses foreign tables, so I'd prefer not to.
EDIT: Here's the query plan for Erwin Brandstetter's suggestion:
=> explain verbose select col1 from remote_tables.table1
-> order by col1 desc nulls last limit 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Limit (cost=645521.40..645521.40 rows=1 width=4)
Output: url
-> Sort (cost=645521.40..685455.50 rows=15973640 width=4)
Output: col1
Sort Key: table1.col1
-> Foreign Scan on remote_tables.table1 (cost=100.00..565653.20 rows=15973640 width=4)
Output: col1
Remote SQL: SELECT col1 FROM public.table1
This is better, in that it fetches only col1, but it's still dragging 16 million rows over the network and now it's also sorting them. By way of comparison, the original query, applied on the remote server, doesn't even have to scan, because that column has an index. (The core query planner isn't clever enough to do that for the modified query applied on the remote server, but that's minor.)
2 Answers
For the time being, it seems that the best available option is to create a view on the remote server that encapsulates the query needing to be "pushed down". postgres_fdw is happy to define and use foreign tables backed by views on the remote, and regular old query optimization within the view does the Right Thing. For instance, given
CREATE VIEW id_ranges AS
SELECT 'url_strings'::text AS tbl,
min(url_strings.id)::bigint AS lo,
max(url_strings.id)::bigint AS hi
FROM url_strings
UNION
SELECT 'captured_pages'::text AS tbl,
min(captured_pages.url)::bigint AS lo,
max(captured_pages.url)::bigint AS hi
FROM captured_pages
UNION
-- ... several more like that ...
on the remote, and a FOREIGN TABLE of the same name on the local server,
SELECT lo, hi FROM id_ranges WHERE tbl = 'url_strings';
the existing pushdown optimization will send the WHERE constraint to the remote, and the remote will scan only one table (making use of indexes if possible) and send back a single-row result.
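The local side of this setup might look like the sketch below. It assumes a postgres_fdw server called remote_server (a hypothetical name) has already been created with a suitable user mapping, and that the view lives in the remote's public schema; the column types follow the casts in the view definition above.

```sql
-- Run on the LOCAL server.
-- remote_server is a hypothetical, pre-existing postgres_fdw server.
CREATE FOREIGN TABLE id_ranges (
    tbl text,
    lo  bigint,
    hi  bigint
)
SERVER remote_server
OPTIONS (schema_name 'public', table_name 'id_ranges');

-- Inspect the "Remote SQL" line of the plan to check whether the
-- WHERE clause is sent to the remote server.
EXPLAIN VERBOSE
SELECT lo, hi FROM id_ranges WHERE tbl = 'url_strings';
```

The foreign table does not need to know that id_ranges is a view on the remote; the table_name option simply names the remote relation to query.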
Remote Query Optimization is rather basic:
postgres_fdw attempts to optimize remote queries to reduce the amount of data transferred from foreign servers. This is done by sending query WHERE clauses to the remote server for execution, and by not retrieving table columns that are not needed for the current query. [...]
My first idea to substitute with the following isn't much of an improvement either as you found out:
SELECT col1
FROM public.table1
ORDER BY col1 DESC NULLS LAST
LIMIT 1;
Currently (including pg 9.4), only WHERE conditions with all immutable functions are pushed down. I found this exhaustive thread discussing the Status of FDW pushdowns on pgsql-hackers.
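One way to see this rule in action is to compare the plans for two filters on the question's foreign table, as in the sketch below. The constant 100 and the use of random() are illustrative choices, not taken from the question. The first condition uses only a built-in immutable operator, so it would be expected to appear in the Remote SQL line; the second involves the volatile function random(), so it would be expected to be applied as a local filter after the rows are fetched.

```sql
-- Immutable, built-in comparison: a candidate for pushdown.
EXPLAIN VERBOSE
SELECT col1 FROM remote_tables.table1 WHERE col1 > 100;

-- random() is volatile, so this condition should stay local.
EXPLAIN VERBOSE
SELECT col1 FROM remote_tables.table1
WHERE col1 > (random() * 100)::integer;
```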
Your best option seems to be to use dblink, as you already mentioned yourself.
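A minimal sketch of the dblink route for the query in the question follows. The connection string (host and database name) and the view name table1_max are hypothetical placeholders. Wrapping the call in a local view is just one way to limit how much existing code has to change; it is not something the question or this answer prescribes.

```sql
-- Run on the LOCAL server.
CREATE EXTENSION IF NOT EXISTS dblink;

-- The aggregate runs entirely on the remote; only one row comes back.
-- The connection string below is a placeholder.
CREATE VIEW table1_max AS
SELECT col1
FROM dblink('host=remote.example.com dbname=remote_db',
            'SELECT max(col1) FROM public.table1')
     AS t(col1 integer);

SELECT col1 FROM table1_max;
```

Because dblink returns SETOF record, the column definition list after AS is required so the local planner knows the result type.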
- Alas, the sort happens locally. – zwol, Sep 22, 2014 at 23:45
- @Zack: Too bad. Currently, really only WHERE conditions with all immutable functions are pushed down. – Erwin Brandstetter, Sep 23, 2014 at 0:29