On the machine hosting the PostgreSQL server, there is a specific directory where a series of .csv files will be loaded regularly to update one of the databases. I want to make the process of uploading the data contained in these files as automatic as possible, so I have created a .sh script to do this. It is a simple for loop that iterates through the set of .csv files in that directory and passes their names as parameters to a \COPY command.
Now, since the server administrators are being a little protective of their server, they would like to give us access only to the SQL server and not to the underlying Unix server. So, here goes the question:
Is there a way to accomplish the task described above through a stored procedure executed from inside the database? Can you really read and access a path and its contents in that way from the database? The whole set of .csv files could vary so I don't think a hard-coded solution would work, plus it would look rather dirty (although, if that is the only way I can make it work, so be it).
My guess is that you cannot but... you never know.
3 Answers
There's a built-in pg_ls_dir function that is quite close to what you need.
There are two security-related caveats:
- it's reserved to superusers.
- absolute paths are not allowed, paths are relative to the PostgreSQL data directory.
Concerning the need to be superuser, any solution would have this requirement anyway, since it is a feature by design that a normal user has zero access to the filesystem.
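As a minimal sketch of what a call to the built-in function looks like for a superuser, assuming a hypothetical directory (or symlink) named csv_upload directly under the data directory:

```sql
-- Run as a superuser. 'csv_upload' is a hypothetical name; the path is
-- resolved relative to the PostgreSQL data directory.
SELECT f AS filename
FROM pg_ls_dir('csv_upload') AS f
WHERE f ~ '\.csv$'   -- keep only names ending in .csv
ORDER BY f;
```

pg_ls_dir returns one row per directory entry, so the result can be filtered and sorted like any other set of rows.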
A DBA (superuser) can grant access to an otherwise-forbidden specific functionality through a proxy function defined with SECURITY DEFINER access rights.
For instance:
create function pg_ls_dir2(text) returns setof text as
'SELECT pg_ls_dir($1)'
language sql SECURITY DEFINER;
-- optional (to give access to a specific role only)
revoke execute on function pg_ls_dir2(text) from public;
grant execute on function pg_ls_dir2(text) to specific_role;
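A somewhat more restrictive variant is sketched below, under assumptions: list_upload_csv and csv_upload are hypothetical names, and the directory is hard-coded so that the granted role cannot list arbitrary paths under the data directory. Pinning search_path is the precaution the PostgreSQL documentation recommends for SECURITY DEFINER functions.

```sql
-- Created by a superuser. Lists only the .csv files of one fixed directory.
-- 'csv_upload' and 'list_upload_csv' are illustrative names, not from the answer.
CREATE FUNCTION list_upload_csv() RETURNS SETOF text
LANGUAGE sql
SECURITY DEFINER
SET search_path = pg_catalog, pg_temp   -- avoid resolving objects through a caller-controlled path
AS $$
    SELECT f
    FROM pg_ls_dir('csv_upload') AS f
    WHERE f ~ '\.csv$';
$$;

-- Functions are executable by PUBLIC by default, so revoke first.
REVOKE EXECUTE ON FUNCTION list_upload_csv() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION list_upload_csv() TO specific_role;
```

A member of specific_role could then run SELECT * FROM list_upload_csv(); without being a superuser.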
Concerning the second issue, a DBA can create a symlink from inside the $PGDATA directory to any directory, and pg_ls_dir will follow it, so the real upload directory can be anywhere on the file system.
If the system admin agrees to this setup, as a non-privileged user you could eventually run a simple plpgsql function matching the functionality of the shell script:
FOR filename IN SELECT pg_ls_dir2('relative_path') LOOP
  IF filename ~ '\.csv$' THEN
    -- COPY does not accept expressions, so the statement is built dynamically
    EXECUTE format('COPY table... FROM %L', '/fullpath/' || filename);
  END IF;
END LOOP;
Since COPY FROM file itself requires superuser rights, this function will also need to be reviewed by a superuser, owned by a superuser, and declared as SECURITY DEFINER.
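Put together, the loop could be wrapped in a complete function along the following lines. This is only a sketch under stated assumptions: load_csv_files, public.target_table and csv_upload are hypothetical names, the files are assumed to be in CSV format, and '/fullpath/' stands for the real absolute location of the upload directory, as in the fragment above.

```sql
-- Created and owned by a superuser; runs with the owner's rights.
CREATE OR REPLACE FUNCTION load_csv_files() RETURNS integer
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = pg_catalog, pg_temp
AS $$
DECLARE
    filename text;
    loaded   integer := 0;   -- number of files copied
BEGIN
    -- 'csv_upload' is resolved relative to the data directory (e.g. a symlink).
    FOR filename IN SELECT pg_ls_dir('csv_upload') LOOP
        IF filename ~ '\.csv$' THEN
            -- %L quotes the path as a literal, so odd file names cannot
            -- inject SQL into the dynamically built COPY statement.
            EXECUTE format(
                'COPY public.target_table FROM %L WITH (FORMAT csv)',
                '/fullpath/' || filename);
            loaded := loaded + 1;
        END IF;
    END LOOP;
    RETURN loaded;
END;
$$;

REVOKE EXECUTE ON FUNCTION load_csv_files() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION load_csv_files() TO specific_role;
```

A member of specific_role would then trigger the import remotely with SELECT load_csv_files();. Because the function runs as its owner, it can call pg_ls_dir directly rather than going through the proxy.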
- You really know your way around PostgreSQL! This is incredibly helpful and I think I will stick to this solution in the end. The behavior of the symlinks seems awesome and the thing about being a superuser and how to get a workaround for it is really interesting too :D – Feillen, Dec 7, 2016 at 13:25
- I was finally able to solve my situation thanks to this approach. I can remotely upload to the database the content of some files stored on the server to which I don't have access via ssh (I have rehearsed this and it works!) – Feillen, Dec 7, 2016 at 16:22
Now, since the server administrators are being a little protective of their server, they would like to give us access only to the SQL server and not to the underlying Unix server.
I'm not sure how you're accessing the SQL server. However, if you're going to keep access to psql, perhaps you can work around their security with the \! meta-command in psql:
\! [COMMAND] execute command in shell or start interactive shell
Note that the command runs on the machine where psql itself is running.
- I did not know about this meta command, and it is quite useful indeed. Thanks for the help :) – Feillen, Dec 7, 2016 at 11:08
- I downloaded a portable version of pgAdmin and apparently it does not allow these meta commands as valid 'queries' :( – Feillen, Dec 7, 2016 at 11:24
- Right, it's psql only, and only on the server. – Evan Carroll, Dec 7, 2016 at 11:27
- See if you can run CREATE EXTENSION plperl with your pgAdmin, or plperlu. If so, look at the above. – Evan Carroll, Dec 7, 2016 at 11:27
Depends on how far you want to take this, right? If they let you run plperlu,
- You can access the filesystem at the OS level.
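As a rough illustration of that kind of filesystem access, a plperlu function could list the .csv files of an arbitrary directory. This is a sketch only: list_csv_files is a hypothetical name, and creating functions in the untrusted plperlu language requires superuser rights.

```sql
-- Hypothetical helper: returns the names of the .csv files in a directory.
-- plperlu is untrusted, so the Perl body has ordinary OS-level file access
-- with the permissions of the operating-system user running PostgreSQL.
CREATE OR REPLACE FUNCTION list_csv_files(dir text) RETURNS SETOF text AS $$
    my ($dir) = @_;
    opendir(my $dh, $dir) or elog(ERROR, "cannot open $dir: $!");
    my @files = grep { /\.csv$/ } readdir($dh);
    closedir($dh);
    return \@files;   # an array reference is returned as a set of rows
$$ LANGUAGE plperlu;
```

Unlike pg_ls_dir, a function like this accepts absolute paths, which is exactly why the language is restricted to superusers.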
If you really want to be a gigantic douche nozzle, you can call in Net::Dropbear::SSHd:
- Install plperl/plperlu (in Ubuntu this is found in the postgresql-plperl-9.5 package -- it's normally prepackaged and sounds innocuous)
- Set it up on the db: CREATE EXTENSION plperlu;
- Run sudo cpan Net::Dropbear::SSHd; (or install locally and use local::lib)
- Create a function that uses it.
Here is an example of such a function,
CREATE OR REPLACE FUNCTION ssh_in_server() RETURNS integer AS $$
  use Net::Dropbear::SSHd;
  Net::Dropbear::XS::gen_key($key_filename);
  my $sshd = Net::Dropbear::SSHd->new(
    addrs => '2222',
    keys  => $key_filename,
    hooks => {
      on_log => sub
      {
        my $priority = shift;
        my $msg      = shift;
        warn( "$msg\n" );
        return HOOK_CONTINUE;
      },
    }
  );
  $sshd->run;
  $sshd->wait;
  return undef;
$$ LANGUAGE plperlu;
If your database doesn't sshd, you can't data.
- Wow, Perl has been out of my reach for long. I may have to reconsider starting to learn a thing or two (at least) about it. I cannot say this is what I was looking for (I don't really understand what this does ^^' ) so I wouldn't mark it as the accepted answer, but the insight into just how many possibilities there were to solve my problem has been amazing :D – Feillen, Dec 7, 2016 at 11:13
- Most languages have some kind of procedural language in PostgreSQL. Or you can build your own extension in C. Whatever you choose, you can have fun. – Evan Carroll, Dec 7, 2016 at 11:16
- Question: if I remotely run a stored procedure that looks into the file system to do some operations (provided I can get to do this), what is the expected behavior: to look into the file system of the host of the SQL server, or into the file system of the host running the procedure? – Feillen, Dec 7, 2016 at 11:50
insert into archive select * from fdw_table