I have a PostgreSQL query that won't execute properly when I look for empty strings. I've tried escaping single quotes with the backslash, I've tried double quotes, nothing seems to work. Ideas? I need to look for empty strings because my script doesn't do what I need it to do when it finds empty strings in my dataset.
this_query = """
copy(SELECT phone_number from mytable WHERE phone_number is not null and phone_number != '''' ORDER BY random() LIMIT 25)
to stdout csv header;
"""
Without the empty string phrase (phone_number != ''''), the query will work when I execute:
THIS_COMMAND = "psql -h hostname -d dbname -c '{query}' > {file_name}"
command = THIS_COMMAND.format(query=this_query, file_name=a_file_name)
os.system(command)
ATTN: After reviewing answers below, I experimented and found this to work:
phone_number != '\"'\"''\"'\"'
2 Answers 2
There are ways to make this work with the quotes.
EDIT: Cross-referencing this UNIX-shell quoting answer: The quotes need to both escaped for Python and sh/bash (presumably). So for sh this needs to be '"'"''"'"', but then the " characters need to be escaped for Python, giving '\"'\"''\"'\"'. Phew!
this_query = """
copy(SELECT phone_number from mytable WHERE phone_number is not null and phone_number != '\"'\"''\"'\"''\"'\"''\"'\"' ORDER BY random() LIMIT 25) to stdout csv header;
"""
But to give another approach, have you considered checking for character_length(phone_number) > 0 ?
Ref: https://www.postgresql.org/docs/9.1/functions-string.html
e.g.:
SELECT phone_number FROM mytable WHERE phone_number IS NOT NULL AND character_length(phone_number) > 0
But maybe it's better to clean the data?
UPDATE mytable SET phone_number = NULL WHERE character_length(trim(phone_number)) = 0;
6 Comments
bash (version > 2.02), you can also use the syntax $'\'' for a single quote.where created_on::date <= '\"'\"'2018年12月11日'\"'\"' and THAT worked. Your quote suggestions did not work. In any case, I might want to try the suggestion made by @Parfait, to use the \copy command instead... this quote business is a little crazyConsider Python's built-in subprocess to pass multiple arguments into a command line call and avoid any need for quote enclosures or escaping. This is a more enhanced command line caller than os.system(). Additionally, as shown below you can set environment variables like PGPASSWORD.
Specifically, use psql's meta-command \copy command (distinct from Postgres' COPY) to pass file directly in query name to be saved client-side. Notice too the use of r (raw string literal) so backslash in \copy can be handled as is and not as a special character.
from subprocess import Popen, PIPE
file_name = '/path/to/file.csv'
this_query = r"""\copy (SELECT phone_number
FROM mytable
WHERE phone_number is not null
AND phone_number != ''
ORDER BY random() LIMIT 25)
TO '{myfile}' with (FORMAT CSV, HEADER);
""".format(file_name)
host_name = 'XX.XX.XXX'
db_name = 'mydatabase'
user_name = 'myuser'
pwd = 'mypwd'
# LIST OF ARGS
cmd = ["psql", "-h", host_name, "-d", db_name,
"-U", user_name, "-p", "5432", "-c", this_query]
# COMMAND LINE CALL
p = Popen(cmd, env={'PGPASSWORD': pwd},
stdin=PIPE, stdout=PIPE, stderr=PIPE)
Plus, you can receive return errors or console output via PIPE.
output, error = p.communicate()
if p.returncode == 0:
print('OUTPUT:\n {0}'.format(output))
print("\nSuccessfully completed!")
else:
print('ERROR:\n {0}'.format(error))