I am trying to import large fixed-length files into Postgres, using the COPY command and a staging table, following this guide.
The problem is that these files contain many different characters and I don't know which one to use as the delimiter. In fact, I don't need a delimiter at all, since I am importing the files into a single-column table.
I'm working on an ELT tool for external data, which can also handle fixed-length files: github.com/Karlodun/pgingest. You can probably use it. – Mihail Gershkovich, Mar 24, 2021 at 22:52
3 Answers
You cannot use COPY or \copy if there is no character that is certain not to appear in the data. You will need to either pick a character to act as the delimiter and then escape or quote it, along with the escape/quote character itself, in the input file (perhaps using PROGRAM), or skip COPY and \copy and use INSERT instead.
It isn't clear from your question whether your files use newlines as record separators or whether each file is one record and internal newlines are literal. If it is the latter and you do use COPY or \copy, you will have to escape or quote the newlines as well.
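As a rough illustration of the escaping route, here is a minimal sketch for COPY's default text format, where tab is the delimiter and backslash is the escape character. The function name `escape_copy_text` and the file, database, and table names in the usage comment are placeholders, and the sketch assumes one record per line.

```shell
#!/bin/sh
# Sketch: escape one record per line for COPY's text format.
# Assumes the default tab delimiter and backslash escape character.

escape_copy_text() {
    tab=$(printf '\t')
    # Double every backslash first, then replace each literal tab
    # with the two-character sequence \t.
    sed -e 's/\\/\\\\/g' -e "s/${tab}/\\\\t/g"
}

# Hypothetical usage (names are placeholders):
#   escape_copy_text < yourfile | psql yourdatabase -c 'copy yourtable from stdin'
```

The order of the two substitutions matters: doubling backslashes after the tab replacement would corrupt the `\t` sequences that were just written.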
Postgres can't handle that format, but it can handle CSV, and the translation is fairly easy.
sed 's/"/""/g;s/^\|$/"/g'
will translate a text file into a single-column CSV file, which can then be read by COPY.
So your import might go something like this:
sed 's/"/""/g;s/^\|$/"/g' _yourfile_ | psql _yourdatabase_ -c 'copy _yourtable_ from stdin with csv'
sed 's/"/""/g;1 s/^/"/;$ s/$/"/'
will translate a text file into a single CSV record, should you require that translation instead.
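To see what the per-line translation does, here is a small sketch that wraps the same idea in a function and runs it on one sample line. The function name `to_single_column_csv` is made up for illustration, and it uses separate substitutions for the start and end of the line instead of the `\|` alternation, which is a GNU extension to basic regular expressions.

```shell
#!/bin/sh
# Sketch: turn every input line into one single-column CSV record by
# doubling embedded double quotes and wrapping the line in double quotes.

to_single_column_csv() {
    sed -e 's/"/""/g' -e 's/^/"/' -e 's/$/"/'
}

printf '%s\n' 'say "hi", ok' | to_single_column_csv
# prints: "say ""hi"", ok"
```

With this variant an empty input line becomes `""`, so it stays a well-formed quoted field.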
As long as you are not working with very exotic data (e.g. fax transfers), you can use one of the non-printable ASCII characters. I'd suggest using the beep (bell) character: chr(7)
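Before relying on chr(7), it may be worth checking that the byte really does not occur in the file. The following is only a sketch: `contains_bel` is a hypothetical helper, the names in the usage comment are placeholders, and the COPY options shown there are an assumption about how one might pass the delimiter. Note that in COPY's text format backslashes in the data are still treated as escapes, whatever delimiter is chosen.

```shell
#!/bin/sh
# Sketch: report whether a file already contains the BEL byte (chr(7)).

contains_bel() {
    bel=$(printf '\a')
    # LC_ALL=C makes grep treat the input as plain bytes.
    LC_ALL=C grep -q -F -- "$bel" "$1"
}

# Hypothetical usage (names are placeholders):
#   if contains_bel yourfile; then
#       echo "BEL occurs in the data; pick another delimiter" >&2
#   else
#       psql yourdatabase -c "copy yourtable from stdin with (delimiter E'\\007')" < yourfile
#   fi
```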
I'm working on an ELT tool for external data, which can also handle fixed-length files: https://github.com/Karlodun/pgingest/. You can probably use it. Feel free to ask questions here or on git.
The tool uses beep as its default separator, so you would not even have to define it. It works like a charm on Linux.