I'm looking to import a folder of 100+ CSV files into one table in PostgreSQL using pgAdmin. I have the following statement, but is there a way to change the file path so that it covers all of the CSVs?
COPY schema.table FROM 'C:\Documents\Data' DELIMITER ',' CSV HEADER;
3 Answers
I solved the problem by merging all of the CSV files into one with a shell script:
for i in ./*.csv
do
  cat "$i" >> all.csv
done
Then I imported the all.csv file. Instead of pgAdmin or a COPY statement I just used IntelliJ: right-click on the table and select Import Data. This is the most convenient way that I found.
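If you prefer to stay in psql rather than IntelliJ, here is a minimal sketch of the same import, assuming the merged all.csv is in the current directory and schema.table from the question is the target (psql's \copy reads the file on the client, so no server-side file access is needed):
\copy schema.table FROM 'all.csv' CSV HEADER
Keep in mind that HEADER only skips the very first line, so the header rows of the other concatenated files are still read as data unless you strip them while merging.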
- If you want to add a newline between files, you can adjust the cat expression to:
  (cat "${i}"; echo) >> all.csv
  – Mark M (Jun 10, 2024)
- Thank you, I changed the answer and added the newline. – Sergey Ponomarev (Jun 17, 2024)
- Hey, no need to change it, it was a great answer! I was just augmenting it a bit. Your answer was 100% perfect, it definitely helped me out. Cheers! – Mark M (Jun 17, 2024)
- Use your computer's command-line environment to list the names of all the files in your directory
  - On Windows, use cmd + dir or PowerShell + gci
  - On macOS, probably Terminal + ls
  - This might also be a good opportunity to familiarise yourself with psql and the meta-commands, which you could use for this purpose
- Use the list of files to create the commands you want (see the sketch after this list)
  - Some people use Excel for this
  - I prefer multi-line editing in something like VSCode
- Manually tweak your commands line-by-line, or change the formula/process you used in step 2, until you get the output you want.
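For example, a rough sketch of steps 1 and 2 in a POSIX shell, assuming the files live in a hypothetical /path/to/csvs directory and using schema.table from the question:
for f in /path/to/csvs/*.csv
do
  # one server-side COPY statement per file, mirroring the statement from the question
  printf "COPY schema.table FROM '%s' DELIMITER ',' CSV HEADER;\n" "$f"
done > import.sql
You could then run the generated file with psql -f import.sql. Server-side COPY reads paths on the database server, so if the files only exist on your client machine, emit psql's \copy commands instead.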
Gl;hf ;)
Peter's answer is probably the way to go, but I wanted to chime in with another approach using COPY FROM PROGRAM. Unfortunately I only have a Linux env, so I can't give you a command/statement that will work for your case, but hopefully this gives you the idea and/or is useful for someone else.
COPY FROM with a file can only support one file at a time, but being a bit clever with a COPY FROM PROGRAM is one way of importing multiple files with one COPY. Here's a trivial example:
sh-5.0$ mkdir /tmp/so_267604/
sh-5.0$ cd /tmp/so_267604/
sh-5.0$ echo -e "cola,colb\naval1,bval1\naval2,bval2" > first.csv
sh-5.0$ cat first.csv
cola,colb
aval1,bval1
aval2,bval2
sh-5.0$ echo -e "cola,colb\naval3,bval3\naval4,bval4" > second.csv
sh-5.0$ tail --quiet -n +2 *.csv
aval1,bval1
aval2,bval2
aval3,bval3
aval4,bval4
sh-5.0$ psql -X testdb
testdb=# create table tt(cola text, colb text);
CREATE TABLE
testdb=# copy tt from program 'tail --quiet -n +2 /tmp/so_267604/*.csv' csv;
COPY 4
testdb=# select * from tt;
cola | colb
-------+-------
aval1 | bval1
aval2 | bval2
aval3 | bval3
aval4 | bval4
(4 rows)
Note that there's no HEADER modifier on the copy statement; this is because -n +2 on the tail command will always start at line #2 of each file it outputs. This should not break anything: the HEADER directive only ever tells pg to skip reading one line; the actual data is always imported using the column order specified in the COPY statement, or the table's column order if the COPY did not specify one.