I'm trying to import a large CSV file into MySQL. Unfortunately, the data within the file is separated by both spaces and tabs.
As a result, whenever I load the data into my table, I end up with countless empty cells (because MySQL only recognizes one field separator). Modifying the data before importing it is not an option, as I'm working with something like 400 million rows.
Here is an example of the data:
# 1574 1 1 1
$ 1587 6 6 2
$115 1878 8 9 23
(The second and third values of every row are separated by a tab.)
Any ideas?
3 Answers
If you're on *nix, check out sed, awk, grep, split, and related tools (or even vi). As mentioned, perl could do it, and so could Python or PHP (or C or Java or ...). This looks more like a task for programming tools than for a database. That's not to say you couldn't do this using PL/SQL or T-SQL, but sometimes the most suitable tool is not within the database server (and in this case, certainly not within MySQL).
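As a rough illustration of the "use a programming tool" suggestion, here is a minimal Python sketch that streams a file line by line and collapses every run of spaces or tabs into one delimiter. The function names and the file names in the usage note are hypothetical, and the sketch assumes that no field contains the chosen delimiter or embedded whitespace.

```python
import re

# Matches one or more consecutive spaces or tabs (never the newline).
SEPARATOR_RUN = re.compile(r"[ \t]+")


def normalize_line(line, delimiter=","):
    """Collapse each run of spaces/tabs in one line into a single delimiter."""
    # Drop the line ending and any leading/trailing separators first,
    # so they do not turn into empty first or last fields.
    body = line.rstrip("\r\n").strip(" \t")
    return SEPARATOR_RUN.sub(delimiter, body)


def normalize_file(src_path, dst_path, delimiter=","):
    """Rewrite src_path into dst_path one line at a time.

    Only the current line is held in memory, so the file size
    does not affect memory use.
    """
    with open(src_path, "r", newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        for line in src:
            dst.write(normalize_line(line, delimiter) + "\n")
```

Calling normalize_file("raw.txt", "clean.csv") (both names made up for this example) would write a copy with exactly one comma between fields, which can then be loaded with a single field separator.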
If you created the CSV from an Excel spreadsheet, I know Microsoft Office is notorious for not pasting/exporting correctly. Try pasting as plain text, and then exporting as CSV. Then, import that into MySQL as tables. Hope this helps, but I don't know enough about where you got the CSV from.
If he's managing 400 million rows in Excel then he has bigger problems than delimiters. – mdahlman, Dec 8, 2013 at 21:51
On *nix, you can use perl to replace the whitespace between fields with any delimiter you want (I used a comma in this example).
File:
Command: perl -wnlpi -e 's/\s+/,/g;' text.txt
Result:
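perl -i -pe 's/[ \t]+/|/g' your_file_name
for example, would transform every run of spaces or tabs in the file into a single pipe (|)
character, which seems like a pretty straightforward solution, because then you'd have a single delimiter for MySQL to work with. (Matching [ \t]+ rather than \s leaves the newline at the end of each line intact and avoids empty fields where a space and a tab are adjacent.)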
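To see why the substitution pattern matters, here is a small Python sketch comparing a per-character replacement with a per-run replacement. The sample row is hypothetical: it assumes a space immediately followed by a tab between two of the values.

```python
import re

# Hypothetical row: a space followed by a tab sits between "1587" and "6".
row = "$ 1587 \t6 6 2"

# Replacing each whitespace character on its own leaves an empty field
# wherever two separators are adjacent.
per_char = re.sub(r"[ \t]", "|", row)

# Replacing each run of whitespace yields exactly one delimiter per gap.
per_run = re.sub(r"[ \t]+", "|", row)

print(per_char.split("|"))  # ['$', '1587', '', '6', '6', '2']
print(per_run.split("|"))   # ['$', '1587', '6', '6', '2']
```

The empty string in the first result is the kind of empty cell described in the question; collapsing runs removes it.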