So I ran into an issue where a client sent me a sample file and the line breaks were not preserved when reading the file using PHP on Linux. So I wrote up a little function that should hopefully fix the issue on all platforms.
This is from a larger File class. The function names are pretty self-explanatory.
public function fixLineBreaks() {
    $this->openFile();
    // replace all placeholder with proper \r\n
    $contents = str_replace("[%LINEBREAK%]", "\r\n",
        // replace all double place holder with single
        // in case both \r and \n was used.
        str_replace("[%LINEBREAK%][%LINEBREAK%]", "[%LINEBREAK%]",
            // replace all \n with place holder
            str_replace("\n", "[%LINEBREAK%]",
                // replace all \r with placeholder
                str_replace("\r", "[%LINEBREAK%]",
                    // First, read the file.
                    fread($this->fp, filesize($this->filename))
                )
            )
        )
    );
    $this->closeFile();
    $this->writeOpen();
    fwrite($this->fp, $contents);
    $this->closeFile();
}
EDIT
Okay, to clarify: I received a CSV file that was supposed to be parsed in PHP. Open it in Notepad and everything is on one line. Opening it with fopen and then reading it with fgetcsv also returns the contents as one line. However, if you copy and paste the contents from Notepad into NetBeans, all the line breaks are there. This made me realize that there are line breaks, but they're not the ones that Notepad or PHP recognizes.
This is the function I wrote, which essentially rewrites the file with proper line breaks.
I'm posting this here to see if anyone possibly has a better solution.
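To check which line breaks a file really contains, counting them is a quick diagnostic. The following is only a rough sketch; countLineEndings is a made-up helper, not part of the File class, and it assumes the file contents are already in a string.

```php
<?php
// Hypothetical helper: counts each kind of line ending in a string.
function countLineEndings(string $text): array
{
    // Every "\r\n" pair contains one "\r" and one "\n",
    // so subtract the pairs to get the lone characters.
    $crlf = substr_count($text, "\r\n");

    return array(
        'crlf' => $crlf,
        'cr'   => substr_count($text, "\r") - $crlf,
        'lf'   => substr_count($text, "\n") - $crlf,
    );
}
```

A file that shows up as one line in Notepad but has a non-zero 'cr' count and zero 'crlf' would point to old Mac-style line endings.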
1 Answer
This conversion can be done by simple apps from the Dos2Unix package.
However, if you insist on writing your own converter, I'd suggest making fewer function calls. The function str_replace() can accept arrays, so there is no need for embedded calls.
$contents = str_replace(
    array("\r", "\n", "[%LINEBREAK%][%LINEBREAK%]", "[%LINEBREAK%]"),
    array("[%LINEBREAK%]", "[%LINEBREAK%]", "[%LINEBREAK%]", "\r\n"),
    fread($this->fp, filesize($this->filename))
);
However, you ought to be careful. For example, if your file has two Unix-standard newlines, i.e., \n\n, then your code will replace this with \r\n, which is only one Windows-standard newline.
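To see the blank-line problem concretely, here is a small sketch that applies the same placeholder replacement to an in-memory string instead of a file. The function name placeholderConvert and the sample input are made up for illustration.

```php
<?php
// Hypothetical wrapper around the placeholder-based replacement,
// working on a string so it can be tried without a file.
function placeholderConvert(string $text): string
{
    // str_replace() applies the array pairs one after another, in order.
    return str_replace(
        array("\r", "\n", "[%LINEBREAK%][%LINEBREAK%]", "[%LINEBREAK%]"),
        array("[%LINEBREAK%]", "[%LINEBREAK%]", "[%LINEBREAK%]", "\r\n"),
        $text
    );
}

// Two Unix lines separated by one blank line.
$input  = "first\n\nthird";
$output = placeholderConvert($input);

// bool(true): the blank line has been collapsed away.
var_dump($output === "first\r\nthird");
```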
This regex would probably do the trick. It replaces only a \n that is not preceded by \r, or a \r that is not followed by \n, so existing \r\n pairs stay intact:
$contents = preg_replace('#(?<!\r)\n|\r(?!\n)#', "\r\n", fread($this->fp, filesize($this->filename)));
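Wrapped in a small helper, the regex approach could look like the sketch below. The name toCrlf is hypothetical, and the null check is my own addition, since preg_replace() returns null on failure.

```php
<?php
// Hypothetical helper: normalizes lone \r and lone \n to \r\n.
function toCrlf(string $text): string
{
    // (?<!\r)\n  matches a \n that is not preceded by \r
    // \r(?!\n)   matches a \r that is not followed by \n
    $result = preg_replace('#(?<!\r)\n|\r(?!\n)#', "\r\n", $text);

    if ($result === null) {
        throw new RuntimeException('preg_replace() failed: ' . preg_last_error());
    }

    return $result;
}

// Mixed input: a blank Unix line, an old Mac-style \r, and a Windows \r\n.
$mixed = "a\n\nb\rc\r\nd";

// bool(true): the blank line survives and the existing \r\n is untouched.
var_dump(toCrlf($mixed) === "a\r\n\r\nb\r\nc\r\nd");
```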
But I still think it's better to use already existing and proven software, like the one from the beginning of this answer, or simply turn on the auto_detect_line_endings setting, as suggested in the comments by 200_success.
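For completeness, this is roughly how the auto_detect_line_endings route might look. Treat it as a sketch for older setups: the setting has been deprecated since PHP 8.1, which is why the ini_set() call is silenced here. The function name readCsvWithAutoDetect is made up, and the explicit delimiter, enclosure and escape arguments are simply the traditional defaults.

```php
<?php
// Hypothetical helper: reads a CSV file whose lines may end in a lone \r.
function readCsvWithAutoDetect(string $path): array
{
    // Must be enabled before the file is opened.
    // Deprecated since PHP 8.1, hence the @ to silence the notice.
    @ini_set('auto_detect_line_endings', '1');

    $fp = fopen($path, 'r');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $rows = array();
    // Length 0 means "no line length limit".
    while (($row = fgetcsv($fp, 0, ',', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($fp);

    return $rows;
}
```

This leaves the file itself untouched, which may or may not be what you want.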
By the way, if your files are huge, your fread will still (try to) read them whole at once, which is not pleasant for the computer memory and performance.
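One way around that is to convert the file chunk by chunk. The sketch below is an assumption on my part, not something from your class: convertStreamToCrlf is a made-up name, it takes two already opened stream handles, and the chunk size is arbitrary. The only subtle part is a \r at the very end of a chunk, which is held back because its \n might be the first byte of the next chunk.

```php
<?php
// Hypothetical helper: copies $in to $out, normalizing line endings to \r\n
// without loading the whole file into memory.
function convertStreamToCrlf($in, $out, int $chunkSize = 8192): void
{
    $carry = '';

    while (!feof($in)) {
        $chunk = fread($in, $chunkSize);
        if ($chunk === false) {
            throw new RuntimeException('Read failed');
        }

        // Prepend a \r held back from the previous chunk, if any.
        $chunk = $carry . $chunk;
        $carry = '';

        // A trailing \r may be the first half of a \r\n pair, so defer it.
        if ($chunk !== '' && substr($chunk, -1) === "\r") {
            $carry = "\r";
            $chunk = (string) substr($chunk, 0, -1);
        }

        fwrite($out, (string) preg_replace('#(?<!\r)\n|\r(?!\n)#', "\r\n", $chunk));
    }

    // A \r still held back at end of file was a lone line break.
    if ($carry !== '') {
        fwrite($out, "\r\n");
    }
}
```

Called with one handle opened for reading and another for writing (a separate output file, not the input itself), this never holds more than about one chunk in memory.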
auto_detect_line_endings setting?