I have this simple function that reads a file, replaces all occurrences of a string, and writes the result back to the file:
def fileSearchReplace(targetFile, search, replace):
    with open(targetFile, 'r') as openFile:
        filedata = openFile.read()
    newData = filedata.replace(search, replace)
    with open(targetFile, 'w') as openFile:
        openFile.write(newData)
I was wondering if there is a better implementation?
2 Answers
Personally, I would do this kind of thing line by line. I don't like the idea of putting my entire file into memory, then overwriting it. What if something goes wrong in between? Basically I don't trust myself to write good code!
So, looping through line by line, I would write to a new file with any changes. In Python, you can open two files at once, which is perfect for reading from one and writing to another.
Applying PEP 8 for naming, you'd end up with something like this:
def file_search_replace(open_file, search, replacement):
    with open(open_file, 'r') as read_file, open('new_file.txt', 'w+') as write_file:
        for line in read_file:
            write_file.write(line.replace(search, replacement))
I've changed a few of the names, notably `replace` to `replacement`, so it's clear it's not the same as `replace()`.
Finally, if you're happy with the result, you can easily rename the file to something else. Personally I'd keep the old one just in case.
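The write-then-rename idea above can be sketched as follows. This is a minimal sketch, not the answerer's code: the function name `replace_in_file`, the `keep_backup` flag, and the `.bak` suffix are illustrative assumptions. It assumes the temporary file can be created in the same directory as the target, so the final `os.replace` stays on one filesystem.

```python
import os
import shutil
import tempfile


def replace_in_file(path, search, replacement, keep_backup=True):
    """Replace text line by line via a temporary file, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original (assumed writable directory).
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'w') as write_file, open(path, 'r') as read_file:
            for line in read_file:
                write_file.write(line.replace(search, replacement))
        # mkstemp creates a private file, so copy the original permission bits.
        shutil.copymode(path, tmp_path)
        if keep_backup:
            # Hypothetical backup name: the original path plus '.bak'.
            shutil.copy2(path, path + '.bak')
        # Swap the finished copy over the original in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Because the original is only replaced after the new copy is fully written, a failure partway through should leave the original file untouched. Writing to a uniquely named temporary file also avoids a clash between the input and output names.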
This doesn't work without backup (when open_file == 'new_file.txt'). – sheldonzy, commented May 15, 2019 at 15:49
You could consider using `fileinput.FileInput` to perform in-place substitutions in a file. The accepted answer to the linked Stack Overflow question illustrates a simple solution.
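As a rough illustration of the in-place approach (not the code from the linked answer), a sketch using `fileinput.FileInput` might look like this. The function name `replace_in_place` and the `.bak` backup suffix are assumptions made for the example.

```python
import fileinput


def replace_in_place(path, search, replacement):
    """Rewrite `path` in place, one line at a time."""
    # With inplace=True, standard output is redirected into the file while
    # iterating, so print() writes each modified line back to it.
    # backup='.bak' keeps a copy of the original alongside it.
    with fileinput.FileInput(path, inplace=True, backup='.bak') as lines:
        for line in lines:
            # Lines keep their own newline, so suppress print's extra one.
            print(line.replace(search, replacement), end='')
```

Since each line is processed separately, this only finds matches that sit entirely within one line.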
If you need to perform replacements across multiple lines, you will likely want to read all of the data into a variable (if the file size allows it), then use the `re.sub` function with the `re.DOTALL` flag, which makes `.` in your regular expression match newlines as well.
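A minimal sketch of the multi-line case, assuming the file fits in memory. The function name `replace_block` and the `BEGIN`/`END` markers in the usage line are hypothetical.

```python
import re


def replace_block(path, pattern, replacement):
    """Read the whole file, apply one regex substitution, and write it back."""
    with open(path, 'r') as read_file:
        data = read_file.read()
    # re.DOTALL lets '.' match newline characters, so a pattern can span lines.
    new_data = re.sub(pattern, replacement, data, flags=re.DOTALL)
    with open(path, 'w') as write_file:
        write_file.write(new_data)
    return new_data
```

For example, `replace_block('notes.txt', r'BEGIN.*?END', 'REMOVED')` would replace everything from the first `BEGIN` to the nearest following `END`, even across line breaks. If the search text is a literal string rather than a pattern, pass it through `re.escape` first.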
Thank you for your edit :) This is an example of what the second would look like: `out.write(re.sub(re.escape(search), replace, in_.read()))`. Changing `in_` to an `mmap` will also mean the entire file isn't read into memory. – commented May 15, 2019 at 20:48