I'm trying to make a subprocess call from my Python script that replaces the carriage return and newline characters in a file with spaces, and then saves the result back to the same file. I have verified that this works:
cat file.txt | tr '\r\n' ' ' > file.txt
and so I tried to do the same thing in Python. My call looks like this:
formatCommand = "cat " + fileName + " | tr '\\r\\n' ' ' > " + fileName
print(formatCommand) #this showed me that the command above is being passed
subprocess.call(formatCommand, shell=True)
Rather than having its newlines deleted as I expect, the file ends up empty.
I consulted this post about a similar problem, but the solution there was to use shell=True, which I already do, and the redirect makes a Popen call more complicated. Furthermore, I don't see why it doesn't work with shell=True.
1 Answer
There's a race condition in your shell command. The first command in your pipeline is cat file.txt; the second is tr '\r\n' ' ' > file.txt. The two commands run in parallel. The first reads from file.txt; the second truncates file.txt and then writes to it. If the truncation happens before the first command reads from the file, the file will be empty.
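One way to sidestep the race is to skip the shell pipeline and do the replacement in Python, reading the whole file before reopening it for writing. This is only a minimal sketch: the helper name replace_newlines_in_place is made up for illustration, and it assumes the file is small enough to hold in memory.

```python
import os
import tempfile


def replace_newlines_in_place(file_name):
    """Hypothetical helper: turn every CR and LF in file_name into a space."""
    # Read everything first. The file is only truncated when it is reopened
    # for writing below, so the read can no longer lose the race.
    with open(file_name, "rb") as f:
        data = f.read()
    # Intended effect of tr '\r\n' ' ': each CR and each LF becomes one space.
    data = data.replace(b"\r", b" ").replace(b"\n", b" ")
    with open(file_name, "wb") as f:
        f.write(data)


# Small demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "file.txt")
    with open(path, "wb") as f:
        f.write(b"a\r\nb\nc")
    replace_newlines_in_place(path)
    with open(path, "rb") as f:
        result = f.read()

print(result)
```

If you would rather keep tr, the same idea applies: have the command write to a different file and rename it over the original afterwards, so nothing truncates the input while it is still being read.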
2 Comments
> file.txt is done by the shell before the tr command is even run. I don't know whether cat may ever see a non-empty file.txt here.
The tr command, including its redirections, is done in parallel to the cat command. So it's possible for the cat command to output something, or even finish if the file is smaller than the pipe buffer, before the tr redirections occur. For example, echo test > foo.txt; cat foo.txt | tr t T $(sleep 1) > foo.txt; cat foo.txt will probably print TesT.
Your tr invocation is wrong, both from the shell and from Python; it has nothing to do with your use of subprocess. Consider str.format or % rather than concatenation, especially when you have quotes like this all over the place. And raw strings might help too. For example, isn't this more readable (and more obviously right)? r"cat {} | tr '\r\n' ' ' > {}".format(fileName, fileName)
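To make the formatting suggestion concrete, here is a rough sketch comparing the concatenated string from the question with a raw-string version. The helper build_format_command is hypothetical, and the shlex.quote call is an extra not mentioned in the thread, added on the assumption that file names may contain spaces. Note that this only tidies the quoting: the resulting command still redirects onto its own input file.

```python
import shlex


def build_format_command(file_name):
    """Hypothetical helper: build the question's pipeline with a raw string."""
    # shlex.quote is an addition here; it protects names with spaces or quotes.
    quoted = shlex.quote(file_name)
    # In a raw string, \r and \n stay as backslash sequences for tr to interpret.
    return r"cat {0} | tr '\r\n' ' ' > {0}".format(quoted)


# The concatenated form from the question, for comparison.
old_style = "cat " + "file.txt" + " | tr '\\r\\n' ' ' > " + "file.txt"
new_style = build_format_command("file.txt")

print(old_style == new_style)
print(build_format_command("my file.txt"))
```

For a plain name like file.txt both forms produce the same command text; the difference shows up only for names that need quoting.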