Arguments:
- The directory in which you would like to change file extensions
- Old file extension
- And what the new extension should be
Example:
$ touch /tmp/spam.eggs
$ python change-ext.py /tmp .eggs .spam
$ ls /tmp
$ spam.spam
My Concerns.
- Is the code easy to read and comprehend?
- Is the code Pythonic?
- Does the code have any BUGS?
- Should i have used
argparse
instead ofsys.argv
'''
Batch renames file's extension in a given directory
'''
import os
import sys
from os.path import join
from os.path import splitext
def main():
try:
work_dir, old_ext, new_ext = sys.argv[1:]
except ValueError:
sys.exit("Usage: {} directory old-ext new-ext".format(__file__))
for filename in os.listdir(work_dir):
if old_ext == splitext(filename)[1]:
newfile = filename.replace(old_ext, new_ext)
os.rename(join(work_dir, filename), join(work_dir, newfile))
if __name__ == '__main__':
main()
1 Answer 1
Questions
Is the code easy to read and comprehend?
Import both join
and splitext
on the same line, and it'll be perfect.
Is the code Pythonic?
Yes... But is it good?
It's doesn't follow *nix too much. And sys.exit
is just wrong.
When using sys.exit
you pass an error status, not a message.
The status should be a number from 0-127, where 0 is successful.
And all the other numbers are an error of sorts.
Also you can set up your terminal to change colour or add words if you get an error response. So when sys.exit(1)
is called my prompt goes from green to red.
Does the code have any BUGS?
Yes:
>>> '.txt_file.txt'.replace('.txt', '.spam')
'.spam_file.spam'
Should i have used
argparse
instead ofsys.argv
I'd say no, as this is a very small script.
Otherwise, I'd say yes.
If you need flags and such you should use argparse
, even if the program is very small. So if you think about adding a --verbose
flag, you should.
Code review
As noted above, your usage for sys.exit
is wrong.
To amend this, I would change your sys.exit
to a standard raise
.
And then catch it, print the message, and exit with status 1.
class InputError(Exception):pass
def main():
...
except ValueError:
raise InputError("Usage: {} directory old-ext new-ext".format(__file__))
...
if __name__ == '__main__':
try:
main()
except InputError as e:
print(str(e))
sys.exit(1)
To amend the bug, I would use a completely different approach. A more *nix and Python approach.
*nix encourages regex's, and Python has str.format
, but you can use the both of them!
They are both micro-languages but regex's are for pattern matching where Python's format
is for output formatting.
I would argue having two dedicated micro-languages, is far better than anything we manually program.
import sys
import re
import os
from os.path import join
class InputError(Exception):pass
def main():
try:
work_dir, regex, format_ = sys.argv[1:]
except ValueError:
raise InputError("Usage: {} directory regex format".format(__file__))
regex = re.compile(regex)
for filename in os.listdir(work_dir):
if regex.match(filename) is not None:
newfile = format_.format(*regex.findall(filename)[0])
os.rename(join(work_dir, filename), join(work_dir, newfile))
print('Replaced {} with {}'.format(filename, newfile))
if __name__ == '__main__':
try:
main()
except InputError as e:
print(str(e))
sys.exit(1)
This is as complex as your previous code. However you can now do things like:
$ mkdir temp $ touch temp/txt.boo $ python change.py temp "(.*)\.(txt|boo)" ".{1}_file.{0}" Replaced txt.boo with .boo_file.txt $ ls temp .boo_file.txt $ python change.py temp "(.*)\.(txt|boo)" ".{1}_file.{0}" Replaced .boo_file.txt with .txt_file..boo_file $ ls temp .txt_file..boo_file $ python change.py temp Usage: change.py directory regex format err$
As I was asked
Wouldn't the fnmatch module be better suited for matching a pattern in a filename then a regular expression.
I thought I should make a pro's and con's for each method:
Regex
Pros
Well know micro-language
Allows an extremely large amount of customisation to the query
Easy to implement
Cons
- Larger input
glob / fnmatch
Pros
Simple input
Easy to implement
Cons
No capture groups
Good luck with substitution
'{0[:-4]}.boo'.format('hello.txt')
Some inputs are impossible to have reliable output.
*file*
glob based regex
So as the main fallback is that you can't have capture-groups in glob,
you can add them.
E.g:
re.compile(fnmatch.translate().replace(r'\(', '(').replace(r'\)', ')'))
Pros
Easy for people who don't know regex
Smaller input, but not as small is glob
Cons
Hard to implement well, without becoming regex where
.*
is substituted with*
.inp = '\(*\).txt' # Matches r'\\(.*\\)\.txt'. fnmatch.translate(inp).replace(r'\(', '(').replace(r'\)', ')'))
However if we were to use
re.sub
.inp = '(*).txt' # Matches r'(.*).txtZ' re.sub(r'\\(.)', lambda m: m.group(1), fnmatch.translate(inp)) inp = r'(*)\.txt' # Might as well be regex. re.sub(r'\\(.)', lambda m: m.group(1), fnmatch.translate(inp + '\\'))
And then there is the edge-case for
'[a-z]*'
. As the docs say:For a literal match, wrap the meta-characters in brackets. For example,
'[?]'
matches the character'?'
.But that make
'[a-z][*]'
turn into'[a-z][*]'
not'[a-z]*'
.
Conclusion
I chose re
as more people know it.
You can extract information out of it with ease.
And is super easy to implement.
If you wanted to use glob & regex, you might as well use regex.
And glob is kinda broken:
>>> '{0[:-4]}.boo'.format('hello.txt')
TypeError: string indices must be integers
>>> '{0[-4]}.boo'.format('hello.txt')
TypeError: string indices must be integers
-
\$\begingroup\$ Wouldn't the fnmatch module be better suited for matching a pattern in a filename then a regular expression. \$\endgroup\$Ricky Wilson– Ricky Wilson2015年11月20日 22:51:09 +00:00Commented Nov 20, 2015 at 22:51
-
\$\begingroup\$ @RickyWilson I was going to use
glob
, which usesfnmatch
. However, I didn't find an easy way to justregex.findall
. Also I thinkfnmatch
usesre
, due tofnmatch.translate
. \$\endgroup\$2015年11月20日 22:56:41 +00:00Commented Nov 20, 2015 at 22:56