I've been learning python for some time and I've come up with the idea to practice it a little. I created a script that removes files by their extension .html and .srt and also empty directories. It works good, but I've got a feeling that it could be improved. Can someone help me improve the code? Also I know it is not a good practice to use for-loop inside another for-loop. Files structure:
super secret folder
|___empty folder
|___folder1
|___file01.jpg
|___file02.srt
|___file03.html
|___folder2
|___file01.jpg
|___file02.srt
|___file03.html
|___folder3
|___file01.jpg
|___file02.srt
|___file03.html
import os
mypath = "C:\\Users\\user\\super secret folder"
def extract_ext(file_path):
split_tuple = os.path.splitext(file_path)
return split_tuple[0], split_tuple[1]
def abs_path(my_file_path, file_name):
return f"{my_file_path}\\{file_name}"
subfolders_list = os.listdir(mypath)
subfolders_paths = [os.path.join(mypath, path) for path in subfolders_list]
for folder_path in subfolders_paths:
files_list = os.listdir(folder_path)
if len(files_list) != 0:
for file_name in files_list:
file_abs_path = abs_path(folder_path, file_name)
_, file_extension = extract_ext(file_abs_path)
if file_extension == ".html" or file_extension == ".srt":
os.remove(file_abs_path)
else:
os.rmdir(folder_path)
2 Answers 2
Your code can be simplified through the usage of os.walk()
and pathlib.Path
.
from os import walk
from pathlib import Path
def cleanup(directory: Path, suffixes: set[str]) -> None:
"""Remove unwanted files with the given
suffixes as well as empty directories.
"""
for root, dirs, files in walk(directory):
root = Path(root)
for file in map(lambda filename: root / filename, files):
if file.suffix in suffixes:
file.unlink()
if root.is_dir() and not dirs and not files:
root.rmdir()
def main() -> None:
cleanup(Path('folder'), {'.html', '.srt'})
if __name__ == '__main__':
main()
This way it is more naturally readable, what the code does plus you have directory recursion.
By using a separate function (cleanup()
), you can also re-use the code with other directories or suffix sets.
Before
$ tree folder/
folder/
├── bar
│ ├── file01.jpg
│ ├── file02.html
│ └── file03.srt
├── foo
│ ├── file01.jpg
│ ├── file02.html
│ └── file03.srt
└── spamm
After
$ tree folder/
folder/
├── bar
│ └── file01.jpg
└── foo
└── file01.jpg
2 directories, 2 files
For loops inside for loops are necessary for this use case, you can however decrease indention levels by extracting code blocks into methods:
for folder_path in subfolders_paths:
if not is_empty(folder_path):
remove_unwanted_files(folder_path)
else:
remove_path(folder_path)
Indention can further be decreased by extracting the conditional as well:
for folder_path in subfolders_paths:
remove_empty_and_unwanted(folder_path)
pathlib.Path
overrides the division operator and adds OS specific separators:path1 / path2
="path1Content\path2Content"
on Windows and"path1Content/path2Content"
on POSIX \$\endgroup\$