Return to Answer

note default value of topdown

Source Link

edited Oct 26, 2016 at 8:59

Gareth Rees

edited Oct 26, 2016 at 8:59

Gareth Rees

50.1k
3
130
210

1. Review

There's quite a lot of code here, so I'm going to look at just one part of it, namely the function pic_files.

This function has two tasks: first, it has to find files that might contain images, and second, it has to call the function new_pic for each image. This makes it hard to test the code (how can you be sure that it found the right set of files?) and hard to reuse in other programs (because you'd probably have to change it to call some other function).

It would be better to split these tasks into two functions, then you could reuse one of them. What you want is to have one function that generates the filenames, and another that calls new_pic. See the revised code below for how this can be done.

The function calls os.walk twice, first to get the files:
```
 files = next(os.walk(Dir))[2]
```

and second to get the directories:

 dirs = next(os.walk(Dir))[1]

It would be better to call it just once and remember the result. You can use tuple unpacking to avoid the [2] and [1]:

 _, dirs, files = next(os.walk(dir))

It would be a good idea if the list of ignored directories were an argument to the function. This would make the function easier to reuse, because you could pass different lists of ignored directories in different circumstances.
This function only uses the first result generated by os.walk, and implements its own recursion over the directory tree. I guess that's because of the need to ignore some of the directories. But in fact os.walk already has built-in support for pruning the directory tree. The documentation says:

When topdown is True [which it is by default — GDR], the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search [or] impose a specific order of visiting.

2. Revised code

def iterfiles(top, ignored=()):
 """Generate files below the top directory in sorted order, ignoring
 directories with names in the ignored iterable.
 """
 ignored = set(ignored)
 for root, dirs, files in os.walk(top):
 for f in sorted(files):
 yield os.path.join(root, f)
 dirs[:] = sorted(set(dirs) - ignored)
def pic_files(top):
 """Call new_pic for each file below the top directory."""
 for f in iterfiles(top, ignored=ignore):
 new_pic(f)

1. Review

There's quite a lot of code here, so I'm going to look at just one part of it, namely the function pic_files.

This function has two tasks: first, it has to find files that might contain images, and second, it has to call the function new_pic for each image. This makes it hard to test the code (how can you be sure that it found the right set of files?) and hard to reuse in other programs (because you'd probably have to change it to call some other function).

The function calls os.walk twice, first to get the files:
```
 files = next(os.walk(Dir))[2]
```

and second to get the directories:

 dirs = next(os.walk(Dir))[1]

It would be better to call it just once and remember the result. You can use tuple unpacking to avoid the [2] and [1]:

 _, dirs, files = next(os.walk(dir))

It would be a good idea if the list of ignored directories were an argument to the function. This would make the function easier to reuse, because you could pass different lists of ignored directories in different circumstances.
This function only uses the first result generated by os.walk, and implements its own recursion over the directory tree. I guess that's because of the need to ignore some of the directories. But in fact os.walk already has built-in support for pruning the directory tree. The documentation says:

When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search [or] impose a specific order of visiting.

2. Revised code

def iterfiles(top, ignored=()):
 """Generate files below the top directory in sorted order, ignoring
 directories with names in the ignored iterable.
 """
 ignored = set(ignored)
 for root, dirs, files in os.walk(top):
 for f in sorted(files):
 yield os.path.join(root, f)
 dirs[:] = sorted(set(dirs) - ignored)
def pic_files(top):
 """Call new_pic for each file below the top directory."""
 for f in iterfiles(top, ignored=ignore):
 new_pic(f)

1. Review

There's quite a lot of code here, so I'm going to look at just one part of it, namely the function pic_files.

This function has two tasks: first, it has to find files that might contain images, and second, it has to call the function new_pic for each image. This makes it hard to test the code (how can you be sure that it found the right set of files?) and hard to reuse in other programs (because you'd probably have to change it to call some other function).

The function calls os.walk twice, first to get the files:
```
 files = next(os.walk(Dir))[2]
```

and second to get the directories:

 dirs = next(os.walk(Dir))[1]

It would be better to call it just once and remember the result. You can use tuple unpacking to avoid the [2] and [1]:

 _, dirs, files = next(os.walk(dir))

It would be a good idea if the list of ignored directories were an argument to the function. This would make the function easier to reuse, because you could pass different lists of ignored directories in different circumstances.
This function only uses the first result generated by os.walk, and implements its own recursion over the directory tree. I guess that's because of the need to ignore some of the directories. But in fact os.walk already has built-in support for pruning the directory tree. The documentation says:

When topdown is True [which it is by default — GDR], the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search [or] impose a specific order of visiting.

2. Revised code

def iterfiles(top, ignored=()):
 """Generate files below the top directory in sorted order, ignoring
 directories with names in the ignored iterable.
 """
 ignored = set(ignored)
 for root, dirs, files in os.walk(top):
 for f in sorted(files):
 yield os.path.join(root, f)
 dirs[:] = sorted(set(dirs) - ignored)
def pic_files(top):
 """Call new_pic for each file below the top directory."""
 for f in iterfiles(top, ignored=ignore):
 new_pic(f)

Source Link

answered Oct 17, 2016 at 18:09

Gareth Rees

answered Oct 17, 2016 at 18:09

Gareth Rees

50.1k
3
130
210

1. Review

There's quite a lot of code here, so I'm going to look at just one part of it, namely the function pic_files.

This function has two tasks: first, it has to find files that might contain images, and second, it has to call the function new_pic for each image. This makes it hard to test the code (how can you be sure that it found the right set of files?) and hard to reuse in other programs (because you'd probably have to change it to call some other function).

The function calls os.walk twice, first to get the files:
```
 files = next(os.walk(Dir))[2]
```

and second to get the directories:

 dirs = next(os.walk(Dir))[1]

It would be better to call it just once and remember the result. You can use tuple unpacking to avoid the [2] and [1]:

 _, dirs, files = next(os.walk(dir))

It would be a good idea if the list of ignored directories were an argument to the function. This would make the function easier to reuse, because you could pass different lists of ignored directories in different circumstances.
This function only uses the first result generated by os.walk, and implements its own recursion over the directory tree. I guess that's because of the need to ignore some of the directories. But in fact os.walk already has built-in support for pruning the directory tree. The documentation says:

When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search [or] impose a specific order of visiting.

2. Revised code

def iterfiles(top, ignored=()):
 """Generate files below the top directory in sorted order, ignoring
 directories with names in the ignored iterable.
 """
 ignored = set(ignored)
 for root, dirs, files in os.walk(top):
 for f in sorted(files):
 yield os.path.join(root, f)
 dirs[:] = sorted(set(dirs) - ignored)
def pic_files(top):
 """Call new_pic for each file below the top directory."""
 for f in iterfiles(top, ignored=ignore):
 new_pic(f)

lang-py