I have a script that scans a local filesystem and writes a list of the world-writable files it finds to an output file.
I ran it on a server with a couple of terabytes of storage, of which only 195 GB are actually used. I'm not sure how many files there are, though.
The script takes 23 minutes to run. Is there something I can do to make it run faster?
#!/usr/bin/perl
use warnings;
use strict;
use Fcntl ':mode';
use File::Find;
no warnings 'File::Find';
no warnings 'uninitialized';
my $dir = "/var/log/tivoli/";
my $mtab = "/etc/mtab";
my $permFile = "world_writable_w_files.txt";
my $tmpFile = "world_writable_files.tmp";
my $exclude = "/usr/local/etc/world_writable_excludes.txt";
my $root = "/";
my (%excludes, %devNums);
my ($regExcld, $errHeader);
# Create an array of the file stats for "/"
my @rootStats = stat($root);
# Compile a list of mountpoints that need to be scanned
my @mounts;
open MT, "<${mtab}" or die "Cannot open ${mtab}, $!";
# We only want the local mountpoints
while (<MT>) {
  if ($_ =~ /ext[34]/) {
    my @line = split;
    push(@mounts, $line[1]);
  }
}
close MT;
# Build a hash of each mountpoint's device number for future comparison
foreach (@mounts) {
  my @stats = stat($_);
  $devNums{$stats[0]} = $_;
}
# Build a hash from /usr/local/etc/world_writable_excludes.txt
if ((! -e $exclude) || (-z $exclude)) {
  $errHeader = <<HEADER;
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!                                                  !!
!! /usr/local/etc/world_writable_excludes.txt is    !!
!! missing or empty. This report includes every     !!
!! world-writable file, including those which are   !!
!! expected and should be excluded.                 !!
!!                                                  !!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
HEADER
} else {
  open XCLD, "<${exclude}" or die "Cannot open ${exclude}, $!\n";
  while (<XCLD>) {
    chomp;
    $excludes{$_} = 1;
  }
  close XCLD;
}
sub wanted {
  # Is it excluded from the report...
  return if (exists $excludes{$File::Find::name});
  # ...in a basic directory, ...
  return if $File::Find::dir =~ /sys|proc|dev/;
  # ...a regular file, ...
  return unless -f;
  # ...local, ...
  my @dirStats = stat($File::Find::name);
  return if (exists $devNums{$dirStats[0]});
  # ...and world writable?
  return unless (((stat)[2] & S_IWUSR) && ((stat)[2] & S_IWGRP) && ((stat)[2] & S_IWOTH));
  # If so, add the file to the list of world writable files
  print(WWFILE "$File::Find::name\n");
}
# Create the output file path if it doesn't already exist.
mkdir($dir) or die "Cannot execute mkdir on ${dir}, $!" unless (-d $dir);
# Create our filehandle for writing our findings
open WWFILE, ">${dir}${tmpFile}" or die "Cannot open ${dir}${tmpFile}, $!";
print(WWFILE "${errHeader}") if ($errHeader);
find(\&wanted, @mounts);
close WWFILE;
# If no world-writable files have been found ${tmpFile} should be zero-size;
# Delete it so Tivoli won't alert
if (-z "${dir}${tmpFile}") {
  unlink "${dir}${tmpFile}";
} else {
  rename("${dir}${tmpFile}", "${dir}${permFile}") or die "Cannot rename file ${dir}${tmpFile}, $!";
}
1 Answer
Here are my suggestions:
Your hash %excludes would use less memory if you set the values to undef instead of 1. Since the lookups only use exists, the values themselves are never examined.
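A minimal sketch of that change (same keys as in the script, only the stored value differs):

while (<XCLD>) {
  chomp;
  $excludes{$_} = undef;   # key presence is all that matters
}
# ...
return if (exists $excludes{$File::Find::name});   # exists ignores the value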
You could set $File::Find::prune = 1 in those cases where a whole directory is to be excluded. This short-circuits the search of that directory (see the revised subroutine below, and the comments that follow it).
You could also reuse the cached data of stat() afterwards, via the special _ filehandle (e.g. -f _).
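For instance, after one explicit stat() the buffered result can be reused, avoiding repeated system calls:

my @dirStats = stat($File::Find::name);   # one stat() system call
return unless -f _;                       # reuses the buffered result
return if -d _;                           # still no extra system call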
The regex should use anchors (faster and more robust).
Here is the revised wanted subroutine (untested).
sub wanted {
  my @dirStats = stat($File::Find::name);
  # Is it excluded from the report...
  if (exists $excludes{$File::Find::name}) {
    $File::Find::prune = 1 if (-d _);
    return;
  }
  # ...in a basic directory, ...
  # Anchored so it only matches these names at the top of the tree.
  if ($File::Find::name =~ m{^/(?:sys|proc|dev)\b}) {
    $File::Find::prune = 1 if (-d _);
    return;
  }
  # ...a regular file, ...
  return unless -f _;
  # ...local, ...
  return if (exists $devNums{$dirStats[0]});
  # ...and world writable?
  my $protection = $dirStats[2];
  my $writemask  = (S_IWUSR | S_IWGRP | S_IWOTH);
  # Note: & binds more loosely than ==, so the masking must be parenthesized.
  return unless (($protection & $writemask) == $writemask);
  # If so, add the file to the list of world writable files
  print(WWFILE "$File::Find::name\n");
}
- RubberDuck (Jun 14, 2014): Welcome to CR. This is a nice first answer.
- theillien (Jun 17, 2014): Can you elaborate on the $File::Find::prune option? This actually came up on our weekly call today. Would the directory be added to the excludes file?
- hexcoder (Jun 17, 2014): @theillien: File::Find::find() does a depth-first search over the directory tree. $File::Find::prune = 1 is used to signal "I am the last leaf at this point in the tree; you may return to the next upper level of the tree."
- hexcoder (Jun 17, 2014): @theillien: (continuation) The tree traversal engine will then stop processing all not-yet-processed files and subdirectories in that directory.
- theillien (Jun 17, 2014): Yup, got that. If I understand it correctly, it removes the section of the file name string following the last /, which leaves the directory in which it resides. How does the script then know if the directory is considered entirely ignorable? Unless I'm not understanding what File::Find::prune does at all.
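To make the pruning behavior concrete, here is a minimal, self-contained sketch (the /tmp/demo paths are made up for illustration). Setting $File::Find::prune while wanted is visiting a directory tells find() not to descend into it at all:

#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical directory to skip entirely; substitute real paths.
my %skip = ('/tmp/demo/cache' => undef);

find(sub {
  if (-d && exists $skip{$File::Find::name}) {
    # Prune: find() will not visit anything below this directory.
    $File::Find::prune = 1;
    return;
  }
  print "$File::Find::name\n";
}, '/tmp/demo');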
Two comments on the question are only partially preserved:

- [truncated] "...the -f and stat calls inside wanted with File::stat equivalents."
- [truncated] "...find / -perm -2 ! -type l -ls."
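A sketch of what the File::stat suggestion could look like inside wanted (untested; File::stat is core Perl and overrides the built-in stat in its scope, returning an object with named accessors):

use File::stat;        # stat()/lstat() now return objects
use Fcntl ':mode';

sub wanted {
  my $st = stat($File::Find::name) or return;   # object, not a 13-element list
  return unless S_ISREG($st->mode);             # regular files only
  return if exists $devNums{$st->dev};          # same device-number test as before
  my $writemask = S_IWUSR | S_IWGRP | S_IWOTH;
  return unless ($st->mode & $writemask) == $writemask;
  print(WWFILE "$File::Find::name\n");
}

The find one-liner, by contrast, avoids Perl entirely: -perm -2 selects files with the other-write bit set, ! -type l skips symlinks, and -ls lists the matches.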