I'm working on a Unix machine where I can't use anything beyond vanilla Perl, and I'm using Perl 5.8. This script exits with status 1 if the current directory is smaller than 1 GB (the character after -d is a literal tab character).
my $du = `du --si | tail -1 | cut -d" " -f1`;
chomp $du;
if (substr($du, -1) ne "G") {
    exit 1;
}
exit 0;
This is gross, but I know the data is in du --si, so I could write it in 30 seconds. Is there a cleaner, more robust way?
I agree with @rolfl that this would be much simpler as a one-line shell pipeline. The -s option to du makes it produce a total. awk is a good tool to use for processing multi-column text.
du -s --si | awk '$1 !~ /G/ { exit 1 }'
However, the --si option seems to be a non-portable GNU extension. A more portable version would look at the number of 512-byte blocks. The magic number 1953125 is \$\dfrac{10^9}{512}\$.
du -s | awk '$1 < 1953125 { exit 1 }'
The second version also works even if the total is in the terabyte or exabyte range.
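One caveat with the block count: POSIX specifies 512-byte units for plain du, but GNU du defaults to 1024-byte units unless POSIXLY_CORRECT is set. A minimal sketch that sidesteps the difference by requesting 1024-byte units explicitly with -k (a POSIX option). The function name at_least_1gb_k is hypothetical, and the threshold 976563 is \$\dfrac{10^9}{1024}\$ rounded up.

```shell
# Hypothetical helper: reads a du summary line on stdin.
# Succeeds (exit 0) when the first field, a size in 1024-byte units,
# amounts to at least 10^9 bytes; fails (exit 1) otherwise.
# 10^9 / 1024 = 976562.5, so the smallest passing count is 976563.
at_least_1gb_k() {
    awk '$1 < 976563 { exit 1 }'
}

# Intended use (not run here, since the result depends on the directory):
#   du -sk . | at_least_1gb_k && echo "at least 1 GB" || echo "less than 1 GB"
```

Like the pipeline above, this only inspects the total, so it still waits for du to finish walking the whole tree.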
There is an inefficiency, though: you should be able to exit early as soon as you find that the total exceeds 1 GB. For that, you would go back to Perl, but with a proper Perl program instead of a wrapper around du.
use File::Find;
use strict;
my $sum = 0;
my %seen_inodes;
find(sub {
    my ($inode, $blocks) = (stat)[1, 12] or die "${File::Find::name}: $!";
    # Do not double-count hard links
    if (!$seen_inodes{$inode}) {
        $seen_inodes{$inode} = 1;
        $sum += 512 * $blocks;
        exit 0 if $sum >= 1_000_000_000;
    }
}, ".");
exit 1;
It is unusual on Code Review to recommend a different approach, but this process can be simplified a great deal, avoiding Perl entirely:
du -s -B 1 | grep -P -q '^\d{10,}+\s.*'
It breaks down as follows:
du -s -B 1
prints a summary (no per-file details) with a block size of one byte, i.e. it prints the number of bytes used by the current directory.
Then grep, with a Perl-compatible regex (-P) and quiet output (-q), returns 0 on a successful match and 1 on no match.
In other words, it checks that the line starts with at least 10 digits, i.e. that the size is >= 1,000,000,000 bytes.
Putting it together, the grep succeeds if the current directory is at least 1 GB.
I tested this with:
du -s -B 1 | grep -P -q '^\d{10,}+\s.*' && echo "Bigger than 1G" || echo "less than 1G"
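The -P option is not part of POSIX grep, so it may be missing on some systems. As a sketch, the same digit-count test can be written as a POSIX extended regex with -E. The function name at_least_1gb_bytes is hypothetical, and the input is assumed to be a byte count followed by whitespace, as du -s -B 1 prints it.

```shell
# Hypothetical helper: reads a du summary line on stdin.
# Succeeds when the line starts with ten or more digits followed by
# whitespace, i.e. a byte count of at least 1,000,000,000.
at_least_1gb_bytes() {
    grep -E -q '^[0-9]{10,}[[:space:]]'
}

# Intended use (not run here, since the result depends on the directory):
#   du -s -B 1 | at_least_1gb_bytes && echo "Bigger than 1G" || echo "less than 1G"
```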
Edit:
This is compatible with your original code, which uses --si on du, which in turn uses 1,000,000,000 bytes to represent a GB. If you want to use GiB (\$2^{30}\$ bytes) then it is actually substantially harder.
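The digit-count trick only works for thresholds that are powers of ten. For a binary threshold, one way around it is a numeric comparison rather than a regex. A minimal sketch, assuming a byte count in the first field on stdin; the function name at_least_bytes and its single argument are hypothetical.

```shell
# Hypothetical helper: at_least_bytes MIN
# Reads a du summary line on stdin and succeeds when the first field
# (a byte count) is at least MIN.
at_least_bytes() {
    # The shell's "$1" (the function argument) is passed in as awk's "min";
    # inside the single quotes, $1 is awk's first input field.
    # Adding 0 forces a numeric comparison on both sides.
    awk -v min="$1" '$1 + 0 < min + 0 { exit 1 }'
}

# Intended use, with 1 GiB = 2^30 = 1073741824 bytes (not run here):
#   du -s -B 1 | at_least_bytes 1073741824 && echo "at least 1 GiB"
```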
Calling du to calculate the full size is OK, as it is not a trivial task. Everything else is better done on the Perl side: simpler and cleaner.
my $du = `du -bs .`;
# List assignment, so $bytes receives the captured digits (not the match count)
my ($bytes) = $du =~ /^(\d+)/ or die "du failed";
if ($bytes > 1e9) {
    print "directory is bigger than 1GB\n";
}
Hi, and welcome to Code Review. du -s . does not count the number of bytes used, but the number of kilobytes. Consider adding the -B 1 option to du. – rolfl, Jun 13, 2014 at 11:19
tail -1? Doesn't that just choose one of the subdirectories arbitrarily?
, which is the current directory). The other lines are sizes of subdirectories.