For my classes, I had to finish this task:
Directory's disk usage list
For indicated directory print a list of files and subdirectories in descending order according to their total disk usage. Disk usage of a simple file is equivalent to its size, for a subdirectory - a sum of the sizes of all files in its filesystem branch (subtree). Printed list format:
<disk usage> <holder name> <type> <name>
In the preceding print format, <holder name> is the owner in the case of a file that is not a directory; otherwise, the holder of a directory is the owner of the files having the greatest total disk usage in it. <type> is a letter: d for a directory, - for a standard file, l for a symbolic link, etc.
Below the file disk usage list, print a summary:
Total disk usage of the files in directory's subtree, the list of total disk usage of files in directory's subtree with respect to the owners (sums over owners). CAUTION: during recursive directories listing do not dereference symbolic links.
Here is my code:
#!/bin/bash
cd "$1"
ls -lA | awk 'NR != 1{
name=$8
for ( i=9; i<=NF; i++) name=name " " $i
type=substr($1,1,1)
if (type!="d")
{
print $5,$3,type,name
sum_all+=$5
usr_sum_all[$3]+=$5
}
else
{
sum=0;delete usr_sum;
find_proc="find " name " -printf \"%s %u\\n\" 2>/dev/null"
while ( find_proc | getline )
{
sum+=$1;usr_sum[$2]+=$1;
sum_all+=$1;usr_sum_all[$2]+=$1;
}
close(find_proc)
owner=""
owner_sum=-1
for ( iowner in usr_sum ) if ( usr_sum[iowner] > owner_sum ) {owner = iowner; owner_sum=usr_sum[iowner]}
print sum " " owner " d " name
}
}
END {
print "Space taken: " sum_all
for ( iowner in usr_sum_all ) print "\t user: " iowner " " usr_sum_all[iowner]
}' | sort -gr
Is it ok? Should I make any changes?
-
I guess they want you to use recursion? "CAUTION: during recursive directories listing do not dereference symbolic links.": your solution is iterative. – Olivier Dulac, Jun 20, 2013 at 19:24
-
Yeah. I mailed my tutor about that; he said I can use iteration as long as I fulfil all the requirements. Are they fulfilled? – user2506200, Jun 20, 2013 at 19:34
1 Answer
I just copied your code, pasted it into a terminal, ran it, and it did exactly what it is supposed to do. Now, how to make it better?
You don't mention it in the question, but using ls, awk, find, and sort together somehow seems unlikely to be the answer your tutor expected. You don't mention what your course is in, but I would have expected there to be a less 'cobbled together' solution, using perhaps bash only, or perl, python, or something else.
I certainly would not have solved this problem this way (I would have used perl).
Still, assuming your code meets the expectations for tool use (as in, you are supposed to use awk), there are some things that are off.
Single-line multi-statements
You have a number of 1-liners that you should have expanded. You already have the massive awk code-block, why mix the compact 1-liner and block statements like you have done?
for ( iowner in usr_sum ) if ( usr_sum[iowner] > owner_sum ) {owner = iowner; owner_sum=usr_sum[iowner]}
The above should be:
for ( iowner in usr_sum )
{
if ( usr_sum[iowner] > owner_sum )
{
owner = iowner
owner_sum=usr_sum[iowner]
}
}
Note the use of {} block braces for the single if statement, and not using the semicolon ; to separate statements on a single line.
These other statements should also be on their own lines (and the for statement should have a {} block):
for ( i=9; i<=NF; i++) name=name " " $i
...
sum=0;delete usr_sum;
...
sum+=$1;usr_sum[$2]+=$1;
sum_all+=$1;usr_sum_all[$2]+=$1;
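Expanded along those lines, the name-joining loop might look like the sketch below. The join_name wrapper is hypothetical, added only so the fragment can be run on its own, and the field numbers assume the eight-column ls -lA layout that the question's script relies on.

```shell
#!/bin/bash
# Hypothetical wrapper (not part of the reviewed script): reads ls -l style
# lines on stdin and prints the file name, which may contain spaces.
join_name() {
    awk '{
        # the name starts in field 8 in the layout assumed here
        name = $8
        for ( i = 9; i <= NF; i++ )
        {
            name = name " " $i
        }
        print name
    }'
}
```

Each statement sits on its own line and the for body has its own braces, so adding a second statement to the loop later cannot silently fall outside it.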
Negative if
Your main if-block would be better if done as a positive check, not a negative check. Your code is:
if (type!="d")
{
... do not a directory stuff
}
else
{
... do directory stuff
}
but would be more readable as:
if (type == "d")
{
... do directory stuff
}
else
{
... do not a directory stuff
}
Breathing space
Your code is suffocating since it has little breathing space. Many of your statements are compressed around the operators and other symbols. Consider lines like:
print $5,$3,type,name
sum_all+=$5
usr_sum_all[$3]+=$5
Which should be written as:
print $5, $3, type, name
sum_all += $5
usr_sum_all[$3] += $5
Performance
The multiple calls to find may be hurting your performance. In general I can't help but think there's a better way to do this without all the nested OS calls.
Still, it ran fast enough for me on my machine, so it is probably not a real problem.
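One possible way to avoid the nested calls, offered as a rough sketch rather than as the solution your tutor expects, is to run find once over the whole tree and let awk group every entry under its top-level name. The function name disk_usage_report is hypothetical. The sketch assumes GNU find (for -printf) and file names without tabs or newlines, and, like your script, it counts a directory's own size as part of its total.

```shell
#!/bin/bash
# Sketch only: one find pass for the whole tree. Assumes GNU find (-printf)
# and file names without tabs or newlines. disk_usage_report is a
# hypothetical name, not part of the reviewed script.
disk_usage_report() {
    (
        cd "${1:-.}" || exit 1
        # Emit size, owner, type and path relative to the start directory.
        # find does not follow symbolic links unless asked to (-L).
        find . -mindepth 1 -printf '%s\t%u\t%y\t%P\n' |
        awk -F '\t' '
        {
            top = $4
            sub(/\/.*/, "", top)            # first path component
            if ($4 == top)
            {
                # find reports regular files as f; ls shows them as -
                kind[top] = ($3 == "f") ? "-" : $3
            }
            total[top] += $1
            by_user[top SUBSEP $2] += $1
            sum_all += $1
            usr_sum_all[$2] += $1
        }
        END {
            # holder = owner with the largest share inside each entry
            for (key in by_user)
            {
                split(key, part, SUBSEP)
                if (!(part[1] in holder) || by_user[key] > best[part[1]])
                {
                    best[part[1]] = by_user[key]
                    holder[part[1]] = part[2]
                }
            }
            # sort only the list, then print the summary after it
            sorter = "sort -k1,1nr"
            for (top in total)
            {
                print total[top], holder[top], kind[top], top | sorter
            }
            close(sorter)
            print "Space taken: " (sum_all + 0)
            for (user in usr_sum_all)
            {
                print "\t user: " user " " usr_sum_all[user]
            }
        }'
    )
}
```

Called as disk_usage_report /some/dir, this starts one find and one sort no matter how many subdirectories there are, and closing the sort pipe before printing the summary keeps the summary lines out of the sorted list.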