\$\begingroup\$

I have a small Bash script that works fine, but I'd like to know whether there is a better way to write it. The script looks for files dated between 2002 and 2018, where the year is in the 7th column.

Below is the working code:

Script:

#!/bin/bash
# scriptName: Ftpcal.sh
FILE="/home/pygo/Cyberark/ftplogs_3"
AWK="/bin/awk"
GREP="/bin/grep"
USERS="`"$AWK" '$7 >= "2002" && $7 <= "2018"' $FILE | "$AWK" '{print $3}' | sort -u`"
for user in $USERS;
do
echo "User $user " | tr -d "\n";
"$AWK" '$7 >= "2002" && $7 <= "2018"' "$FILE" | "$GREP" "$user" | "$AWK" '{ total += $4}; END { print "Total Space consumed: " total/1024/1024/1024 "GB"}';
done | column -t
echo ""
echo "=============================================================="
"$AWK" '$7 >= "2002" && $7 <= "2018"' "$FILE" | "$AWK" '{ total += $4}; END { print "Total Space consumed by All Users: " total/1024/1024/1024 "GB"}';
echo ""

Actual data Result:

$ sh Ftpcal.sh
User 16871 Total Space consumed: 0.0905161GB
User 253758 Total Space consumed: 0.0750855GB
User 34130 Total Space consumed: 3.52537GB
User 36640 Total Space consumed: 0.55393GB
User 8490 Total Space consumed: 3.70858GB
User tx-am Total Space consumed: 0.18992GB
User tx-ffv Total Space consumed: 0.183137GB
User tx-ttv Total Space consumed: 17.2371GB
User tx-st Total Space consumed: 0.201205GB
User tx-ti Total Space consumed: 58.9704GB
User tx-tts Total Space consumed: 0.0762068GB
------------ snipped output --------------
==============================================================
Total Space consumed by All Users: 255.368GB

Sample data:

-rw-r--r-- 1 34130 14063436 Aug 15 2002 /current/focus-del/files/from_fix.v.gz
-rw-r--r-- 1 34130 14060876 Jul 12 2007 /current/focus-del/files/from1_fix.v.gz
-rw-r--r-- 1 34130 58668461 Feb 23 2006 /current/focus-del/files/from_1.tar.gz
-rw-r--r-- 1 34130 14069343 Aug 7 20017 /current/focus-del/files/from_tm_fix.v.gz
-rw-r--r-- 1 34130 38179000 Dec 7 20016 /current/focus-del/files/from_tm.gds.gz
-rw-r--r-- 1 34130 15157902 Nov 22 20015 /current/focus-del/files/from_for.tar.gz
-rw-r--r-- 1 34130 97986560 Nov 4 20015 /current/focus-del/files/from_layout.tar

Sample Result:

$ sh Ftp_cal.sh
User 34130 Total Space consumed: 0.0808321GB
==============================================================
Total Space consumed by All Users: 0.0808321GB

I'm open to any better approach that would make the script more robust.

Thanks.

asked Apr 18, 2019 at 15:45
\$\endgroup\$

1 Answer

\$\begingroup\$
AWK="/bin/awk"

It's easier and more readable if you just set your PATH to something appropriate.

USERS="`"$AWK" '$7 >= "2002" && $7 <= "2018"' $FILE | "$AWK" '{print $3}' | sort -u`"

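Backticks should almost always be replaced by $( ... ), which nests without escaping and has more predictable quoting rules. Both forms run the command in a subshell, so the gain is readability rather than speed.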
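A small sketch of the nesting difference between the two forms. The directory in dir is an arbitrary example path, not something from the question:

```bash
#!/bin/bash
# Goal: get the name of the parent of a directory (here: "usr").
dir=/usr/bin   # arbitrary example path

# With backticks, the inner pair has to be backslash-escaped:
with_backticks=`basename \`dirname "$dir"\``

# With $( ... ), nesting needs no escaping and the quoting stays readable:
with_dollar_paren=$(basename "$(dirname "$dir")")

echo "$with_backticks $with_dollar_paren"
```

Both assignments produce the same value; only the second stays legible once a third level of nesting is needed.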

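Literal numbers should not be quoted. In awk, comparing a field against a quoted constant turns the test into a string comparison. That happens to give the answer you want for four-digit years; in some languages it won't work at all. A bad habit, easily avoided.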
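The difference between the quoted and unquoted constant can be seen with a value that is not four digits long. The input 900 below is a made-up value chosen only to expose the difference:

```bash
#!/bin/bash
# 900 is numerically below 2002, but as a string "900" sorts after "2002"
# because the comparison goes character by character ("9" > "2").
as_string=$(echo 900 | awk '$1 >= "2002" { print "match" }')
as_number=$(echo 900 | awk '$1 >= 2002 { print "match" }')

echo "quoted constant:   [$as_string]"
echo "unquoted constant: [$as_number]"
```

With the quoted constant the row is selected; with the numeric constant it is not.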

There's no need to invoke awk a second time to extract the third field. Simply pair the action {print $3} with the condition ($7 >= ...) that's already there.
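As a minimal sketch, the user list can come from a single awk call. The two input lines below are hypothetical rows in the same layout as the question's log; the second is there only to show a row being filtered out:

```bash
#!/bin/bash
# Hypothetical sample rows: owner in field 3, size in field 4, year in field 7.
sample='-rw-r--r-- 1 34130 14063436 Aug 15 2002 /a
-rw-r--r-- 1 8490 100 Aug 15 2020 /b'

# Condition and action in one awk program; no second awk needed.
users=$(printf '%s\n' "$sample" |
        awk '$7 >= 2002 && $7 <= 2018 { print $3 }' |
        sort -u)

echo "$users"
```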

It's good form to indent the body of a for block (or any other block).

echo "User $user " | tr -d "\n";

To suppress a newline on echo, use echo -n (or printf, which behaves the same way in every shell).

column -t

This has some awkward consequences, like tabs inside of labels ("TotalTABSpace") and unaligned numbers. printf will give much prettier results. Both bash and awk provide it.
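A rough sketch of the printf approach in the shell. The function name row and the column widths are illustrative choices, and the decimal point assumes a C-like locale:

```bash
#!/bin/bash
# row USER GIGABYTES: print one aligned report line.
# %-10s left-justifies the user name in 10 columns;
# %10.6f right-justifies the number with 6 decimal places.
row() {
    printf 'User %-10s Total Space consumed: %10.6f GB\n' "$1" "$2"
}

# Two values taken from the question's output
row tx-ti 58.9704
row 8490 3.70858
```

Because the widths are fixed in the format string, the labels keep their ordinary spaces and the numbers line up on the decimal point.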

total/1024/1024/1024 

Nothing wrong with this, as such, but 2**30 is a useful shorthand for the number of bytes in a gigabyte. (POSIX awk spells the operator ^; ** is a common extension.)

==============================================================

Bash can generate sequences like this with the idiom printf "=%.0s" {1..62}. The = is the character and 62 is the count.
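Brace expansion such as {1..62} does not accept a variable as the count, so here is a sketch of the same printf trick with a variable width. The function name rule is made up for this example, and it assumes seq is available and the width is at least 1:

```bash
#!/bin/bash
# rule WIDTH: print "=" WIDTH times, followed by a newline.
# %.0s consumes one argument and prints zero characters of it, so the
# literal "=" in the format is emitted once per number produced by seq.
rule() {
    printf '=%.0s' $(seq 1 "$1")
    printf '\n'
}

rule 62
```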

You're traversing the file three times and extracting the same information each time. This is going to get slow as the file grows. Awk has associative arrays: you can store a subtotal for each user, then iterate and print those subtotals at the end of the awk script, accomplishing the whole thing in one go.

Putting it all together:

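/bin/awk -v usrfmt="User %-20s Total Space consumed: %11.6f GB\n" \
    -v sumfmt="$( printf "=%.0s" {1..62} )\nTotal Space consumed by All Users: %.6f GB\n" '
    # Only rows whose year (field 7) falls in the range are counted
    $7 >= 2002 && $7 <= 2018 {
        subtot[$3] += $4    # per-user subtotal, keyed by owner (field 3)
        tot += $4           # grand total
    }
    END {
        for (u in subtot) printf usrfmt, u, subtot[u] / 2**30
        printf sumfmt, tot / 2**30
    }' "$FILE"    # FILE as defined in the original script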
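To check the one-pass approach, here is a cut-down sketch run against four of the question's sample rows. The function name summarize and the shortened format strings are choices made for this example; it uses ^ instead of ** for portability and reads from standard input:

```bash
#!/bin/bash
# summarize: read listing lines on stdin, print per-user and overall totals in GB.
summarize() {
    awk '
        $7 >= 2002 && $7 <= 2018 { subtot[$3] += $4; tot += $4 }
        END {
            # The order of "for (u in ...)" is unspecified; pipe through sort if it matters
            for (u in subtot) printf "User %-20s %11.6f GB\n", u, subtot[u] / 2^30
            printf "All users: %.6f GB\n", tot / 2^30
        }'
}

# Rows copied from the question; the last one (year 20017) falls outside the range.
summarize <<'EOF'
-rw-r--r-- 1 34130 14063436 Aug 15 2002 /current/focus-del/files/from_fix.v.gz
-rw-r--r-- 1 34130 14060876 Jul 12 2007 /current/focus-del/files/from1_fix.v.gz
-rw-r--r-- 1 34130 58668461 Feb 23 2006 /current/focus-del/files/from_1.tar.gz
-rw-r--r-- 1 34130 14069343 Aug 7 20017 /current/focus-del/files/from_tm_fix.v.gz
EOF
```

The three in-range rows add up to 86792773 bytes, which agrees with the 0.0808321GB shown in the question's sample result.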
answered Apr 18, 2019 at 16:31
\$\endgroup\$
  • \$\begingroup\$ "Backticks should almost always be replaced by $( ... )". Why just "almost always" and not "always" ? \$\endgroup\$ Commented Apr 20, 2019 at 21:24
  • \$\begingroup\$ I think backticks can improve readability versus nested $( $( ) ); consider something like x=$( printf %d $( wc -l $file ) ); replacing the inner parens with backticks is okay there. \$\endgroup\$ Commented Apr 21, 2019 at 0:44
