Using columns 4 and 2, I create a report like the output file shown below. My code works fine, but I believe it can be made shorter.
I have a doubt about the split part:
CNTLM = split ("20,30,40,60", LMT)
It works, but it would be better to use exactly the values "10,20,30,40", which are the values that appear in column 4:
4052538693,2910,04-May-2018-22,10
4052538705,2910,04-May-2018-22,10
4052538717,2910,04-May-2018-22,10
4052538729,2911,04-May-2018-22,20
4052538741,2911,04-May-2018-22,20
4052538753,2912,04-May-2018-22,20
4052538765,2912,04-May-2018-22,20
4052538777,2914,04-May-2018-22,10
4052538789,2914,04-May-2018-22,10
4052538801,2914,04-May-2018-22,30
4052539029,2914,04-May-2018-22,20
4052539041,2914,04-May-2018-22,20
4052539509,2915,04-May-2018-22,30
4052539521,2915,04-May-2018-22,30
4052539665,2915,04-May-2018-22,30
4052539677,2915,04-May-2018-22,10
4052539689,2915,04-May-2018-22,10
4052539701,2916,04-May-2018-22,40
4052539713,2916,04-May-2018-22,40
4052539725,2916,04-May-2018-22,40
4052539737,2916,04-May-2018-22,40
4052539749,2916,04-May-2018-22,40
4052539761,2917,04-May-2018-22,10
4052539773,2917,04-May-2018-22,10
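One way to bucket on the exact values rather than on upper limits is to turn the list into a lookup table from value to column index. This is only a minimal sketch, assuming every value of interest is listed up front; the names vals, col and count_exact are made up for illustration, and rows whose column 4 is not in the list are simply skipped:

```shell
# Count rows per (code, exact value) pair; reads CSV records on stdin.
count_exact() {
    awk -F, '
    BEGIN {
        # vals[1..n] holds the wanted values; col maps each value to its column index
        n = split("10,20,30,40", vals, ",")
        for (i = 1; i <= n; i++) col[vals[i]] = i
    }
    # only rows whose 4th field is one of the listed values are counted
    ($4 in col) { cnt[$2, col[$4]]++; seen[$2] }
    END {
        for (c in seen) {
            printf "%s", c
            for (i = 1; i <= n; i++) printf " %d", cnt[c, i] + 0
            print ""
        }
    }' | sort -n
}
```

It could be called as count_exact < file; the per-code rows come out in ascending order because the awk output is piped through sort -n.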
Here is the code I use to get the desired output:
printf " Code 10 20 30 40 Total\n" > header
dd=`cat header | wc -L`
awk -F"," '
BEGIN {CNTLM = split ("20,30,40,60", LMT)
cmdsort = "sort -nr"
DASHES = sprintf ("%0*d", '$dd', _)
gsub (/0/, "-", DASHES)
}
{for (IX=1; IX<=CNTLM; IX++) if ($4 <= LMT[IX]) break
CNT[$2,IX]++
COLTOT[IX]++
LNC[$2]++
TOT++
}
END {
print DASHES
for (l in LNC)
{printf "%5d", l | cmdsort
for (IX=1; IX<=CNTLM; IX++) {printf "%9d", CNT[l,IX]+0 | cmdsort
}
printf " = %6d" RS, LNC[l] | cmdsort
}
close (cmdsort)
print DASHES
printf "Total"
for (IX=1; IX<=CNTLM; IX++) printf "%9d", COLTOT[IX]+0
printf " = %6d" RS, TOT
print DASHES
printf "PCT "
for (IX=1; IX<=CNTLM; IX++) printf "%9.1f", COLTOT[IX]/TOT*100
printf RS
print DASHES
}
' file > output
cat header output
Output file I got:
 Code       10       20       30       40      Total
----------------------------------------------------
 2917        2        0        0        0 =      2
 2916        0        0        0        5 =      5
 2915        2        0        3        0 =      5
 2914        2        2        1        0 =      5
 2912        0        2        0        0 =      2
 2911        0        2        0        0 =      2
 2910        3        0        0        0 =      3
----------------------------------------------------
Total        9        6        4        5 =     24
----------------------------------------------------
PCT       37.5     25.0     16.7     20.8
----------------------------------------------------
1 Answer
This is not very different from your solution, but it does not rely on the header being hardcoded. It depends on GNU awk, because it uses PROCINFO["sorted_in"] to control the order of array traversal.
gawk -F, '
{count[$2,$4]++; code[$2]; val[$4]}
END {
PROCINFO["sorted_in"] = "@ind_num_asc"
printf "Code\t"
dashes = "--------"
for (v in val) {
printf "%8d", v
dashes = dashes "--------"
}
printf " =%8s\n", "Total"
dashes = dashes "-----------"
print dashes
for (c in code) {
sum_code = 0
printf "%d\t", c
for (v in val) {
sum_code += count[c,v]
sum_val[v] += count[c,v]
printf "%8d", count[c,v]
}
printf " =%8d\n", sum_code
}
print dashes
printf "Total\t"
sum = 0
for (v in val) {
sum += sum_val[v]
printf "%8d", sum_val[v]
}
printf " =%8d\n", sum
print dashes
printf "PCT\t"
for (v in val) {
printf "%8.1f", 100*sum_val[v]/sum
}
print "\n" dashes
}
' file
Code          10      20      30      40 =   Total
---------------------------------------------------
2910           3       0       0       0 =       3
2911           0       2       0       0 =       2
2912           0       2       0       0 =       2
2914           2       2       1       0 =       5
2915           2       0       3       0 =       5
2916           0       0       0       5 =       5
2917           2       0       0       0 =       2
---------------------------------------------------
Total          9       6       4       5 =      24
---------------------------------------------------
PCT         37.5    25.0    16.7    20.8
---------------------------------------------------
I'm not a fan of ALL_CAPS_VARNAMES
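Where gawk is not available, one possible substitute for PROCINFO["sorted_in"] is to copy the keys into an indexed array and sort them by hand. The following is a minimal sketch and not part of the script above; sorted_codes and keys are illustrative names, and the insertion sort is only sensible for a small number of distinct codes:

```shell
# Print the distinct values of column 2 in ascending numeric order,
# without relying on gawk's PROCINFO.
sorted_codes() {
    awk -F, '
    { code[$2] }
    END {
        # copy the (unordered) keys into keys[1..n]
        n = 0
        for (c in code) keys[++n] = c
        # insertion sort; the +0 forces a numeric comparison
        for (i = 2; i <= n; i++) {
            k = keys[i]
            for (j = i - 1; j >= 1 && keys[j] + 0 > k + 0; j--)
                keys[j + 1] = keys[j]
            keys[j + 1] = k
        }
        for (i = 1; i <= n; i++) print keys[i]
    }'
}
```

A loop over keys[1] to keys[n] could then stand in for each for (c in code) traversal in the END block; the same approach would apply to the val array.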
Glenn, your version of the code works perfectly. I appreciate the modification. – OXXO, May 13, 2018 at 11:28