#Single-pass version using awk
A single-pass version doesn't require us to use an array to store the filenames; we can simply stream the file contents into a suitable counting function. We could implement that counting function in shell, but it's probably easier to write a short `awk` program. Note that with no arrays, we can make this a POSIX shell program:
#!/bin/sh
set -eu
fileExt='*.js'
find "${@:-.}" -name "$fileExt" -type f -print0 | xargs -0 cat |
awk 'BEGIN { all = 0; blank = 0; comment = 0; incomment = 0; }
{ ++all
if (0ドル ~ "/\\*") { incomment = 1 }
if (incomment) { ++comment; if (0ドル ~ "\\*/") incomment = 0; }
else { blank += (0ドル ~ "^[[:blank:]]*$"); comment += (0ドル ~ "^[[:blank:]]*//") } }
END { print "Total comment lines is:", comment
print "Total blank lines is:", blank
print "Total all lines is:", all }'
#General
Run `shellcheck` on this script - almost all variable expansions are unquoted, but need to be quoted. That will also highlight the non-portable `echo -e` (prefer `printf` instead) and a dodgy use of `$((` where `$( (` would be safer.
I recommend setting the `-u` and `-e` shell options to help you catch more errors.
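As a standalone illustration (not part of the script under review) of what quoting plus those options buy you:

```shell
#!/bin/sh
set -eu

fileExt='*.js'

# Quoted, the pattern stays literal; unquoted, $fileExt would glob
# against the current directory before find ever saw it:
echo "$fileExt" # prints *.js

# And with set -u, a typo such as "$fileExtt" aborts the script
# immediately instead of expanding to an empty string:
# echo "$fileExtt" # sh: fileExtt: parameter not set
```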
#Flexibility
Instead of requiring users to change to the top directory of the project, we could allow them to specify one or more directories as command-line arguments, and use the current directory as a fallback if no arguments are provided:
dirs=("${@:-.}")
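A minimal demonstration of that fallback (`set --` here just simulates the script's positional parameters):

```shell
#!/bin/bash

set -- # no arguments: fall back to the current directory
dirs=("${@:-.}")
echo "${dirs[*]}" # prints .

set -- src test # arguments given: use them all
dirs=("${@:-.}")
echo "${dirs[*]}" # prints src test
```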
#Finding files
`allFiles` will include directories and other non-regular files, if they happen to end in `.js`. We need to add a file type predicate:
allFiles=$(find "${dirs[@]}" -name "$fileExt" -type f)
Since we're using Bash, it makes sense to take advantage of array variables - though we'll still have problems for filenames containing whitespace. To fix that, we need to read answers to How can I store the "find" command results as an array in Bash?:
allFiles=()
while IFS= read -r -d ''; do
allFiles+=("$REPLY")
done < <(find "${dirs[@]}" -name "$fileExt" -type f -print0)
It may well be simpler to set the `globstar` shell option and then remove non-regular files from the glob result.
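That alternative might look like this (a sketch assuming Bash 4+ for `**`; `nullglob` avoids a leftover literal pattern when nothing matches):

```shell
#!/bin/bash
shopt -s globstar nullglob

dirs=("${@:-.}")
allFiles=()
for f in "${dirs[@]}"/**/*.js; do
 if [[ -f "$f" ]]; then # keep only regular files
 allFiles+=("$f")
 fi
done
```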
#Counting comment lines
I didn't follow your Perl code, but I have an alternative approach using `sed`:
- convert all lines from an initial `/**` to a final `*/` to begin with `//` instead,
- then keep only the lines beginning with optional whitespace then `//`:
sed -e '\!^[[:blank:]]*/\*\*!,\!\*/!s/.*/\/\//' \
 -e '\|^[[:blank:]]*//|!d'
(Actually, that's a lot less pretty than I'd hoped!)
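To see the effect on a tiny made-up sample (with the substitution writing `//`, the block comment collapses to `//` lines, while trailing comments, code, and blank lines are dropped):

```shell
printf '%s\n' '/**' ' * doc' ' */' 'var x = 1; // trailing' ' // whole-line' '' |
sed -e '\!^[[:blank:]]*/\*\*!,\!\*/!s/.*/\/\//' \
 -e '\|^[[:blank:]]*//|!d' |
wc -l # 4 comment lines: three from the /** block, one whole-line //
```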
#Blank lines
Here, we've used the regular expression that matches comment lines. We want `'^[[:blank:]]*$'` instead, to match lines that contain only (optional) whitespace.
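For example, counting whitespace-only lines in a small invented sample with `grep -c`:

```shell
# Sample: one code line, then an empty line, a space-only line, and a tab-only line
printf 'code();\n\n \n\t\n' | grep -c '^[[:blank:]]*$' # counts 3 blank lines
```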
#All lines
Again, over-complicated: just `cat` the files together and then use `wc -l`.
#Printing
I find it easier to visualise output formatting if we simply use a here-document:
cat <<EOF
Total comment lines is: $commentLines.
Total blank lines is: $blankLines.
Total all lines is: $allLines.
EOF
exit
#Modified code
#!/bin/bash
set -eu
fileExt='*.js'
dirs=("${@:-/usr/lib/nodejs/lodash}")
allFiles=()
while IFS= read -r -d ''; do
allFiles+=("$REPLY")
done < <(find "${dirs[@]}" -name "$fileExt" -type f -print0)
commentLines=$(sed -e '\!^[[:blank:]]*/\*\*!,\!\*/!s/.*/\/\//' \
-e '\|^[[:blank:]]*//|!d' \
"${allFiles[@]}" | wc -l)
blankLines=$(cat "${allFiles[@]}" | grep -c '^[[:blank:]]*$')
allLines=$(cat "${allFiles[@]}" | wc -l)
cat <<EOF
Total comment lines is: $commentLines.
Total blank lines is: $blankLines.
Total all lines is: $allLines.
EOF
Although this makes three passes over the input files, that's probably an acceptable trade-off against the complexity of a single-pass approach here (and is already the approach taken in the original code).