#Single-pass version using awk
A single-pass version doesn't require us to use an array to store the filenames; we can simply stream the file contents into a suitable counting function. We could implement that counting function in shell, but it's probably easier to write a short `awk` program. Note that with no arrays, we can make this a POSIX shell program:
#!/bin/sh
set -eu
fileExt='*.js'
find "${@:-.}" -name "$fileExt" -type f -print0 | xargs -0 cat |
awk 'BEGIN { all = 0; blank = 0; comment = 0; incomment = 0; }
{ ++all
if (0ドル ~ "/\\*") { incomment = 1 }
if (incomment) { ++comment; if (0ドル ~ "\\*/") incomment = 0; }
else { blank += (0ドル ~ "^[[:blank:]]*$"); comment += (0ドル ~ "^[[:blank:]]*//") } }
END { print "Total comment lines is:", comment
print "Total blank lines is:", blank
print "Total all lines is:", all }'
#General
Run `shellcheck` on this script - almost all variable expansions are unquoted, but need to be quoted. That will also highlight the non-portable `echo -e` (prefer `printf` instead) and a dodgy use of `$((` where `$( (` would be safer.
I recommend setting the `-u` and `-e` shell options to help you catch more errors.
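As a standalone illustration (not part of the script under review) of what quoting plus those options buy you:

```shell
#!/bin/sh
set -eu

fileExt='*.js'

# Quoted, the pattern stays literal; unquoted, $fileExt would glob
# against the current directory before find ever saw it:
echo "$fileExt" # prints *.js

# And with set -u, a typo such as "$fileExtt" aborts the script
# immediately instead of expanding to an empty string:
# echo "$fileExtt" # sh: fileExtt: parameter not set
```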
#Flexibility
Instead of requiring users to change to the top directory of the project, we could allow them to specify one or more directories as command-line arguments, and use the current directory as a fallback if no arguments are provided:
dirs=("${@:-.}")
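A minimal demonstration of that fallback (`set --` here just simulates the script's positional parameters):

```shell
#!/bin/bash

set -- # no arguments: fall back to the current directory
dirs=("${@:-.}")
echo "${dirs[*]}" # prints .

set -- src test # arguments given: use them all
dirs=("${@:-.}")
echo "${dirs[*]}" # prints src test
```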
#Finding files
`allFiles` will include directories and other non-regular files, if they happen to end in `.js`. We need to add a file type predicate:
allFiles=$(find "${dirs[@]}" -name "$fileExt" -type f)
Since we're using Bash, it makes sense to take advantage of array variables - though we'll still have problems for filenames containing whitespace. To fix that, we need to read answers to How can I store the "find" command results as an array in Bash?:
allFiles=()
while IFS= read -r -d ''; do
allFiles+=("$REPLY")
done < <(find "${dirs[@]}" -name "$fileExt" -type f -print0)
It may well be simpler to set the `globstar` shell option and then remove non-regular files from the glob result.
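That alternative might look like this (a sketch assuming Bash 4+ for `**`; `nullglob` avoids a leftover literal pattern when nothing matches):

```shell
#!/bin/bash
shopt -s globstar nullglob

dirs=("${@:-.}")
allFiles=()
for f in "${dirs[@]}"/**/*.js; do
 if [[ -f "$f" ]]; then # keep only regular files
 allFiles+=("$f")
 fi
done
```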
#Counting comment lines
I didn't follow your Perl code, but I have an alternative approach using `sed`:
- convert all lines from an initial `/**` to a final `*/` to begin with `//` instead,
- then keep only the lines beginning with optional whitespace then `//`:
sed -e '\!^[[:blank:]]*/\*\*!,\!\*/!s/.*/\/\//' \
 -e '\|^[[:blank:]]*//|!d'
(Actually, that's a lot less pretty than I'd hoped!)
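To see the effect on a tiny made-up sample (with the substitution writing `//`, the block comment collapses to `//` lines, while trailing comments, code, and blank lines are dropped):

```shell
printf '%s\n' '/**' ' * doc' ' */' 'var x = 1; // trailing' ' // whole-line' '' |
sed -e '\!^[[:blank:]]*/\*\*!,\!\*/!s/.*/\/\//' \
 -e '\|^[[:blank:]]*//|!d' |
wc -l # 4 comment lines: three from the /** block, one whole-line //
```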
#Blank lines
Here, we've used the regular expression that matches comment lines. We want `'^[[:blank:]]*$'` instead, to match lines that contain only (optional) whitespace.
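For example, counting whitespace-only lines in a small invented sample with `grep -c`:

```shell
# Sample: one code line, then an empty line, a space-only line, and a tab-only line
printf 'code();\n\n \n\t\n' | grep -c '^[[:blank:]]*$' # counts 3 blank lines
```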
#All lines
Again, over-complicated: just `cat` the files together and then use `wc -l`.
#Printing
I find it easier to visualise output formatting if we simply use a here-document:
cat <<EOF
Total comment lines is: $commentLines.
Total blank lines is: $blankLines.
Total all lines is: $allLines.
EOF
exit
#Modified code
#!/bin/bash
set -eu
fileExt='*.js'
dirs=("${@:-/usr/lib/nodejs/lodash}")
allFiles=()
while IFS= read -r -d ''; do
allFiles+=("$REPLY")
done < <(find "${dirs[@]}" -name "$fileExt" -type f -print0)
commentLines=$(sed -e '\!^[[:blank:]]*/\*\*!,\!\*/!s/.*/\/\//' \
-e '\|^[[:blank:]]*//|!d' \
"${allFiles[@]}" | wc -l)
blankLines=$(cat "${allFiles[@]}" | grep -c '^[[:blank:]]*$')
allLines=$(cat "${allFiles[@]}" | wc -l)
cat <<EOF
Total comment lines is: $commentLines.
Total blank lines is: $blankLines.
Total all lines is: $allLines.
EOF
Although this makes three passes over the input files, that's probably an acceptable trade-off against the complexity of a single-pass approach here (and is already the approach taken in the original code).