As part of my bash routine, I print a message in the terminal about the status of the workflow. The message is split into two parts (part 1: beginning of the task, part 2: status of its completion):
echo -n "Dataset is being processed ! "; execution of some AWK script; echo " Processing has been COMPLETED!"
Here is the implementation in bash, containing part of the AWK code:
# print phrase 1: initiation of the process
echo -n "Dataset is being rescored.. Please wait"; sleep 0.5
# this is the process: makedir for the result and execute AWK code to process input file
mkdir ${results}
# Apply the following AWK code to the directories containing the input file
while read -r d; do
awk -F, '
}' "${d}_"*/target_file.csv > "${results}/"${d%%_*}".csv"
done < <(find . -maxdepth 1 -type d -name '*_*_*' | awk -F '[_/]' '!seen[$2]++ {print $2}')
# print phrase 2: completion of the task, which appears in the terminal next to phrase 1
# this will print "COMPLETED" letter-by-letter with the pause of 0.2 sec between each letter
echo -n " C"; sleep 0.2; echo -n "O"; sleep 0.2; echo -n "M"; sleep 0.2; echo -n "P"; sleep 0.2; echo -n "L"; echo -n "E"; sleep 0.2; echo -n "T"; echo -n "E"; sleep 0.2; echo "D!"
When I run this script in bash, everything seems to be OK, and I have not noticed any problems with the code between the two 'echo -n' blocks. Could splitting the status phrase with "echo -n" like this lead to bugs in the routine? Any suggestions for implementing such a status message in bash with different syntax?
1 Answer
I see very little error-checking in this script. It's important to know whether mkdir succeeded, for example (it certainly won't as it stands, as results is never assigned).
We really ought to be quoting variable expansions, to prevent unwanted word-splitting:
mkdir "$results" || exit
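As a minimal sketch of that check, assuming we set the directory name ourselves (the helper name make_results_dir and the temporary location are illustrative, not taken from the original script):

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the results directory, or fail with a message.
make_results_dir() {
    local dir=$1
    if [ -z "$dir" ]; then
        echo "results directory name is empty" >&2
        return 1
    fi
    # -p: succeed if the directory already exists
    # --: stop option parsing, in case the name starts with '-'
    mkdir -p -- "$dir"
}

# Illustrative location only; the real script would choose its own path.
results=$(mktemp -d)/rescored
make_results_dir "$results" || exit 1
```

Checking for an empty name first gives a clearer message than the one mkdir prints when it is handed an empty argument.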
The arbitrary sleep values need documenting. Why do we need to sleep, and how was the duration determined? Can we wait for something instead?
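One way to wait for something real is to run the work in the background and print progress until it finishes. This is only a sketch under assumptions: long_task is a hypothetical stand-in for the processing step, and run_with_progress is a name invented here.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real processing step.
long_task() {
    sleep 0.5
}

# Run a command in the background and print dots while it is alive.
# Returns the command's own exit status.
run_with_progress() {
    "$@" &
    local pid=$!
    # kill -0 sends no signal; it only tests whether the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        printf '.'
        sleep 0.2
    done
    wait "$pid"
}

printf 'Dataset is being rescored'
if run_with_progress long_task; then
    printf ' COMPLETED!\n'
else
    printf ' FAILED\n'
fi
```

Here the only sleep left is the polling interval, and the final message reflects whether the task actually succeeded.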
Not all echo implementations accept the -n option. To be portable, we should use printf %s instead.
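A possible shape for the two-part status line using printf, with the letter-by-letter effect moved into a loop. The function names status_begin and type_out are made up for this sketch, and it assumes a sleep that accepts fractional seconds, as the original script already does.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from the original script).

# Print the opening phrase without a trailing newline.
status_begin() {
    printf '%s' "$1"
}

# Print a word one character at a time, pausing between characters,
# then finish the line.
type_out() {
    local text=$1 delay=${2:-0.2} i
    for ((i = 0; i < ${#text}; i++)); do
        printf '%s' "${text:i:1}"
        sleep "$delay"
    done
    printf '\n'
}

status_begin 'Dataset is being rescored.. Please wait '
# ... the real processing would run here ...
type_out 'COMPLETED!' 0.05
```

Keeping the delay in one parameter also avoids the easy mistake of forgetting a sleep between two letters in a long chain of commands.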
The Awk command in the read loop looks like it's corrupted: the closing brace is unmatched.
Since this program is quite chatty, consider printing each file name as it's reached. Or a fraction complete:
readarray -t files \
    < <(find . -maxdepth 1 -type d -name '*_*_*' | awk -F '[_/]' '!seen[$2]++ {print $2}')
for i in "${!files[@]}"
do
    d=${files[$i]}
    # indices start at 0, so add 1 for display
    printf 'Processing (%d/%d)\r' "$((i + 1))" "${#files[@]}"
    # Awk program kept as posted in the question
    awk -F, $'\n}' "${d}_"*/target_file.csv >"$results/${d%%_*}.csv"
done
I don't understand why we have the !seen[$2]++ {print $2} in the find pipeline - find will output each filename exactly once anyway. Much better to have find print just the directories, and zero-terminate them so we're robust enough to handle all possible filenames:
readarray -t -d '' files \
< <(find * -maxdepth 0 -type d -name '*_*_*' -print0)
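One caveat worth checking: the Awk filter in the original pipeline printed only the text before the first underscore, so two directories sharing that prefix produced a single entry. If the loop body still relies on the prefix (as "${d}_"* suggests), the reduction can be done in bash while keeping the names zero-terminated. A sketch, assuming bash 4.4 or later; unique_prefixes is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each distinct prefix (the text before the first
# underscore) of the directories matching *_*_* in the current directory,
# zero-terminated.
unique_prefixes() {
    local -A seen=()
    local dir prefix
    while IFS= read -r -d '' dir; do
        dir=${dir#./}                    # drop the "./" that find prepends
        prefix=${dir%%_*}                # text before the first underscore
        [[ -n $prefix ]] || continue     # skip names that start with '_'
        if [[ -z ${seen[$prefix]+x} ]]; then
            seen[$prefix]=1
            printf '%s\0' "$prefix"
        fi
    done < <(find . -maxdepth 1 -type d -name '*_*_*' -print0)
}

readarray -t -d '' prefixes < <(unique_prefixes)
printf 'Found %d prefix(es)\n' "${#prefixes[@]}"
```

The order of the results follows find's output order, which is not guaranteed to be sorted.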
printf %s is more portable than echo -n.