I have a bash script (just doing some simple log file pattern matching) that I have had running in a cron for a few years. Recently it broke and started returning odd results. Upon digging into the script and doing some debugging on it, I found that the problem seems to be with a while loop I do on a file.
To illustrate the problem I did a cat on the file 'cat /var/tmp/file' and get what I expect to get which is a time stamp and some (unformatted) IDs:
15:56:14,965 [,PCC12345678(PSI12345678),,]
18:08:43,706 [,PCC23456789(PSI23456789),,]
12:01:49,233 [,PCC34567891(PSI34567891),,]
When I put this in a while loop I would expect it to remain the same, but it does not. When I echo the line in the loop the second field (the IDs) is always changed to '1', like this:
cat /var/tmp/file | while read line
do
echo $line
done
It gives me an output of:
15:56:14,965 1
18:08:43,706 1
12:01:49,233 1
Which is obviously completely different to what is in the file and has left me baffled.
Things I've tried so far:
- I thought maybe the $line variable got stuck in memory so tried clearing that, didn't work.
- I tried the classic 'turn it off and on again' to the server for lack of better ideas thinking maybe it would clear out whatever might have got stuck.
- Tried running the script on two other servers, got the same problem.
Currently I am thinking maybe it doesn't like the format of the file, possibly something to do with the square brackets or commas. Although I am not sure why that would be the case or why it would suddenly be happening.
Note: Nothing has changed on either the script or the log files it is running against since being written. It was working previously for two years.
Edit: After suggestions, I have checked what may have changed in the environment. As far as I can tell, it's the same, unless something has changed with an OS upgrade (which I wouldn't know how to check):
- .bash_profile and .profile have not been edited in over three years.
- I have tried to run the script in other shells ksh, bash, csh. Same problem encountered in all of them.
- I have tried running the script with other users including root. Again same issue with all of them.
Thanks Matt
1 Answer
You have a file named '1' in the directory the script is running in.
As MelBurslan commented, [] has a special meaning for the shell, but it doesn't have much to do with regexes: it just means "a single character taken from any of the characters between the brackets". So when you run
echo 15:56:14,965 [,PCC12345678(PSI12345678),,]
the shell looks for a file named ',', or 'P', or 'C', or '1'... If at least one file matches, [,PCC12345678(PSI12345678),,] is replaced with all the matching file names in the output; otherwise it's reproduced as is.
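The effect can be reproduced in isolation. The following is a minimal sketch, assuming bash with its default options (neither nullglob nor failglob set). The scratch directory and the variable names are only illustrative and are not part of the original script:

```shell
#!/bin/bash
# Hypothetical scratch directory so the demo does not touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

line='15:56:14,965 [,PCC12345678(PSI12345678),,]'

# No file name matches the bracket pattern yet, so the word is left as is.
before=$(echo $line)

# Create a file whose one-character name is in the bracket set.
touch 1

# The unquoted expansion is now replaced by the matching file name.
after=$(echo $line)

# Double quotes suppress pathname expansion, so the text survives.
quoted=$(echo "$line")

printf 'before: %s\nafter:  %s\nquoted: %s\n' "$before" "$after" "$quoted"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

Only the unquoted expansion made after the file '1' exists should change, which matches what the script started doing.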
If you remove the '1' file the old behaviour should be restored. You can fix the script by protecting $line with double quotes:
cat /var/tmp/file | while read line
do
echo "$line"
done
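A slightly more defensive variant of the same loop is sketched below. It is not what the original script used: the temporary file stands in for /var/tmp/file, and the names log_file and out are hypothetical. It adds 'IFS=' and 'read -r' so that leading whitespace and backslashes are also preserved, and uses printf instead of echo:

```shell
#!/bin/bash
# Hypothetical stand-in for /var/tmp/file so the sketch is self-contained.
log_file=$(mktemp)
printf '%s\n' \
  '15:56:14,965 [,PCC12345678(PSI12345678),,]' \
  '18:08:43,706 [,PCC23456789(PSI23456789),,]' > "$log_file"

out=$(
  # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
  while IFS= read -r line
  do
    # Quoting "$line" prevents word splitting and pathname expansion.
    printf '%s\n' "$line"
  done < "$log_file"
)

printf '%s\n' "$out"
rm -f "$log_file"
```

Redirecting the file into the loop also avoids the extra cat process.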
- Thanks @stephen, that was exactly the issue I was having. I wonder if you are able to point me in the direction of some further reading on the square brackets' special meanings and their uses? I have never encountered this behavior before, I guess because I have always previously been in the habit of protecting my variables. – MattM, Feb 3, 2016 at 4:14
- @MattM, check out the relevant section in the bash manual, and the examples in the globbing section of the Advanced Bash-Scripting Guide. – Stephen Kitt, Feb 3, 2016 at 12:33
- My regex fu is not strong enough to decipher what the string between the square brackets is resolving to, but in general, any string between square brackets is treated as a selection list, if its items are separated by commas. For instance, a[1,2,3] is expanded into any one of a1 or a2 or a3. Since you are reading the whole line into a variable, line, bash is taking the square brackets as delimiters of a regex.