
I'm writing scripts that will run in parallel and read their input data from the same file. Each script opens the input file, reads the first line, stores it for further processing, and finally deletes that line from the input file.

The problem is that when several scripts access the input file simultaneously, two of them can read the same line, which produces the unacceptable result of the line being processed twice.

One solution is to write a lock file (.lock_input) before accessing the input file and remove it when releasing the input file. This is not appealing in my case, because NFS sometimes slows down randomly and may not provide reliable locking.

Another solution is to use a process as the lock instead of a file: the first script to access the input file launches a process called lock_input, and the other scripts run ps -elf | grep lock_input. If it appears in the process list, they wait. This may be faster than writing to NFS, but it is still not a perfect solution.

So my question is: is there any bash command (or a command for another script interpreter), or a service, that behaves like the semaphore or mutex locks used for synchronization in thread programming?

Thank you.

Small rough example:

Let's say input_file contains the following:

Monday
Tuesday
Wednesday
Thursday
Friday
Saturday 
Sunday

Treatment script: TrScript.sh

#!/bin/bash
# Process input_file one line at a time, removing each line once it is read.
# Note: nothing here stops two instances from reading the same line.
NbLines=$(wc -l < input_file)
while [ "$NbLines" -ne 0 ]
do
    FirstLine=$(head -n 1 input_file)
    echo "Hello World today is $FirstLine"
    # Keep everything except the first line.
    tail -n +2 input_file > tmp
    mv tmp input_file
    NbLines=$(wc -l < input_file)
done

Main script:

#! /bin/bash 
./TrScript.sh & 
./TrScript.sh & 
./TrScript.sh & 
wait

The result should be:

Hello World today is Monday 
Hello World today is Tuesday 
Hello World today is Wednesday 
Hello World today is Thursday 
Hello World today is Friday 
Hello World today is Saturday 
Hello World today is Sunday
sth
asked Feb 23, 2010 at 15:01

3 Answers


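Use

line=$(flock "$lockfile" -c "(gawk 'NR==1' < $infile ; gawk 'NR>1' < $infile > $infile.tmp ; mv $infile.tmp $infile)")

to access the file you want to read from. flock holds an exclusive lock on $lockfile while the quoted command runs, so only one script at a time can read and shorten the input file. Note that this still relies on file locks.

gawk NR==1 < ...

prints the first line of the input, and gawk 'NR>1' prints the remaining lines.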
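
As a rough illustration of how the flock approach could be folded into the treatment script from the question, here is a minimal sketch. It assumes the util-linux flock command and uses its file-descriptor form; the names pop_line and process_all, the INPUT and LOCK variables, and the default lock path are hypothetical and not part of the original answer. Whether flock is reliable on NFS depends on the setup, so the lock file is assumed to live on a local filesystem.

```shell
#!/bin/bash
# Sketch: atomically remove and print the first line of a shared file.
# Assumes util-linux flock(1); INPUT and LOCK are hypothetical defaults.
INPUT=${INPUT:-input_file}
LOCK=${LOCK:-/tmp/input_file.lock}

pop_line() {
    (
        flock -x 9 || exit 1        # wait until we hold the exclusive lock on fd 9
        [ -s "$INPUT" ] || exit 1   # file is empty: nothing left to hand out
        head -n 1 "$INPUT"          # print the first line on stdout
        # Rewrite the file without its first line while still holding the lock.
        tail -n +2 "$INPUT" > "$INPUT.tmp" && mv "$INPUT.tmp" "$INPUT"
    ) 9>"$LOCK"                     # fd 9 closes when the subshell exits, releasing the lock
}

# Worker loop: each line is claimed by exactly one worker.
process_all() {
    local line
    while line=$(pop_line); do
        echo "Hello World today is $line"
    done
}
```

Calling process_all at the end of TrScript.sh would replace the original while loop. With several workers the lines may be printed in a different order than they appear in the file, but each line should be claimed only once.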

answered Feb 24, 2010 at 16:22

2 Comments

perfect. Seeing you use the flock command hinted for me to do man flock and that was exactly what I needed.
+1 flock, needs more upvotes. Ex: flock -n your_lock_file -c "rsync -rl /media/dir1 /media/dir2"
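
The -n flag used in the comment above makes flock fail immediately instead of waiting when the lock is already held. A small sketch of how a script might tell "lock busy" apart from a failing command follows; the function name try_exclusive, the LOCK default, and the exit code 75 are hypothetical choices for illustration only.

```shell
#!/bin/bash
# Sketch: run a command only if nobody else currently holds the lock.
# LOCK is a hypothetical path; 75 is an arbitrary "busy" exit code.
LOCK=${LOCK:-/tmp/job.lock}

try_exclusive() {
    (
        # -n: do not wait; give up at once if another process holds the lock.
        flock -n 9 || { echo "lock is busy, skipping" >&2; exit 75; }
        "$@"                        # run the protected command while holding the lock
    ) 9>"$LOCK"
}
```

For example, try_exclusive rsync -rl /media/dir1 /media/dir2 would skip the copy when another instance is still running, rather than queueing behind it.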

I have always liked the lockfile program from the procmail set of tools (it should be available for most systems, though it might not be installed by default).

It was designed to lock mail spool files, which are (were?) commonly mounted via NFS, so it does work properly over NFS (as much as anything can).

Also, as long as you are assuming that all your ‘workers’ are on the same machine (which you do by assuming you can check for PIDs, something that may not work properly once PIDs eventually wrap), you could put your lock file in some other, local directory (e.g. /tmp) while processing files hosted on an NFS server. As long as all the workers use the same lock file location (and a one-to-one mapping of lock file names to locked pathnames), it will work fine.
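
A minimal sketch of what guarding a critical section with lockfile might look like, assuming procmail's lockfile is installed and all workers run on the same host. The wrapper name with_lockfile, the LOCK path, and the retry settings are hypothetical and chosen only for illustration.

```shell
#!/bin/bash
# Sketch: guard a critical section with procmail's lockfile(1).
# LOCK is a hypothetical local path shared by all workers on this host.
LOCK=${LOCK:-/tmp/input_file.lock}

with_lockfile() {
    local status
    # -1: sleep 1 second between attempts; -r 30: give up after 30 retries.
    lockfile -1 -r 30 "$LOCK" || return 1
    "$@"                            # run the protected command
    status=$?
    rm -f "$LOCK"                   # the lock file is read-only, so rm -f releases it
    return "$status"
}
```

A worker would then wrap its read-and-truncate step, for example with_lockfile some_command, where some_command stands for whatever touches the shared input file. If a worker is killed while holding the lock, the file stays behind; the -l option of lockfile (lock timeout) is one way to recover from that.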

answered Feb 23, 2010 at 20:31

1 Comment

lockfile seems nice. But the other half of your answer is flawed and may result in collisions. Dennis Williamson's first link seems like a working solution without non-standard commands.

Using the FLOM (Free LOck Manager) tool, your main script can become as simple as:

#!/bin/bash 
flom -- ./TrScript.sh & 
flom -- ./TrScript.sh & 
flom -- ./TrScript.sh & 
wait

if you are running the script on a single host, and something like:

flom -A 224.0.0.1 -- ./TrScript.sh &

if you want to distribute your script on many hosts. Some usage examples are available at this URL: http://sourceforge.net/p/flom/wiki/FLOM%20by%20examples/

answered Apr 24, 2014 at 20:21
