1
\$\begingroup\$

I am testing an application that I wrote and want to compare the solution my algorithm produces against a Monte Carlo solution. The process writes to the hard disk a lot, which really slows it down, so I was wondering whether there is an approach that writes far less data to files.

The solutions are computed on the nodes of a cluster and examined using this script (which runs on a node). Parameter $1 is an output file that the program wrote.

file=$1
script=/home/hefke/ov_paper/scripts
mv $file.out $file.out.old
grep "Overlapscore:" $file.monte > $file.grepped
awk '/./{print $2}' $file.grepped > $file.overlap
echo "$script/std_dev.sh $file.overlap > $file.out"
$script/std_dev.sh $file.overlap > $file.out
cat $file.analy >> $file.out
echo "DONE" >> $file.out

Here is the script that collects the data on the main node. The .analy and .monte files are my output files.

echo "Processing outputfiles for the mc_stdev_of_ov"
script=/home/hefke/ov_paper/scripts
curdir=`pwd`
folder=filedata
for file in `ls -1 $curdir/temp_output/$folder/*.analy| sed 's/\(.*\)\..*/\1/'|uniq`
do 
 echo $file
 $script/submitter.sh $curdir "processonefile.sh $file.out"
done
echo "$file.out now contains what stdtev spat out."
cat $curdir/temp_output/$folder/*.out >> $curdir/temp_output/tmp.out 
awk -f keys.awk $curdir/temp_output/tmp.out >> table.out
cat table.out

How can I optimize this procedure for speed?

asked Mar 13, 2012 at 14:42
\$\endgroup\$

2 Answers

1
\$\begingroup\$

You don't need to store intermediate results in files between commands. Instead, feed each command's output straight into the next:

$script/std_dev.sh < <(grep "Overlapscore:" $file.monte | awk '/./{print $2}') > $file.out

The Bash Guide has an excellent article about I/O.
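
As a minimal sketch of the same idea, a plain pipe also avoids the intermediate files and does not depend on Bash's process substitution. It assumes the .monte file contains lines of the form `Overlapscore: <value>`; the sample data and the `summarize` function (a stand-in for `std_dev.sh` that reads numbers on standard input) are made up for illustration, and the real `std_dev.sh` may expect a file name instead.

```shell
#!/bin/sh
# Hypothetical sample input; the real .monte format is assumed, not known.
tmpdir=$(mktemp -d)
file="$tmpdir/sample"
printf 'Overlapscore: 0.5\nother line\nOverlapscore: 1.5\n' > "$file.monte"

# Stand-in for std_dev.sh: reads one number per line on stdin, prints the mean.
summarize() {
    awk '{ sum += $1 } END { if (NR > 0) print sum / NR }'
}

# grep and awk feed summarize directly; no .grepped or .overlap files are written.
result=$(grep "Overlapscore:" "$file.monte" | awk '{ print $2 }' | summarize)
echo "$result"

rm -r "$tmpdir"
```

If `std_dev.sh` only accepts a file argument, passing `/dev/stdin` as that argument may work on systems that provide it.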

There's only one place where you write to tmp.out, and awk can take more than one file, so you can simplify those lines similarly:

awk -f keys.awk $curdir/temp_output/$folder/*.out

There's no need to redirect to table.out and then cat it afterwards.
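
To illustrate that awk processes several input files in one run, here is a small sketch. The contents of `keys.awk` are not shown in the question, so the inline awk program and the two sample .out files below are hypothetical.

```shell
#!/bin/sh
# Hypothetical .out files; the real contents and keys.awk are not shown in the question.
tmpdir=$(mktemp -d)
printf 'score 1\n' > "$tmpdir/a.out"
printf 'score 2\n' > "$tmpdir/b.out"

# awk reads every file named on its command line in order,
# so no concatenated tmp.out is needed. FILENAME names the current input file.
table=$(awk '{ n = split(FILENAME, parts, "/"); print parts[n] ": " $2 }' "$tmpdir"/*.out)
echo "$table"

rm -r "$tmpdir"
```

Note that the original script appends to table.out with `>>`, so dropping the file only gives the same result if earlier runs' output is not needed.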

You shouldn't use ls in scripts; you can simply loop over a glob:

for file in $curdir/temp_output/$folder/*.analy
do
 file="${file%.*}" # Remove extension; the rest of the loop body follows
done
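
A self-contained sketch of the glob loop, assuming a throwaway directory with two made-up .analy files (the file names are hypothetical):

```shell
#!/bin/sh
# Hypothetical directory layout for illustration only.
tmpdir=$(mktemp -d)
touch "$tmpdir/run1.analy" "$tmpdir/run2.analy"

names=""
for file in "$tmpdir"/*.analy
do
    [ -e "$file" ] || continue      # skip the literal pattern if nothing matched
    file="${file%.*}"               # strip the shortest trailing ".something"
    names="$names ${file##*/}"      # keep only the base name for display
done
echo "$names"

rm -r "$tmpdir"
```

Quoting the variable parts of the path keeps the loop working when directory names contain spaces, which the `ls`-based version cannot handle.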
answered Mar 13, 2012 at 15:09
\$\endgroup\$
  • \$\begingroup\$ When I use script/std_dev.sh < <(grep "Overlapscore:" $file.monte | awk '/./{print $2}') > $file.out, it tells me: Missing name for redirect. \$\endgroup\$ Commented Mar 14, 2012 at 13:28
  • \$\begingroup\$ Are you sure you're actually running Bash? \$\endgroup\$ Commented Mar 14, 2012 at 13:50
  • \$\begingroup\$ l0b0 you sir are a genius. as a matter of fact i am not :(. I am running the cshell. Thank you very much for your answer anyways :) \$\endgroup\$ Commented Mar 15, 2012 at 6:44
  • \$\begingroup\$ not relating to the question any more, but is there a way to group commands with the () as in bash in cshell? \$\endgroup\$ Commented Mar 15, 2012 at 8:03
  • \$\begingroup\$ Sorry @tarrasch, csh is one beast I've never had to handle, so I really don't know. Maybe food for a separate question on USE? \$\endgroup\$ Commented Mar 15, 2012 at 8:53
2
\$\begingroup\$

This isn't directly related, but please excuse me for using an "answer" just to comment: it seems I can't comment, perhaps because I don't have enough points yet.

Tarrasch, if you still use csh for your shell, please do not script in it.

Please read: http://www.faqs.org/faqs/unix-faq/shell/csh-whynot/

Use sh, bash (or even ksh) instead. It is better still to stick to plain sh, because that is what all Unix systems rely on (rc scripts, for example, are based on it).

answered Dec 6, 2012 at 16:45
\$\endgroup\$
