How to add column in the beginning of file using perl?

Question 1

I want a Perl one-liner that checks whether the first field of an input file is the file's name and, if it isn't, adds the file name as the first column on every line.

Example written in shell :

for f in *file*.csv;
do 
 file_column=`cat ${f} | awk -F',' '{print $1}'`
 if [ $file_column != ${f} ]
 then
 sed -i "s/^/$f,/" $f 2>/dev/null;
 fi 
done

But the approach above, which checks whether the file name is present in the first column and adds it if it isn't, takes about 3 hours for 4 lakh (400,000) files. I understand that Perl is faster for file operations.

The Perl command I tried:

perl -p -i -e 's/^/Welcome to Hell,/' file*.csv

Please help me add the logic to check whether the field exists already and only change if it doesn't.

Input : file1.csv 
col1,col2,col3 
data1,data2,data3 
Output: file1.csv 
file1.csv,col1,col2,col3 
file1.csv,data1,data2,data3

Or, if there is any faster way, please suggest it. I'd like a Perl one-liner because this is part of another shell script, so a tiny call would be better, I guess (suggestions welcome).

Question 2

Can you give some sample input/output? Also: Why is a one liner desirable?

Question 3

I would offer - embedding perl into another script isn't as useful as just writing a script in perl.

Question 4

@Sobrique, I would be happy if you could offer a Perl script for the above problem.

Question 5

Here's your perl one-liner: it works with multiple file arguments

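perl -i -pe '/^\Q$ARGV\E,/ or print "$ARGV,"' file1 file2 ...

$ARGV is the magic variable that holds the filename of the current file. The \Q...\E around it makes the name match literally, so the dot in a name like file1.csv is not treated as a regex wildcard.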
See http://perldoc.perl.org/perlvar.html#Variables-related-to-filehandles

The field separator (comma) is hardcoded. You can decide if that's a problem.

Small performance improvement:

perl -i -pe 'index($_, "$ARGV,") == 0 or print "$ARGV,"' file1 file2 ...

Question 6

Before talking about Perl's speed, try speeding up your own script:

for f in *file*.csv;
do 
 sed -i "/^$f,/! s/^/$f,/" "$f"
done

Question 7

It removed everything except the first line of the file. I need the file name as the first column on every line, along with the data.

Question 8

@WilliamR I have already edited it.

Question 9

Thanks a lot, @Costas. It took 20 seconds for 30K files; let me check with 4 lakh files.

Question 10

It's taking a long time to complete for 4 lakh files. For smaller numbers, like 30K files, it takes about 5 to 10 seconds. Do you have any suggestions?

Question 11

While you can actually do this with Perl, the syntax is not the simplest (or at least, it isn't with the best I can come up with). It will probably be both simpler and faster to use other tools. For example,

sed

gawk (relatively recent versions)

for f in file*csv; do 
 awk -i inplace -F, '{
 if($1==FILENAME){print} else{print FILENAME","$0}
 }' "$f"; 
done

Question 12

OK, the problem with a 'perl one liner' as you note:

perl -p -i -e 's/^/Welcome to Hell,/' file*.csv

This applies a transform to the file right enough, but perl 'handles' opening the file(s) and streaming them through the <> operator automagically. Which means the file name never appears in your code; it is only available through the special variable $ARGV.

The in-place edit option (-i) is a convenience, but it becomes rather more difficult to use effectively, since you're potentially opening a file for reading and writing concurrently.

Anyway, I'd approach your problem like this:

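#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;

# eol => "\n" makes print() terminate each output row with a newline.
my $csv = Text::CSV->new( { binary => 1, eol => "\n" } );

foreach my $filename ( glob("*.csv") ) {
    # glob() already returns names ending in .csv, so use them as they are.
    open( my $input, "<", $filename ) or do { warn "$filename: $!"; next };
    open( my $output, ">", "new.$filename" ) or do { warn "new.$filename: $!"; next };
    while ( my $row = $csv->getline($input) ) {
        # Prepend the file name unless it is already the first field.
        unless ( defined $row->[0] and $row->[0] eq $filename ) {
            unshift( @{$row}, $filename );
        }
        $csv->print( $output, $row );
    }
    close $output or warn "new.$filename: $!";
}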

It uses the Text::CSV module, because actually CSV is often more complicated than just "split on comma" (think multi-line fields, and commas in text).
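
To make the point about "split on comma" concrete, here is a small illustration (the sample record and the variable names are invented for the example): the line holds two CSV fields, but a plain comma split counts three.

```shell
# One record with two CSV fields; the first contains a quoted comma.
record='"Smith, John",42'

# Splitting on every comma ignores the quotes and sees three pieces.
naive_count=$(printf '%s\n' "$record" | awk -F, '{ print NF }')
echo "naive split sees $naive_count fields"
```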

Question 13

Can't manage a one liner, but here's a perl script. Put it in a file and make it executable. Then give it the *.csv filenames as args. It creates *.new files. If you are confident it works, uncomment the rename command at the end.

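#!/usr/bin/perl
use strict;
use warnings;
foreach my $file (@ARGV) {
    open( my $in, '<', $file ) or die "$file: $!";
    my $first = <$in>;
    # Skip empty files and files whose first line already starts with the name.
    # \Q...\E makes the file name match literally (the "." is not a wildcard).
    next if !defined($first) or $first =~ /^\Q$file\E,/;
    open( my $out, '>', "$file.new" ) or die "$file.new: $!";
    my $add = "$file,";
    print $out $add, $first;
    while (<$in>) {
        print $out $add, $_;
    }
    close $out or die "$file.new: $!";
    close $in;
    #rename("$file.new", $file);
}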

score 3 · Accepted Answer · 2015-06-24 14:20:24Z

Here's your perl one-liner: it works with multiple file arguments

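perl -i -pe '/^\Q$ARGV\E,/ or print "$ARGV,"' file1 file2 ...

$ARGV is the magic variable that holds the filename of the current file. The \Q...\E around it makes the name match literally, so the dot in a name like file1.csv is not treated as a regex wildcard.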
See http://perldoc.perl.org/perlvar.html#Variables-related-to-filehandles

The field separator (comma) is hardcoded. You can decide if that's a problem.

Small performance improvement:

perl -i -pe 'index($_, "$ARGV,") == 0 or print "$ARGV,"' file1 file2 ...
