add a new column with sum of columns in perl

Question 1

I have a huge text file with the following columns:

col1 col2 Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
abc dec 10 20 30 40 50 60 70 80 90 11 12 13

The output I am looking for adds all the month columns together into a new column, FullYear:

col1 col2 Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec FullYear
abc dec 10 20 30 40 50 60 70 80 90 11 12 13 486

I tried using awk; however, my data contains numbers with very high precision, and the command below gives the wrong output:

awk -F ' ' '{print $1" "$2" "$3" "$4" "$5" "$6" "$7" "$8" "$9" "$10" "$11" "$12" "$13" "$14" "$3+$4+$5+$6+$7+$8+$9+$10+$11+$12+$13+$14}' inputfile.txt > outputfile.txt

I need to write a Perl script to get this done.

Question 2

Define "wrong". How is the awk script not working as expected?

Question 3

The actual numbers in the file have very high precision: 13438.40828455529 14782.24911301082 14782.24911301082 14782.24911301082 14782.24911301082 14782.24911301082 14782.24911301082 14782.24911301082 14782.24911301082 14782.24911301082 14782.24911301082 14782.24911301082. When these numbers are added, the sum is printed in an unusual format.

Question 4

That's not a problem; use printf to specify the format in which you want the results to be printed. It's also helpful in your example input to use data which demonstrates the problem you're having. By default it's probably using exponential notation; if you want fixed-point notation you can do something like printf( "%5.10f", $6 ) to get a minimum field width of five characters and ten digits after the decimal point.
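As a rough illustration of the point in this comment, the sketch below adds two of the values that appear later in the thread and prints the result both ways. It assumes a POSIX-style awk, where `print` formats non-integer numbers using the default OFMT of "%.6g"; the shell variable names are made up for the example.

```shell
# Sum two sample values, then print the result with plain print (default OFMT)
# and with printf using a fixed-point format.
default_out=$(awk 'BEGIN { x = 2298846.53 + 2328664.3; print x }')
fixed_out=$(awk 'BEGIN { x = 2298846.53 + 2328664.3; printf "%.2f\n", x }')

echo "print:  $default_out"   # six significant digits, exponential notation
echo "printf: $fixed_out"     # fixed-point, two digits after the decimal point
```

The sum itself is the same in both cases; only the output conversion differs.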

Question 5

I want to add a new column to the existing file. I tried this, but it doesn't work:

awk -F ' ' '{printf "%s %s %d %d %d %d %d %d %d %d %d %d %10.10f" , $1" "$2" "$3" "$4" "$5" "$6" "$7" "$8" "$9" "$10" "$11" "$12" "$13" "$14" "$3+$4+$5+$6+$7+$8+$9+$10+$11+$12+$13+$14}' inputfile.txt > outputfile.txt

Question 6

Again, define "doesn't work". How does it not function as expected or intended? Please answer this question not by adding an additional comment, but by editing your question to include the relevant information.

Question 7

This is fairly easy to do in Perl, even as a one-liner:

perl -MList::Util=sum -anE 'if (1 == $.) { say join(q{ }, @F, q{FullYear}) } else { say join(q{ }, @F, sum(@F[2..13])) }' «YOUR-FILE»

Explanation:

-MList::Util=sum loads the List::Util module and imports the sum function. This is the same as use List::Util qw(sum).

-n tells Perl to process the input file line-by-line, running the script for each line. (Actually redundant, as the next option implicitly turns this on). -a turns on autosplit mode, so we get an array @F with one entry per field. -E means we're going to provide a script as a command-line argument, using current Perl features (for "say" in this case).

Full details for those options can be found in the perlrun manpage/podfile.

Then, here is the script, with spacing added, and comments explaining:

if (1 == $.) { # $. is the line number. Line 1 is the header line.
    say join(' ', @F, q{FullYear}); # print out the header + FullYear
}
else {
    # print out rows + sum of columns 2..13. Remember Perl counts from 0 in arrays,
    # so column 2 is the 3rd column (the number for January).
    say join(' ', @F, sum(@F[2..13]));
}

BTW: You can ask Perl to help understand one-liners (at least ones you trust — this is not safe with untrusted scripts) with -MO=Deparse, which gives output like this:

command:

perl -MO=Deparse -MList::Util=sum -anE 'if (1 == $.) { say join(q{ }, @F, q{FullYear}) } else { say join(q{ }, @F, sum(@F[2..13])) }' t-file

output:

use List::Util (split(/,/, 'sum', 0));
use feature 'current_sub', 'bitwise', 'evalbytes', 'fc', 'postderef_qq', 'say', 'state', 'switch', 'unicode_strings', 'unicode_eval';
LINE: while (defined($_ = readline ARGV)) {
    our @F = split(' ', $_, 0);
    if (1 == $.) {
        say join(' ', @F, 'FullYear');
    }
    else {
        say join(' ', @F, &sum(@F[2..13]));
    }
}
-e syntax OK

So you can see the List::Util load, the -n going line-by-line, and -a adding the split.

Question 8

This works if my numbers have up to 5 digits. However, it gives incorrect sums if my numbers have 7 to 8 digits.

Question 9

Would Math::BigFloat do for your "huge precision"?

perl -MMath::BigFloat -ape 'my $s=0; $s += new Math::BigFloat($_) for @F[2..$#F]; s/$/ $s/'
abc dec 7.5 8.5
abc dec 7.5 8.5 16

You could also use List::Util::sum with Math::BigFloat; it's quite pointless, though:

perl -MMath::BigFloat -MList::Util=sum -ape 's/$/" ".sum map new Math::BigFloat($_), @F[2..$#F]/e'

Question 10

I tried this approach. However, the result I am getting is NaN. Not sure if I am missing something here.

perl -MMath::BigFloat -ape 'my $s=0; $s += new Math::BigFloat($_) for @F[4..15]; s/$/ $s/' input

Question 11

That means that at least one of the fields from the 5th to the 16th is not a number. Note that in Perl, array indexes start from 0, as in C, not from 1 as in awk or Fortran.

Question 12

@Parix In your awk script, you have $3+...+$14; that should translate to @F[2..13] in Perl, not to @F[4..15].

Question 13

It's not Perl, but this seems to get the job done:

awk 'NR==1 {$(NF+1) = "FullYear"; print} NR>1 {subtotal=0; for(f=3; f<=NF; f++) {subtotal+=$f}; $(NF+1)=subtotal; printf( "%s %s %5.10f %5.10f %5.10f %5.10f %5.10f %5.10f %5.10f %5.10f %5.10f %5.10f %5.10f %5.10f %5.10f\n", $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15 ) }' inputfile

Question 14

Just a variant of @derobert's answer:

perl -MList::Util=sum -nlE 'say "$_ ", sum((split)[2..13])||"FullYear"' input

or using -a

perl -MList::Util=sum -nalE 'say "$_ ", sum(@F[2..13])||"FullYear"' input

Question 15

I tried this approach too. It works if my numbers have up to 5 digits. However, it gives incorrect sums if my numbers have 7 to 8 digits.

Question 16

@Parix, could you please show me an example of such a situation?

Question 17

Output for 2 records is as below:

Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec FullYear
col11 col12 2298846.53 2328664.3 2326527.39 2385298.77 2400046.08 2404192.36 2394351.11 2415755.8 2383387.25 2410001.65 2388574.37 2387894.37 26135645.61
col21 col22 13438.40828 14782.24911 14782.24911 14782.24911 14782.24911 14782.24911 14782.24911 14782.24911 14782.24911 14782.24911 14782.24911 14782.24911 176043.1485

The #1 record gives an incorrect sum. The #2 record gives the correct sum.

Question 18

@Parix, with both variants, on my machine I get "28523539.98", not "26135645.61".

Question 19

Hi, it seems my leading data columns are causing the issue. The calculation works when a leading value is a single word. However, it does not include the last column when a leading value is more than a single word. My entire file is tab-delimited, with text values enclosed in double quotes (").

derobert · Accepted Answer · 2019-01-23 21:45:31Z

