Command line tools to format tables

Question 1

I am a big fan of one-liners using sed awk perl and other tools. But, there are things that are hard to do in one-liners, such as when you are working with a CSV file and there are commas between quotes, or when you want to print a centralized field with printf.

A few months ago I wrote ftable more for fun than anything else, but last weekend I took it seriously and created a GitHub repository and a tutorial for it.

ftable tutorial
ftable code

Questions

Do you know of a tool that's similar to ftable? I hate feeling like re-inventing the wheel.

I am not programmer (I am sysadmin/devops). Is there anybody willing to review the code and spot my endless mistakes?

#! /usr/bin/env perl
# Author: Tiago Lopo Da Silva
# Date: 20/10/2013
# Purpose: Print formatted table
use strict;
use warnings;
use POSIX;
use Switch;
use Getopt::Long qw(:config no_ignore_case);
use Data::Dumper;
our $comma="<comma>";
our $dollar="<dollar>";
our $pipe="|";
our $plus="+";
our $minus="-";
our $FS=',';
our $nb=0;
my %h;
if ($#ARGV >= 0){
my $lf; my $cf; my $rf; my $print;
GetOptions( 'l|left:s' => \$lf,
 'r|right:s' => \$rf,
 'c|center:s' => \$cf,
 'p|print:s' => \$print,
 'F:s' => \$FS,
 'n|noborder' => \$nb,
 ) || print_usage();
%h=get_details($lf,$cf,$rf,$print);
}else {
%h=get_details();
}
print_table(\%h);
sub get_quoted_fields {
# this sub finds quoted fields
 my $str1 = $_[0];
 my $qf;
 while ( $str1 =~ /(["'].*?["'])/ ){
 $qf.="1ドル${comma}";
 $str1 =~ s/1ドル//;
 }
 return $qf;
}
sub get_translated {
 my $qf = $_[0];
 my $str= $_[1];
 my @arr;
 my %h;
 if (defined ($qf)) {
 @arr = split(/$comma/,$qf);
 }
 foreach my $i ( @arr ){
 my $tmpvar=$i;
 $i =~ s/$FS/$comma/g;
 $h{$tmpvar} = $i;
 }
 while ( my($key,$value) = each(%h) ){
 $key =~ s/\$/\\\$/g;
 eval "\$str =~ s/$key/$value/g; ";
 }
 
 return $str;
}
sub special_split {
#This sub splits strings but taking quoted fields in consideration
 my $str=$_[0];
 $str =~ s/\$/$dollar/g;
 $str =~ s/\(/<op>/g;
 $str =~ s/\)/<cp>/g;
 $str =~ s/\//<slash>/g;
 my $str1=$str;
 my $qf;
 my @a;
 $qf=get_quoted_fields("$str1");
 my $translated = get_translated($qf,$str);
 
 @a = split (/$FS/,$translated);
 
 foreach my $i ( @a ){
my $safe_fs=$FS;
switch($safe_fs) {
case '\.' {$safe_fs =~ s/\\//g;}
case '\t' {$safe_fs =~ s/\\t/\t/g;}
case '\s' {$safe_fs =~ s/\\s/ /g;}
}
 $i =~ s/$comma/$safe_fs/eg;
 $i =~ s/$dollar/\$/g;
 $i =~ s/<op>/\(/g;
 $i =~ s/<cp>/\)/g;
 $i =~ s/<slash>/\//g;
 $i =~ s/["']//g;
 $i =~ s/\s+/ /g;
 }
 return @a;
}
sub fill_str {
# This sub fills the string with padding chars
my $f_char=$_[0];
my $f_times=$_[1];
my $str;
$str="$f_char"x$f_times;
return $str;
}
sub print_border {
# This sub prints horizontal border
my @length=@{$_[0]};
foreach my $i (@length){
unless (defined($i)){$i=1;}
print "$plus";
my $counter=0;
while ( $counter < ($i+2) ){
print "$minus";
$counter++;
}
}
print "$pipe\n";
}
sub print_left {
#This sub prints fields with proper padding in the left side.
#It takes two args, 1st length of the maximun field and field content.
my $length=$_[0];
my $col=$_[1];
unless (defined($length)){ $length="";}
unless (defined($col)){ $col="";}
my $str="printf ' %-".$length."s ','".$col."';";
eval $str
}
sub print_right {
#This sub prints fields with proper padding in the right side.
#It takes two args, 1st length of the maximun field and field content.
my $length=$_[0];
my $col=$_[1];
unless (defined($length)){ $length="";}
unless (defined($col)){ $col="";}
my $str="printf ' %".$length."s ','".$col."';";
eval $str
}
sub print_center {
#This sub prints fields with proper padding in the both sides.
#It takes two args, 1st length of the maximun field and field content.
my $length=$_[0];
my $col=$_[1];
my $str;
unless (defined($length)){ $length=1}
my $cl=length($col);
my $padding=(($length - $cl)/2);
 my $lp; my $rp;
if ( (($length - $cl) % 2 ) == 0 ){
$lp=$padding; 
$rp=$padding; 
}else{
$lp=ceil($padding); 
$rp=floor($padding); 
}
my $l_str ; my $r_str;
$l_str=fill_str(" ",$lp);
$r_str=fill_str(" ",$rp);
$str="printf ' %".$length."s ','".$l_str.$col.$r_str."';";
eval $str;
}
sub get_details {
# This subs creates a hash containg the whole content of the table, alignment info and
# number of columns/ fields
my @align = get_align($_[0],$_[1],$_[2]);
my @print = get_print($_[3]);
my @content;
my @tmp_arr;
my @tmp_arr2;
my @length;
my $n_col=0;
my $counter=0;
my $p_print;
if(@print){ $p_print=1; }else{ $p_print=0;}
 
while (<>){
@tmp_arr= special_split("$_");
unless( $p_print ){
for ( my $i=0 ; $i <= $#tmp_arr; $i++){
$print[$i]=$i; 
}
}
my $counter2=0;
foreach my $i (@print){
$tmp_arr2[$counter2] = $tmp_arr[$i];
$counter2++
}
$counter2=0;
foreach my $i (@tmp_arr2){
defined($i) && $i =~ s/^\s+//;
defined($i) && $i =~ s/\s+$//;
$content[$counter][$counter2] = $i;
my $li= length($i);
if ( defined( $length[$counter2] ) ){
if( $li > $length[$counter2] ) {
$length[$counter2]=$li;
}
}else{
$length[$counter2]=$li;
} 
$counter2++;
}
if ( $counter2 > $n_col ){ $n_col=$counter2;}
$counter++;
}
my %details= (
content => \@content, # content of the file
length => \@length, # Maximun length of fields
align => \@align, # Alignment
n_col => $n_col, # Maximun number of columns/fields
);
return %details;
}
sub print_table{
my %h = %{$_[0]};
my @content=@{$h{"content"}};
my @length=@{$h{"length"}};
my @align=@{$h{"align"}};
my $n_col=$h{"n_col"};
my $counter=0;
foreach my $line (@content){
$nb || print_border(\@length);
my $str;
my $counter2=0;
for ( my $i=0; $i < $n_col ; $i++ ){
my $col = $content[$counter][$i];
unless (defined($col)) { $col = ""}
$col =~ s/"//g;
$col =~ s/'//g;
my $l=$length[$counter2];
$nb || print "$pipe";
my $left="false"; my $right="false";
my $center="false";
switch ($align[$counter2]){
case "l" { $left="true";}
case "r" { $right="true";}
else { $center="true";}
}
if ( $right eq "true" ){
print_right($l,$col);
}
if ( $left eq "true" ){
print_left($l,$col);
}
if ( $center eq "true" ){
print_center($l,$col);
}
$counter2++;
}
unless ($nb) {print "$pipe\n"}else{print "\n"}
$counter++;
}
$nb || print_border(\@length);
}
sub get_align {
# This sub creates an array with the alignment information
my $lf = $_[0];
my $cf = $_[1];
my $rf = $_[2];
my @align;
defined($lf) && (my @lf = split (/,/,$lf));
defined($cf) && (my @cf = split (/,/,$cf));
defined($rf) && (my @rf = split (/,/,$rf));
foreach my $i (@lf){
$align[$i] = "l"; 
}
foreach my $i (@cf){
$align[$i] = "c"; 
}
foreach my $i (@rf){
$align[$i] = "r"; 
}
shift(@align);
return @align;
}
sub get_print {
# This sub creates an array containing the field numbers to be printed
my $print = $_[0];
my @print;
defined($print) && (my @a = split(/,/,$print));
my $counter=1;
foreach my $i (@a){
$print[$counter] = $i-1;
$counter++;
}
shift(@print);
return @print;
}
sub print_usage {
my $usage = << 'EOF';
Usage: ftable [OPTIONS] [FILE]
Options:
 -l, --left
 List of field numbers (separated by comma) to be left aligned
 -r, --right
 List of field numbers (separated by comma) to be right aligned
 -c, --center
 List of field numbers (separated by comma) to be center aligned
 It is default if no alignmnet provided
 -p, --print
 List of field numbers (separated by comma) to be printed and ordered
 -n, --noborder
 Do not print border
 -F, --field-separator
 Field separator, if no specified "comma" (,) is the default value
Examples:
 ftable -F ':' -p 3,1,6 /etc/passwd
 ftable -l 1 -c 2,3 -r 4 /tmp/table.csv
 ftable -n -F ':' /etc/passwd
EOF
print $usage;
exit 2;
}

Question 2

I understand your desire to use only core Perl modules, but be aware that parsing CSV is hard. There's a lot of potential gotchas, and home-grown solutions pretty much never handle them properly. For example, what happens if there are embedded quotes inside a quoted field? Or a newline inside a field?

I think the tool is potentially useful, but I would strongly urge you to use Text::CSV (or similar). It will be faster, most robust, and more maintainable. And it won't be reinventing that part of the wheel.

Question 3

column(1) is a Unix tool with a similar purpose, but does not draw box borders using ASCII art.

There may not be a standard command to draw tables with box borders, but as always, there's a CPAN module for that. As for parsing the input, there's a CPAN module for that too. By taking advantage of CPAN, you can avoid reinventing-the-wheel and discard most of your code.

Question 4

column does not align the fields, and does not allow you to order the fields neither. I am aware of CPAN but the idea is to keep it easy to install even when older version of Perl is installed. But thanks I appreciate your suggestions.

Question 5

Is there anybody willing to review the code ... ?

Yes, I am willing to do so :)

Layout

Much of the code posted into the question lacks indentation. It looks like there may have been a problem when posting the code. Regardless, perltidy can be used to automatically format the code with consistent indentation.

Do not place multiple statements on the same line:

my $lf; my $cf; my $rf; my $print;

Again, perltidy will split those onto separate lines.

Namespace

It is best to import only what is needed to avoid namespace pollution. Change:

use POSIX;

to:

use POSIX qw(ceil floor);

Unused code

The following is unused and can be deleted:

use Data::Dumper;

Switch

The Switch module is not intended for production code, and it's use is discouraged. Remove:

use Switch;

and replace switch/case with if/else or other standard constructs.

Our

It is more customary to use my instead of our to declare variables in code like this. Change:

our $nb = 0;

to:

my $nb = 0;

foreach

Since foreach is identical to for, you can just shorten it to for everywhere in your code.

Options

It is great that you provide usage help documentation and that you use GetOptions.

However, it is unusual to check $#ARGV before calling GetOptions, and the help information does not clearly cover this case. Typically, GetOptions is called unconditionally.

The code just hangs if the user mistakenly runs the code without providing the input file on the command line. It would be better if the code generated an error if no file was provided, or the usage information should explicitly mention if the file can be read from stdin (if that is the intent).

Consider adding a -h/-help option; it is standard practice.

Another approach is to store options in a hash variable as opposed to individual scalar variables. Refer to this answer.

Documentation

It is also standard practice to add usage help with plain old documentation (POD) in conjunction with the Pod::Usage module.

This gives you manpage-like help with perldoc:

perldoc ftable

Question 6

Since you prefer not to use CPAN modules, have a look at Perl formats. The format function is part of Perl 5. It allows you to specify a template for the table, and then you can fill it in with values.

Christopher Cashell Christopher Cashell 1711 silver badge3 bronze badges · Answer 1 · 2014-08-08 20:35:52Z

I understand your desire to use only core Perl modules, but be aware that parsing CSV is hard. There's a lot of potential gotchas, and home-grown solutions pretty much never handle them properly. For example, what happens if there are embedded quotes inside a quoted field? Or a newline inside a field?

I think the tool is potentially useful, but I would strongly urge you to use Text::CSV (or similar). It will be faster, most robust, and more maintainable. And it won't be reinventing that part of the wheel.

200_success 200_success 146k22 gold badges190 silver badges479 bronze badges · Answer 2 · 2014-05-26 20:48:52Z

6

\$\begingroup\$

column(1) is a Unix tool with a similar purpose, but does not draw box borders using ASCII art.

There may not be a standard command to draw tables with box borders, but as always, there's a CPAN module for that. As for parsing the input, there's a CPAN module for that too. By taking advantage of CPAN, you can avoid reinventing-the-wheel and discard most of your code.

Share

edited Apr 10, 2024 at 17:08

toolic's user avatar

toolic

14.8k5 gold badges29 silver badges205 bronze badges

answered May 26, 2014 at 20:48

200_success's user avatar

200_success 200_success

146k22 gold badges190 silver badges479 bronze badges

\$\endgroup\$

1

\$\begingroup\$ column does not align the fields, and does not allow you to order the fields neither. I am aware of CPAN but the idea is to keep it easy to install even when older version of Perl is installed. But thanks I appreciate your suggestions. \$\endgroup\$

Tiago
– Tiago

2014年05月26日 21:36:54 +00:00
Commented May 26, 2014 at 21:36

Add a comment |

toolic toolic 14.8k5 gold badges29 silver badges205 bronze badges · Answer 3 · 2024-04-10 17:11:47Z

Is there anybody willing to review the code ... ?

Yes, I am willing to do so :)

Layout

Much of the code posted into the question lacks indentation. It looks like there may have been a problem when posting the code. Regardless, perltidy can be used to automatically format the code with consistent indentation.

Do not place multiple statements on the same line:

my $lf; my $cf; my $rf; my $print;

Again, perltidy will split those onto separate lines.

Namespace

It is best to import only what is needed to avoid namespace pollution. Change:

use POSIX;

to:

use POSIX qw(ceil floor);

Unused code

The following is unused and can be deleted:

use Data::Dumper;

Switch

The Switch module is not intended for production code, and it's use is discouraged. Remove:

use Switch;

and replace switch/case with if/else or other standard constructs.

Our

It is more customary to use my instead of our to declare variables in code like this. Change:

our $nb = 0;

to:

my $nb = 0;

foreach

Since foreach is identical to for, you can just shorten it to for everywhere in your code.

Options

It is great that you provide usage help documentation and that you use GetOptions.

However, it is unusual to check $#ARGV before calling GetOptions, and the help information does not clearly cover this case. Typically, GetOptions is called unconditionally.

The code just hangs if the user mistakenly runs the code without providing the input file on the command line. It would be better if the code generated an error if no file was provided, or the usage information should explicitly mention if the file can be read from stdin (if that is the intent).

Consider adding a -h/-help option; it is standard practice.

Another approach is to store options in a hash variable as opposed to individual scalar variables. Refer to this answer.

Documentation

It is also standard practice to add usage help with plain old documentation (POD) in conjunction with the Pod::Usage module.

This gives you manpage-like help with perldoc:

perldoc ftable

stevenl stevenl 1854 bronze badges · Answer 4 · 2014-05-27 09:10:02Z

Since you prefer not to use CPAN modules, have a look at Perl formats. The format function is part of Perl 5. It allows you to specify a template for the table, and then you can fill it in with values.

Stack Exchange Network

Command line tools to format tables

4 Answers 4

Layout

Namespace

Unused code

Switch

Our

foreach

Options

Documentation

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Command line tools to format tables

4 Answers 4

Layout

Namespace

Unused code

Switch

Our

foreach

Options

Documentation

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions