\$\begingroup\$

I want to parse reports from multiple devices. The reports look like this:

VR            Destination      Mac                Age  Static  VLAN          VID   Port
VR-Default    192.168.11.13    90:e2:ba:3c:95:c0  2    NO      intra1        350   49
VR-Default    192.168.1.1      00:0e:a6:f7:b6:b5  0    NO      main          602   1
VR-Default    192.168.1.2      00:0d:88:63:bf:d1  3    NO      main          602   1
VR-Default    192.168.1.14     00:1c:f0:c7:d2:52  4    NO      main          602   1
etc...
Dynamic Entries : 19 Static Entries : 0
Pending Entries : 1
In Request : 3888802 In Response : 4531
and some more data...
Rx Error : 0 Dup IP Addr : 0.0.0.0
and some more...

I only need the VR, Destination, Mac, Age, Static, VLAN, VID and Port fields. I can parse the lines with split and regexes, but split fails if a field (e.g. Age) is empty. perldoc says I can use unpack:

my $template = 'A13xA16xA18xA4xA7xA13xA5xA*';
for my $line ( split /\n/, $data ) {
    chomp $line;
    my ($vr, $destination, $mac, $age, $static, $vlan, $vid, $port) = unpack $template, $line;
    ...
}

But it dies on lines shorter than 84 characters, so I have to check the string length every time (or maybe wrap unpack in eval; would that be better?). And I still have to use regexes or index to find the end of the main table and to skip the header. The code would look like this:

#!/usr/bin/perl
use strict;
use warnings;

my $arp = <<'ARP';
VR            Destination      Mac                Age  Static  VLAN          VID   Port
VR-Default    192.168.11.13    90:e2:ba:3c:95:c0  2    NO      intra1        350   49
VR-Default    192.168.1.1      00:0e:a6:f7:b6:b5  0    NO      main          602   1
VR-Default    192.168.1.2      00:0d:88:63:bf:d1  3    NO      main          602   1
VR-Default    192.168.1.14     00:1c:f0:c7:d2:52  4    NO      main          602   1
Dynamic Entries : 19 Static Entries : 0
Pending Entries : 1
In Request : 3888802 In Response : 4531
Rx Error : 0 Dup IP Addr : 0.0.0.0
ARP

# Each 'A<n>' reads a field and each 'x' skips the one-character separator after it.
my $template = 'A13xA16xA18xA4xA7xA13xA5xA*';
for my $line ( split /\n/, $arp ) {    # split already removes the newlines, so no chomp
    last if index( $line, 'Dynamic E' ) == 0;    # end of the main table
    next if length $line < 84;                   # too short for the template
    my ($vr, $destination, $mac, $age, $static, $vlan, $vid, $port) = unpack $template, $line;
    next if $mac eq 'Mac';                       # skip the header row
    print "$mac - $destination\n";
}

My question is: how would you parse this data? split, regexes, unpack, substr or something else?
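To make the trade-offs above concrete, here is a minimal sketch of the three behaviours discussed: split losing an empty field, the 'x'-based template dying on a short line, and a template without 'x' that tolerates short input. The sample row is built with sprintf using the column widths implied by the template (14, 17, 19, 5, 8, 14, 6), and the variable names are only illustrative.

```perl
use strict;
use warnings;

# Hypothetical sample row with an empty Age column, padded to the widths
# implied by the question's template.
my $row = sprintf '%-14s%-17s%-19s%-5s%-8s%-14s%-6s%s',
    'VR-Default', '192.168.1.1', '00:0e:a6:f7:b6:b5', '', 'NO', 'main', 602, 1;

# 1. split on whitespace: the empty column vanishes and later fields shift left.
my @by_split = split ' ', $row;
print scalar(@by_split), " fields from split, 4th is '$by_split[3]'\n";

# 2. The 'x' directive is fatal once it would step past the end of the string,
#    so a short line needs either a length check or an eval around unpack.
my $strict_template = 'A13xA16xA18xA4xA7xA13xA5xA*';
my @short = eval { unpack $strict_template, 'Pending Entries : 1' };
print "strict template failed: $@" if $@;

# 3. Folding each separator into the preceding field avoids 'x' altogether.
#    'A' strips trailing whitespace and just takes what is left on short input.
my $lenient_template = 'A14 A17 A19 A5 A8 A14 A6 A*';
my @fields = unpack $lenient_template, $row;
print "age is '$fields[3]', static is '$fields[4]'\n";
```

With the lenient template the length check is no longer needed to avoid a crash, although something still has to decide where the table ends. Note that 'A' does not strip leading whitespace, so right-aligned values would keep their padding.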

rolfl
asked Nov 5, 2013 at 14:03
\$\endgroup\$
  • This question does not deserve to be closed; the second program works. Commented Nov 5, 2013 at 14:57
  • Perhaps a CSV module? Commented Nov 5, 2013 at 16:28
  • @mpapec I thought a CSV module would fail when some fields are empty. Commented Nov 5, 2013 at 16:44

1 Answer

\$\begingroup\$

If a problem has been encountered before by someone else, chances are that there is a CPAN module for that (DataExtract::FixedWidth).

If you don't want to use a CPAN module, then my next choice would be to use regular expressions.

use strict;
use warnings;

# Strips leading and trailing whitespace from all parameters.
# Note: @_ aliases the caller's variables, so the arguments are modified in place.
sub strip {
    for (@_) { s/^\s+//; s/\s+$//; }
    @_;
}

# Extracts data from lines of text in tabular format.
#
# First parameter is a regular expression for capturing fixed-width fields.
#
# Subsequent parameters are the lines of tabular data, the first of which holds
# the column headings. Any line that does not match the regular expression,
# as well as subsequent lines, are discarded.
#
# Returns a list (one element per input line) of hashes (keyed by column names).
sub extract_table {
    my ($fmt, $first_line) = (shift, shift);
    my (@headers) = strip($first_line =~ $fmt);
    my @table;
    for my $line (@_) {
        my (@fields) = $line =~ $fmt;
        last unless @fields;
        my %data;
        @data{@headers} = strip(@fields);
        push @table, \%data;
    }
    return @table;
}

my $fmt = qr/^(.{14})(.{17})(.{19})(.{5})(.{8})(.{14})(.{6})(.*)/;

# Take lines of input from a reasonable source (STDIN or a filename
# argument on the command line)
my @table = extract_table($fmt, <>);

use Data::Dumper;
print Dumper(\@table);

Note that chomp() is unnecessary since we're stripping whitespace characters anyway.
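Since the first line holds the column headings, the field widths could also be derived from it instead of being hard-coded. The following is only a sketch under several assumptions: every column starts exactly where its heading starts, the first heading begins at offset 0, headings contain no spaces, lines arrive without trailing newlines, and the table ends at the first line too short to reach the last column. The helper names column_starts, template_for and parse_table are hypothetical.

```perl
use strict;
use warnings;

# Offsets at which each heading begins (assumes headings contain no spaces).
sub column_starts {
    my ($header) = @_;
    my @starts;
    push @starts, $-[0] while $header =~ /\S+/g;
    return @starts;
}

# unpack template: one 'A<width>' per column, 'A*' for the last one.
sub template_for {
    my @starts = @_;
    my @widths = map { $starts[$_] - $starts[$_ - 1] } 1 .. $#starts;
    return join ' ', (map { "A$_" } @widths), 'A*';
}

# Returns a list of hash references keyed by heading.
sub parse_table {
    my ($header, @lines) = @_;
    my @names    = split ' ', $header;
    my @starts   = column_starts($header);
    my $template = template_for(@starts);
    my @rows;
    for my $line (@lines) {
        # Assumption: a line that cannot reach the last column ends the table.
        last if length($line) <= $starts[-1];
        my %row;
        @row{@names} = unpack $template, $line;
        push @rows, \%row;
    }
    return @rows;
}

# Sample data padded with sprintf so the spacing is unambiguous.
my $pad    = '%-14s%-17s%-19s%-5s%-8s%-14s%-6s%s';
my $header = sprintf $pad, qw(VR Destination Mac Age Static VLAN VID Port);
my @lines  = (
    sprintf($pad, 'VR-Default', '192.168.11.13', '90:e2:ba:3c:95:c0', 2,  'NO', 'intra1', 350, 49),
    sprintf($pad, 'VR-Default', '192.168.1.1',   '00:0e:a6:f7:b6:b5', '', 'NO', 'main',   602, 1),
    'Dynamic Entries : 19 Static Entries : 0',
);
my @rows = parse_table($header, @lines);
print "$_->{Mac} - $_->{Destination}\n" for @rows;
```

A row whose last column is empty and whose trailing spaces were trimmed would end the table early under this rule, so an explicit marker such as the 'Dynamic E' prefix used in the question may be the safer stop condition for real device output.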

answered Nov 6, 2013 at 2:09
\$\endgroup\$
  • It might be worth mentioning that strip(@fields) also alters the @fields array. Commented Nov 6, 2013 at 6:22
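The side effect mentioned in this comment comes from @_ aliasing the caller's variables. A small sketch of the difference follows; strip has the same body as in the answer, while stripped is a hypothetical non-mutating variant added purely for comparison.

```perl
use strict;
use warnings;

# Same body as the answer's strip(): edits its arguments in place through @_.
sub strip {
    for (@_) { s/^\s+//; s/\s+$//; }
    @_;
}

# Hypothetical variant that trims copies and leaves the caller's data alone.
sub stripped {
    my @copy = @_;
    for (@copy) { s/^\s+//; s/\s+$//; }
    return @copy;
}

my @fields = ('  main  ', ' 602 ');

my @clean = stripped(@fields);
print "after stripped(): [$fields[0]] -> [$clean[0]]\n";

strip(@fields);
print "after strip():    [$fields[0]]\n";
```

In the answer's extract_table the mutation is harmless because @fields is discarded at the end of each loop iteration, but it matters if the raw, padded values are needed afterwards.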
