I want to parse reports from multiple devices. The reports look like this:
VR            Destination      Mac                Age  Static  VLAN          VID   Port
VR-Default    192.168.11.13    90:e2:ba:3c:95:c0  2    NO      intra1        350   49
VR-Default    192.168.1.1      00:0e:a6:f7:b6:b5  0    NO      main          602   1
VR-Default    192.168.1.2      00:0d:88:63:bf:d1  3    NO      main          602   1
VR-Default    192.168.1.14     00:1c:f0:c7:d2:52  4    NO      main          602   1
etc...
Dynamic Entries : 19 Static Entries : 0
Pending Entries : 1
In Request : 3888802 In Response : 4531
and some more data...
Rx Error : 0 Dup IP Addr : 0.0.0.0
and some more...
I need only vr, destination, mac, age, static, vlan, vid and port fields.
I can parse it using the split function and regexes, but split fails if one field (e.g. Age) is empty.
perldoc says I can use unpack:
my $template = 'A13xA16xA18xA4xA7xA13xA5xA*';
for my $line ( split /\n/, $data ) {
chomp $line;
my ($vr, $destination, $mac, $age, $static, $vlan, $vid, $port) = unpack $template, $line;
...
}
But it dies on lines shorter than 84 characters, so I have to check the string length every time (or maybe wrap unpack in eval? Is that better?). I also still have to use regexes or index to find the end of the main table and skip the header.
The code would look like this:
#!/usr/bin/perl
use strict;
use warnings;
my $arp = <<'ARP';
VR            Destination      Mac                Age  Static  VLAN          VID   Port
VR-Default    192.168.11.13    90:e2:ba:3c:95:c0  2    NO      intra1        350   49
VR-Default    192.168.1.1      00:0e:a6:f7:b6:b5  0    NO      main          602   1
VR-Default    192.168.1.2      00:0d:88:63:bf:d1  3    NO      main          602   1
VR-Default    192.168.1.14     00:1c:f0:c7:d2:52  4    NO      main          602   1
Dynamic Entries : 19 Static Entries : 0
Pending Entries : 1
In Request : 3888802 In Response : 4531
Rx Error : 0 Dup IP Addr : 0.0.0.0
ARP
my $template = 'A13xA16xA18xA4xA7xA13xA5xA*';
for my $line ( split /\n/, $arp ) {
last if index( $line, 'Dynamic E' ) == 0;
next if length $line < 84;
chomp $line;
my ($vr, $destination, $mac, $age, $static, $vlan, $vid, $port) = unpack $template, $line;
next if $mac eq 'Mac';
print "$mac - $destination\n";
}
My question is: how would you parse this data? split, regexes, unpack, substr or something else?
- This question does not deserve to be closed: the second program works. (200_success, Nov 5, 2013 at 14:57)
- Perhaps a CSV module? (mpapec, Nov 5, 2013 at 16:28)
- @mpapec I thought a CSV module would fail when some fields are empty. (Suic, Nov 5, 2013 at 16:44)
1 Answer
If a problem has been encountered before by someone else, chances are that there is a CPAN module for it (DataExtract::FixedWidth).
If you don't want to use a CPAN module, then my next choice would be to use regular expressions.
use strict;
# Strips leading and trailing whitespace from all parameters
sub strip {
for (@_) { s/^\s+//; s/\s+$//; }
@_;
}
# Extracts data from lines of text in tabular format.
#
# First parameter is a regular expression for capturing fixed-width fields.
#
# Subsequent parameters are the lines of tabular data, the first of which holds
# the column headings. Any line that does not match the regular expression,
# as well as subsequent lines, are discarded.
#
# Returns a list (one element per input line) of hashes (keyed by column names).
sub extract_table {
my ($fmt, $first_line) = (shift, shift);
my (@headers) = strip($first_line =~ $fmt);
my @table;
for my $line (@_) {
my (@fields) = $line =~ $fmt;
last unless @fields;
my %data;
@data{@headers} = strip(@fields);
push @table, \%data;
}
return @table;
}
my $fmt = qr/^(.{14})(.{17})(.{19})(.{5})(.{8})(.{14})(.{6})(.*)/;
# Take lines of input from a reasonable source (STDIN or a filename
# argument on the command line)
my @table = extract_table($fmt, <>);
use Data::Dumper;
print Dumper(\@table);
Note that chomp() is unnecessary since we're stripping whitespace characters anyway.
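For comparison with the unpack attempt in the question, here is a minimal sketch of a template that should not die on short lines. It assumes the same column widths as the question's template, but folds each one-character separator into the preceding 'A' field instead of skipping it with 'x'. An 'A' field strips trailing spaces and returns whatever is left when the line runs out, whereas 'x' croaks when it would move past the end of the string. The helper name parse_row and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Same widths as 'A13xA16xA18xA4xA7xA13xA5xA*', with each separator column
# absorbed into the preceding 'A' field. 'A' strips trailing spaces, so the
# extracted values are the same, but a short line no longer raises an error.
my $template = 'A14 A17 A19 A5 A8 A14 A6 A*';

# Hypothetical helper: parse one line into a hash keyed by column name.
sub parse_row {
    my ($line) = @_;
    my %row;
    @row{qw(vr destination mac age static vlan vid port)} = unpack $template, $line;
    return \%row;
}

# A made-up full-width line with an empty Age field.
my $full = sprintf '%-14s%-17s%-19s%-5s%-8s%-14s%-6s%s',
    'VR-Default', '192.168.1.1', '00:0e:a6:f7:b6:b5', '', 'NO', 'main', 602, 1;

# A line much shorter than the table width.
my $short = 'Pending Entries : 1';

my $row = parse_row($full);
print "$row->{mac} - $row->{destination} (age: '$row->{age}')\n";

# No exception here; fields past the end of the line come back empty.
my $partial = parse_row($short);
print "vr: '$partial->{vr}', port: '$partial->{port}'\n";
```

Short lines still produce a (meaningless) row, so the end of the table has to be detected separately, for example with the index check from the question.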
- It might be worth mentioning that strip(@fields); also alters the @fields array. (mpapec, Nov 6, 2013 at 6:22)
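The in-place change happens because the elements of @_ are aliases for the caller's variables, so s/// inside strip edits the originals. If that side effect is unwanted, one possible variant trims copies instead. This is only a sketch; the name stripped is hypothetical and not part of the answer's code.

```perl
use strict;
use warnings;

# Hypothetical non-mutating variant of strip(): copy the arguments first,
# trim the copies, and leave the caller's variables untouched.
sub stripped {
    my @copy = @_;
    for (@copy) { s/^\s+//; s/\s+$//; }
    return @copy;
}

my @fields  = ('VR-Default    ', '  main ');
my @trimmed = stripped(@fields);

print "original: [$fields[0]] trimmed: [$trimmed[0]]\n";
```

In extract_table the aliasing is harmless, since @fields is not used again after the call, so the original strip is fine there.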