I'm trying to generate a fresh XML file using the Perl module XML::Mini::Document. It works, but I don't know whether this is the right way to do it. Performance is my main problem: the run time grows as the record count increases.
Is there another module that performs better and is easier to use?
#!/usr/bin/perl -w
use warnings;
use XML::Mini::Document;
my $outfile = "D:/Test.xml";
my $nSequence = 1000;
my $sRandom_Name = "";
my $sRandom_Desc = "";
my $newDoc = XML::Mini::Document->new();
my $newDocRoot = $newDoc->getRoot();
my $xmlHeader = $newDocRoot->header('xml');
$xmlHeader->attribute('version', '1.0');
$xmlHeader->attribute('encoding', 'UTF-8');
my $records= $newDocRoot->createChild('records');
for(0..9) {
for(1..6) {
$sRandom_Name = $sRandom_Name.(chr(int(rand(25) + 65)));
}
for(1..15) {
$sRandom_Desc = $sRandom_Desc.(chr(int(rand(25) + 97)));
}
my $record = $records->createChild('record');
$record->createChild('ID')->text($nSequence=$nSequence+1);
$record->createChild('Name')->text($sRandom_Name);
$record->createChild('Desc')->text($sRandom_Desc);
print $newDoc->toFile($outfile);
}
My output should look like this one:
<?xml version="1.0" encoding="UTF-8" ?>
<records>
    <record>
        <ID>1001</ID>
        <Name>ASDSDF</Name>
        <Desc>ASDFsdsdfcwefSC</Desc>
    </record>
    <record>
        <ID>1002</ID>
        <Name>KDFNND</Name>
        <Desc>WEFsdssccwefSC</Desc>
    </record>
    <record>
        <ID>1003</ID>
        <Name>PORJDX</Name>
        <Desc>XceFsdsdfcASmsd</Desc>
    </record>
    . . .
</records>
2 Answers
Yes, that's pretty good. It's essential to include `use strict` at the top of every Perl program you write, and `use warnings` is preferable to `-w` on the command line. You should also avoid capital letters in identifiers for lexical variables, as they are reserved for use in global identifiers such as package names.
I would write something more Perlish, like this:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Mini::Document;
use constant OUTFILE => 'D:/Test.xml';
sub rand_letter { chr(ord('A') + rand(26)) }
my $new_doc = XML::Mini::Document->new;
my $root = $new_doc->getRoot;
my $header = $root->header('xml');
$header->attribute(version => '1.0');
$header->attribute(encoding => 'UTF-8');
my $records= $root->createChild('records');
my $sequence = 1000;
for ( 0..9 ) {
my $record = $records->createChild('record');
my ($random_name, $random_desc);
$random_name .= rand_letter for 1 .. 6;
$random_desc .= rand_letter for 1 .. 15;
$record->createChild(ID => ++$sequence);
$record->createChild(Name => $random_name);
$record->createChild(Desc => $random_desc);
}
open my $fh, '>', OUTFILE or die $!;
select $fh;
print $new_doc->toString;
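One change in the rewrite above is easy to miss: the random strings are declared inside the loop, so every record gets a fresh 6-letter name and 15-letter description. In the question's code, $sRandom_Name and $sRandom_Desc are declared once before the loop and only ever appended to, so each record's values are longer than the previous one's. The following is a minimal sketch of the difference using only core Perl; the names names_accumulating and names_fresh are hypothetical helpers made up for this illustration.

```perl
use strict;
use warnings;

# Return one random uppercase letter, A through Z.
sub rand_letter { chr(ord('A') + int(rand(26))) }

# Variable declared once outside the loop (as in the question):
# each pass appends six more letters to the same string.
sub names_accumulating {
    my ($count) = @_;
    my $name = '';
    my @names;
    for (1 .. $count) {
        $name .= rand_letter() for 1 .. 6;
        push @names, $name;
    }
    return @names;
}

# Variable declared inside the loop (as in the rewrite):
# every name starts empty and ends up six letters long.
sub names_fresh {
    my ($count) = @_;
    my @names;
    for (1 .. $count) {
        my $name = '';
        $name .= rand_letter() for 1 .. 6;
        push @names, $name;
    }
    return @names;
}

print join(',', map { length } names_accumulating(3)), "\n";   # prints 6,12,18
print join(',', map { length } names_fresh(3)), "\n";          # prints 6,6,6
```

The growing strings also make the document larger than intended, which can only add to the slowdown as the record count rises.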
If what you're looking for is speed and low memory usage, then you should look at XML::Writer, which will output the data directly to a file instead of building an in-memory structure.
The program below demonstrates the approach. With the loop bound raised, it will output a million records in a minute or two.
#!/usr/bin/perl
use strict;
use warnings;
use XML::Writer;
sub rand_letter { chr(ord('A') + rand(26)) }
use constant OUTFILE => 'D:/Test.xml';
open my $fh, '>', OUTFILE or die $!;
my $writer = XML::Writer->new(
OUTPUT => $fh,
ENCODING => 'utf-8',
DATA_MODE => 1,
DATA_INDENT => ' ' x 4,
);
$writer->xmlDecl;
$writer->startTag('records');
my $sequence = 1000;
for ( 0..9 ) {
my ($random_name, $random_desc);
$random_name .= rand_letter for 1 .. 6;
$random_desc .= rand_letter for 1 .. 15;
$writer->startTag('record');
$writer->dataElement(ID => ++$sequence);
$writer->dataElement(Name => $random_name);
$writer->dataElement(Desc => $random_desc);
$writer->endTag('record');
}
$writer->endTag('records');
$writer->end;
close $fh or die $!;
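To show why streaming keeps memory usage flat, here is a rough sketch of the same idea with no module at all: each record is printed as soon as it is generated, so nothing accumulates in memory. This is not how XML::Writer is implemented. The names xml_escape and write_records and the $make_record callback are hypothetical, and the hand-rolled escaping only covers element text, so treat it as an illustration rather than a replacement for a proper XML module.

```perl
use strict;
use warnings;

# Escape the characters that are special in XML text content.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Write $count records to the filehandle $fh, one at a time.
# $make_record is a hypothetical callback returning (name, desc).
sub write_records {
    my ($fh, $count, $make_record) = @_;
    print {$fh} qq{<?xml version="1.0" encoding="UTF-8"?>\n<records>\n};
    my $sequence = 1000;
    for (1 .. $count) {
        my ($name, $desc) = $make_record->();
        printf {$fh} "    <record>\n"
                   . "        <ID>%d</ID>\n"
                   . "        <Name>%s</Name>\n"
                   . "        <Desc>%s</Desc>\n"
                   . "    </record>\n",
            ++$sequence, xml_escape($name), xml_escape($desc);
    }
    print {$fh} "</records>\n";
    return;
}

# Usage: write two fixed records to an in-memory filehandle and show them.
open my $fh, '>', \my $xml or die $!;
write_records($fh, 2, sub { ('A&B', 'plain') });
close $fh or die $!;
print $xml;
```

Passing a filehandle opened on a real file instead of the in-memory scalar gives the same output on disk, with only one record held in memory at a time.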
toFile is inside the for loop. That means the file is going to be overwritten ten times each time the program is run.
XML::Twig: stackoverflow.com/questions/29009370/assembling-xml-in-perl
With toFile outside of the loop, it says "Out of memory!" for 1 million records.