[uf-discuss] Re: Perl microformat parsing

Tatsuhiko Miyagawa miyagawa at gmail.com
Sat Feb 23 02:46:40 PST 2008


On 2/22/08, Takatsugu Shigeta <takatsugu.shigeta at gmail.com> wrote:
> my $url = 'http://diveintomark.org/projects/greasemonkey/hcard/tests/2-4-2-vcard.xhtml';
> my $fn = scraper {
> process '.vcard .fn', 'fn[]' => 'TEXT';
> process '.vcard .tel', 'tel[]' => 'TEXT';
> process '.vcard .title', 'title[]' => 'TEXT';
> result 'fn', 'tel', 'title';
> }->scrape(URI->new($url));

For better nested output, you can nest one scraper inside another:

use strict;
use Web::Scraper;
use URI;
use Data::Dumper;

my $uri = URI->new("http://diveintomark.org/projects/greasemonkey/hcard/tests/2-4-2-vcard.xhtml");

# The outer scraper collects every .vcard element; the inner scraper
# pulls the individual fields out of each one.
my $scraper = scraper {
    process ".vcard", "vcards[]" => scraper {
        process ".email", email    => '@href';
        process ".fn",    fullname => "TEXT";
        process ".tel",   tel      => "TEXT";
        process ".title", title    => "TEXT";
    };
};

my $result = $scraper->scrape($uri);
print Dumper($result);

__END__
$VAR1 = {
    'vcards' => [
        {
            'email' => bless( do{\(my $o = 'mailto:jfriday at host.com')}, 'URI::mailto' ),
            'tel' => '+1-919-555-7878',
            'fullname' => 'Joe Friday',
            'title' => 'Area Administrator, Assistant'
        },
    ]
};

Well, you get this vcard twice because it has a nested .vcard, but I guess
that's fine :)
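
If the duplicate does bother you, one option is to filter the result after scraping. This is only a rough sketch: unique_vcards is a made-up helper, not part of Web::Scraper, and it assumes the outer and nested .vcard produce identical field values. The sample data is copied from the dump above (email left out for brevity) rather than fetched from the test page.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Web::Scraper): keep the first
# occurrence of each vcard, comparing all scraped fields.
sub unique_vcards {
    my ($vcards) = @_;
    my %seen;
    return [
        grep {
            my $card = $_;
            # Build a comparison key from the sorted field names and values.
            # Interpolating the value also stringifies URI objects.
            my $key = join "\0",
                map { defined $card->{$_} ? "$_=$card->{$_}" : "$_=" }
                sort keys %$card;
            !$seen{$key}++;
        } @$vcards
    ];
}

# Stand-in for the structure returned by $scraper->scrape($uri),
# assuming the same vcard came back twice.
my $result = {
    vcards => [
        {
            fullname => 'Joe Friday',
            tel      => '+1-919-555-7878',
            title    => 'Area Administrator, Assistant',
        },
        {
            fullname => 'Joe Friday',
            tel      => '+1-919-555-7878',
            title    => 'Area Administrator, Assistant',
        },
    ],
};

my $unique = unique_vcards($result->{vcards});
printf "%d vcard(s) after de-duplication\n", scalar @$unique;
```

If the two entries turn out to differ in some field, the key would need to be narrowed to the fields you actually care about.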
Thanks,
-- 
Tatsuhiko Miyagawa

