I used Nokogiri and a piece of ActiveSupport to parse an xml file from a given URL, format the data properly and return a JSON string. The script works as expected, so I'm only wondering if there are ways to make the code better from architecture/naming perspectives.
#!/usr/bin/env ruby
$:.unshift File.dirname(__FILE__)
require 'open-uri'
require 'nokogiri'
require 'active_support/core_ext'
class Scadenzario
def initialize(url)
@url = url
@events = parse_events
end
def xml
Nokogiri::XML::Document.parse(open(@url))
end
def to_json
@events.to_json
end
private
def parse_events
result = []
xml.xpath('//item').each do |node|
result << build_event(node)
end
result
end
def build_event(node)
event = Hash.new
event[:title] = node.at_xpath('title').content
event[:url] = node.at_xpath('link').content
event[:start] = format_event_time(node)
event[:allDay] = false
event
end
def format_event_time(node)
Time.parse(node.at_xpath('pubDate').content).iso8601
end
end
puts Scadenzario.new('servizi.seac.it/documenti/rss/rss_scadenzario_annuale.xml').to_json
The parse_events
method can be written like this, but I think it would make the code rather unclear.
def parse_events
xml.xpath('//item').inject([]) do |result, node|
result << build_event(node)
end
end
2 Answers 2
It all seems really nice, I would change few things:
Firstly, your format_event_time
takes a node object as a param, which is counter-intuitive as I would expect it to accept Time object. I would rename it to 'parse_xml_date` or similar and give it text instead:
def parse_xml_time(time_string)
Time.parse(time_string).iso8601
end
The main purpose of this change is to make your build_event
method the only method specifying the exact xml locations of the required properties, so it is easy to find if those changed.
Secondly,your build_event
method can be slightly simplified to (including first change):
def build_event(node)
{
title: node.at_xpath('title').content
url: node.at_xpath('link').content
start: parse_xml_time(node.at_xpath('pubDate').content)
allDay: false
}
end
And finally I would add a small private method get_param
:
def get_xml_param(attribute, node)
node.at_xpath(attribute).content
end
Which would simplify build_event
to:
def build_event(node)
{
title: get_xml_param('title', node)
url: get_xml_param('url', node)
start: parse_xml_time(get_xml_param('pubTime', node))
allDay: false
}
end
UPDATE:
Actually there might be few more changes.
Your parse_events method can be written like this:
def parse_events
xml.xpath('//item').map {|node| build_event(node) }
end
And I would consider caching an xml method:
def xml
@xml ||= Nokogiri::XML::Document.parse(open(@url))
end
-
\$\begingroup\$ I agree, overall your code looks good though. \$\endgroup\$Mike H-R– Mike H-R2014年05月12日 12:02:36 +00:00Commented May 12, 2014 at 12:02
-
\$\begingroup\$ Thanks for the suggestions, I really like the xml nodes limitations within build the
build_event
by refactoring theparse_xml_time
. \$\endgroup\$Adam– Adam2014年05月12日 12:15:24 +00:00Commented May 12, 2014 at 12:15 -
\$\begingroup\$ Cool! I've added the
parse_events
with#inject
above, but#map
looks much simpler. \$\endgroup\$Adam– Adam2014年05月12日 12:26:45 +00:00Commented May 12, 2014 at 12:26
@BroiStatse gave a very nice answer. I would like to give my observation about his two last points, and take them even further:
When delegating a map
to a method, you can use a shortened syntax using method
:
def parse_events
xml.xpath('//item').map(&method(:build_event))
end
Regarding @url
and xml
, @BroiStatse suggested caching @xml
, and I'll say - you don't need to cache @url
! Once you have used it, you don't need it, so if you load the XML on initialization - you don't need to save @url
at all:
class Scadenzario
def initialize(url)
@xml = Nokogiri::XML::Document.parse(open(url))
@events = parse_events
end
attr_reader :xml
#...
end
-
\$\begingroup\$ Thanks Uri! I didn't know about
&method
until now. \$\endgroup\$Adam– Adam2014年05月12日 18:53:15 +00:00Commented May 12, 2014 at 18:53