So, I have the boundaries of every states in the US, from the google map KML file.
This KML file is actually a XML file.
I'm converting this file to a JS file compatible with google map in order to draw polygons on my map.
Here is an excerpt from the file with all the coordinates:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://earth.google.com/kml/2.2">
<Document>
<Placemark>
<name>WA</name>
<description><![CDATA[]]></description>
<styleUrl>#style37</styleUrl>
<Polygon>
<outerBoundaryIs>
<LinearRing>
<tessellate>1</tessellate>
<coordinates>
-122.402015585862,48.2252165511906
-122.368333031748,48.1281417374007
-122.216991980112,48.0074395523983
-122.230120864997,47.9691133009971
-122.302922293019,47.9502148146411
-122.394492319816,47.7741760704507
-122.414815251142,47.6641799154458
-122.382220450337,47.5954090424224
-122.392633724016,47.510242430349
-122.319738644767,47.3901148739843
-122.325376306437,47.3443234291417
-122.420837154063,47.3188444009071
<!-- etc., 243 points altogether for WA state -->
</coordinates>
</LinearRing>
</outerBoundaryIs>
</Polygon>
</Placemark>
Here is my code to convert the KML file to the JS file.
require 'rubygems'
require 'xmlsimple'
def read_xmlfile(file_name)
file = File.open(file_name, "r")
data = file.read
file.close
return data
end
target = "states.js"
xml_data = read_xmlfile('Stateborders.xml')
data = XmlSimple.xml_in(xml_data, 'KeyToSymbol' => true)
content = "var stateBorders = {\n"
data[:document][0].each_with_index do |item, index|
states = item[1]
states.each_with_index do |state, index|
name = state[:name][0]
coord = state[:polygon][0][:outerboundaryis][0][:linearring][0][:coordinates][0].tr("\t", "").strip.gsub(/[\n]/,") ,new google.maps.LatLng(").gsub(" ) ,", "), ");
coord = state[:polygon][0][:outerboundaryis][0][:linearring][0][:coordinates][0].tr("\t", "").strip
coord2 = ""
coord.split("\n").each do |ll|
ll = ll.split(",")
lat = ll[1].to_s.strip
lat = lat.to_f.round(6).to_s
lon = ll[0].to_s.strip
lon = lon.to_f.round(6).to_s
coord2 << "new google.maps.LatLng("+lat+", "+lon+"), "
end
statecoord = "'#{name}' : [#{coord2}],".gsub(", ],","],")
content << statecoord+"\n"
end
end
content << "};"
File.open(target, "w+") do |f|
f.write(content)
end
The code itself is fast enough but i'm sure can be much simpler/cleaner. Also for the last state I have ],
instead of ]
.
-
1\$\begingroup\$ I've created the geospatial tag instead, as it has broader applicability. The combination of xml and geospatial should turn up some KML-related questions. \$\endgroup\$200_success– 200_success2014年04月09日 03:04:32 +00:00Commented Apr 9, 2014 at 3:04
-
\$\begingroup\$ @200_success Neat. Although maybe "gis" would be an even broader tag? \$\endgroup\$Flambino– Flambino2014年04月09日 03:14:53 +00:00Commented Apr 9, 2014 at 3:14
-
\$\begingroup\$ @Flambino Stack Overflow has both tags, both equally popular. I think "geospatial" is clearer to those who aren't familiar with the "gis" abbreviation. \$\endgroup\$200_success– 200_success2014年04月09日 03:17:49 +00:00Commented Apr 9, 2014 at 3:17
-
\$\begingroup\$ @200_success Good point \$\endgroup\$Flambino– Flambino2014年04月09日 08:54:57 +00:00Commented Apr 9, 2014 at 8:54
3 Answers 3
I'd suggest Nokogiri for parsing the KML/XML. It's a more full-featured library, which makes things easier. For instance, you can get rid of all those long lines of [:symbol][0][:another_symbol][0]...
(or maybe xmlsimple is brilliant, but, frankly, the docs are a rambling mess, so I'd steer clear)
I'd also suggest skipping the new google.maps.LatLng(...)
. All it does is guarantee a bunch of processing the moment the JS file is loaded. Possibly a lot of errors, too, in case GMaps hasn't loaded or isn't involved - perhaps you just want the actual coordinates, not LatLng
objects. Or perhaps you just want one state, and don't want to bother instantiating LatLng
objects for every state in the Union. And so on - much simpler to just write out the data
I'd also skip a lot of the "manual" string construction in the output, and use a JSON library to do the heavy lifting. In fact, I'd consider simply producing a JSON file - not a JS file. As with avoiding the new google.maps.LatLng
stuff above, the point would be to keep data and logic separate. Although the different between JSON and JS is a matter of adding var ... =
at the start and ;
at the end, it still changes the file from a pure data format to code.
But you know your needs better than I do, so I've mimicked your code's output in my code.
I've split everything up into methods to keep things clean. It also lets you use the parsed data for something other than JS/JSON output, if need be.
require 'rubygems'
require 'nokogiri'
require 'json/ext'
# Parse the file at the given path
# See #parse_kml_doc for return value
def parse_kml_file(file)
doc = File.open(file, 'r') { |io| Nokogiri::XML.parse(io) }
parse_kml_doc(doc)
end
# Parse the given Nokogiri doc
# Returns an array of 2-element arrays containing
# the name and coordinates of the states
def parse_kml_doc(kml_doc)
kml_doc.css("Placemark").map do |node| # CSS-type selectors
state = node.css("name").first.content.strip # get state abbreviation
coords = node.css("coordinates").first.content # get coords string
coords = coords.strip.each_line.map do |line| # parse the string
line.strip.split(/,/).map(&:to_f) # I've skipped the rounding
end
[ state, coords ] # return a tuple
end
end
# Write out a JavaScript file with the data
def write_js_file(data, file)
json = Hash[data].to_json
File.open(file, 'w+') do |io|
io << "var stateBorders = "
io << json
io << ";"
end
end
# Read a KML file and write a JS file
def kml_to_js(kml_file, js_file)
data = parse_kml_file(kml_file)
write_js_file(data, js_file)
end
# Do everything!
kml_to_js("stateborders.kml", "stateborders.js")
It runs about 1.25x faster than the original; nothing spectacular there, but the code really only needs to run once anyway (at least until a state secedes)
Update: Ok, so if you truely want LatLng
objects, I'd suggest doing this: Include a small snippet of JS in the file, which'll do the LatLng
instantion when the file's loaded. I.e. something like
var stateBorders = (function () {
var state, i, l, json = { .. plain json data .. };
for(state in json) {
if(json.hasOwnProperty(state)) {
for(i = 0, l = json[state].length ; i++) {
json[state][i] = new google.maps.LatLng(json[state][i][0], json[state][i][1]);
}
}
}
return json;
}());
It will cut the file size in half, to about 280kb (if you include the round(6)
).
Or, if you want a file with new google.map.LatLng(...)
repeating thousands of times, you can define a simple class:
LatLng = Struct.new(:lat, :lng) do
def to_json(*a)
"new google.maps.LatLng(#{lat.round(6)},#{lng.round(6)})"
end
end
Use it to model coordinates (changing a line in the parse_kml_doc
method):
# ...
coords = coords.strip.each_line.map do |line|
LatLng.new(*line.strip.split(/,/).map(&:to_f))
end
# ...
And done. Leave the rest as-is, and it'll write out a file like you want.
-
\$\begingroup\$ This is great! Thanks a lot! You're awesome! I'm going to keep the
new google.maps.LatLng(...)
. The main idea is to export the kml file downloadable from mapsengine.google.com/map/u/0/?hl=en&pli=1 (create custom maps) and to convert it to polygons on the google maps js api v3. I'm also going to keep the rounding, it saves +-250kb on the size of my file :) \$\endgroup\$bl0b– bl0b2014年04月09日 03:50:04 +00:00Commented Apr 9, 2014 at 3:50 -
\$\begingroup\$ @bl0b Updated my answer \$\endgroup\$Flambino– Flambino2014年04月09日 18:13:53 +00:00Commented Apr 9, 2014 at 18:13
-
\$\begingroup\$ I like the idea of the JS snippet. This is smart. Thanks a lot!!! If I would chose a more complete file like this one: code.google.com/apis/kml/documentation/us_states.kml where a state have multiple polygons(typically islands), I guess I would need an array with 3 dimensions right [AllStates][All polygons in state][All coordinates in polygons]. How would you do it in the code ? \$\endgroup\$bl0b– bl0b2014年04月09日 23:24:08 +00:00Commented Apr 9, 2014 at 23:24
-
\$\begingroup\$ @bl0b We're wandering off of the original question here, so I can't really answer. I'd just say try something, and see if it works. If it gives you trouble, put a question up on StackOverflow; if you got it working, but think it can be better, put it up for review here. \$\endgroup\$Flambino– Flambino2014年04月10日 08:42:23 +00:00Commented Apr 10, 2014 at 8:42
-
\$\begingroup\$ Thanks for your help! IT's is awesome, I have something, I will put it on codereview later, I have to move on with my work :) \$\endgroup\$bl0b– bl0b2014年04月10日 17:44:57 +00:00Commented Apr 10, 2014 at 17:44
@Flambino has already explained the main problems in your code. I'd only add: 1) why do you use each_with_index
if you don't actually use the indexes? 2) Ruby would have a terrible standard library if you needed 4 lines just to read the contents of a file: contents = File.read(path)
.
That's how I'd write it:
require 'nokogiri'
require 'json'
kml = Nokogiri::XML(open("Stateborders.xml"))
coordinates_by_state = kml.css("Placemark").map do |placemark|
state = placemark.at_css("name").text.strip
coordinates = placemark.at_css("coordinates").text.strip.lines.map do |line|
line.split(",").map { |coord| coord.to_f.round(6).to_s }
end
[state, coordinates]
end.to_h
jscode = "var stateBorders = #{coordinates_by_state.to_json};"
File.write("stateborders.js", jscode)
@Flambino and @tokland have both suggested good strategies on how your task should be done.
I want to talk about your code:
Naming conventions
XML file is two words - name the method accordingly - read_xml_file
Useful names
Names like coord
and coord2
are bad because they don't convey the true role of them, and might confuse the reader - raw_coordinates
and formatted_coordinates
would be better.
Any two letter variable which is used for anything less trivial than running index is not advisable, and variables named l
and ll
are even worse, since the letter l
might be indistinguishable from the digit 1
in some fonts, which might be very confusing...
Don't shadow variables
Yes, you should not use each_with_index
, you should use each
. But if we assume that that is the required API - don't name the inner and out arguments with the same name (index
).
If you know you are not going to use the argument in the body of the block, ruby conventions suggest you use the special variable name _
:
data[:document][0].each_with_index do |item, _|
states = item[1]
states.each_with_index do |state, _|
# ... snip ...
end
end
Redundant operations
split
returns a list of strings. There is no point in calling to_s
on its children.
String concatenation implicitly calls to_s
on all non-string elements you concat, so calling to_s
after the rounding is also redundant.
You assign coord
twice, totally ignoring the first assignment - that line is totally redundant.
Don't re-use variables by changing their meaning
One side-effect of meaningless variable names is that you might be tempted to re-use them
coord.split("\n").each do |ll| ll = ll.split(",")
when ll
is first assigned it is a string, but then you re-assign it as an array. You do the same with lat
and lon
. A reader who misses such a re-assignment will be confused about what you do with the variables after it.
When you know what elements you have in an array
When you have an array, and all you do with it is assign the first element to variable a
, the second element to variable b
, etc. you can use ruby's multiple assignment instead:
lat, lon = ll.split(',')
Ruby's map
chaining and keeping it DRY
To elaborate on the multiple assignment above, it is idiomatic in ruby to use map
to transform all elements in an array:
lat, lon = ll.split(',').map { |number| number.to_f.round(6) }
This removes the code duplication you have for lat
and lon
.
What join
does better
statecoord = "'#{name}' : [#{coord2}],".gsub(", ],","],")
This ugly code is needed, because inside the loop you don't know when the last element is reached. Ruby already solved this problem with join
:
coord2 = coord.split("\n").map do |ll|
# ...
"new google.maps.LatLng("+lat+", "+lon+")"
end
statecoord = "'#{name}' : [#{coord2.join(", ")}]"
Same is true for content
...
Explore related questions
See similar questions with these tags.