I am diving into Scala (using 2.10) headfirst and am trying out a simple task: taking bulk data and then doing something with that stream. The first part is generating that stream. I seem to have cobbled together something that works, but I think I am missing shortcuts in Scala. Any pointers or code cleanup recommendations are welcome! I feel like getInnerPolygons could be simplified using a match statement, but I haven't gotten my head around those yet.
So, the process: with GDAL, you retrieve a dataset, then retrieve the layer. From the layer, you can iterate over the features, and from each feature you can get its geometry. Most features are just one polygon; in that case, take the outer ring of that polygon and pass it up the line. If the feature is multi-part (think of the Hawaiian Islands as one "feature" where each island is a polygon), retrieve the polygons, then split them into their individual outer rings (this is the recursive call into getInnerPolygons).
Again, I would love to hear about any Scala best practices or shortcuts that I missed.
    import org.gdal.ogr.{Feature, Geometry, Layer, ogr}

    // Recursively collects every LINEARRING nested below the given geometry.
    def getInnerPolygons(geometry: Geometry): Iterator[Geometry] = {
      Iterator
        .range(0, geometry.GetGeometryCount())
        .flatMap(i => {
          val innerGeometry = geometry.GetGeometryRef(i)
          if (innerGeometry.GetGeometryName().equals("LINEARRING"))
            Iterator.single(innerGeometry)
          else
            getInnerPolygons(innerGeometry)
        })
    }

    // Pulls features until GetNextFeature() returns null, then flattens each one into its rings.
    def getPolygons(layer: Layer): Iterator[Geometry] = {
      Iterator.continually[Feature](layer.GetNextFeature())
        .takeWhile(_ != null)
        .flatMap(feature => getInnerPolygons(feature.GetGeometryRef()))
    }

    def main(args: Array[String]) {
      ogr.RegisterAll()
      val dataSet = ogr.Open("myshapefile.shp")
      val layer = dataSet.GetLayer(0)
      for (i <- getPolygons(layer)) println(i.GetGeometryName())
      println("Done")
    }
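The null-terminated iteration in getPolygons can be tried in isolation, without GDAL installed. The following is only a minimal sketch: Cursor is a hypothetical stand-in for a Java-style API such as Layer.GetNextFeature() that hands back null once it is exhausted, and drain is an illustrative name, not part of any library.

```scala
object NullTerminatedDemo {
  // Hypothetical stand-in for a Java-style cursor: it returns each element
  // once, then null when nothing is left (as GetNextFeature() is described above).
  class Cursor(items: List[String]) {
    private var remaining = items
    def next(): String = remaining match {
      case head :: tail => remaining = tail; head
      case Nil          => null
    }
  }

  // Wraps the cursor in a lazy Iterator that stops at the first null.
  // Nothing is read from the cursor until the Iterator is consumed.
  def drain(cursor: Cursor): Iterator[String] =
    Iterator.continually(cursor.next()).takeWhile(_ != null)

  def main(args: Array[String]): Unit =
    drain(new Cursor(List("a", "b", "c"))).foreach(println)
}
```

Because the result is an Iterator, it can be traversed only once, which matches the one-pass nature of the underlying cursor.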
3 Answers
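Micro changes: use == instead of equals. In Scala, == on references delegates to equals (after a null check), so it already means "equals".
Scala does not follow the Java convention of prefixing accessor and mutator methods with get/set, so your own methods can be named innerPolygons rather than getInnerPolygons.
Try to split the different parts of the work into separate functions (abstraction):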
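    // Flattens a geometry into the LINEARRINGs nested below it.
    def innerPolygons(geometry: Geometry): Iterator[Geometry] =
      innerGeometry(geometry).flatMap(inner)

    // Direct sub-geometries; `until` excludes the count itself, so every index is valid.
    def innerGeometry(geometry: Geometry): Iterator[Geometry] =
      (0 until geometry.GetGeometryCount()).iterator.map(geometry.GetGeometryRef(_))

    // A ring is returned as-is; anything else is descended into.
    def inner: Geometry => Iterator[Geometry] = g =>
      if ("LINEARRING" == g.GetGeometryName())
        Iterator.single(g)
      else
        innerPolygons(g)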
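To illustrate the == point above, here is a small sketch contrasting the two spellings. The method names sameName and sameNameJavaStyle are made up for this example; the behavior shown is that of plain Scala string comparison.

```scala
object EqualityDemo {
  // Scala's == performs a null check and then calls equals,
  // so a null left-hand side does not throw.
  def sameName(a: String, b: String): Boolean = a == b

  // The Java-style call: throws NullPointerException when `a` is null.
  def sameNameJavaStyle(a: String, b: String): Boolean = a.equals(b)

  def main(args: Array[String]): Unit = {
    val missing: String = null
    println(sameName(missing, "LINEARRING"))                  // false, no exception
    println(sameName(new String("LINEARRING"), "LINEARRING")) // true: value equality, not reference equality
  }
}
```

Reference equality, when it is really wanted, is spelled eq in Scala.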
Two thoughts:
I doubt that getInnerPolygons is tail recursive, because the recursive call to getInnerPolygons within flatMap is not the last operation to be executed; flatMap itself is (but I may be wrong). Recursive calls that are not in tail position cannot be eliminated by the compiler, so they can lead to a stack overflow when the recursion goes deep.
You could write (0 until geometry.GetGeometryCount()) instead of Iterator.range(0, geometry.GetGeometryCount()) to create a range. This is slightly shorter. Note that it has to be until rather than to: 0 to n includes n, which would request one geometry too many. It also produces a Range rather than an Iterator, so call .iterator on it if laziness matters.
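If the stack depth concern above ever matters in practice, one option is to replace the recursion with an explicit work list. The following is a sketch under assumptions, not the OP's code: Geometry here is a hypothetical stand-in exposing the three GDAL-style methods used in the question, and linearRings and children are illustrative names. It builds a List eagerly rather than a lazy Iterator, which is a trade-off.

```scala
import scala.annotation.tailrec

object StackSafeTraversal {
  // Hypothetical stand-in for GDAL's Geometry, just enough to run the sketch.
  class Geometry(val name: String, val subGeometries: List[Geometry]) {
    def GetGeometryName(): String = name
    def GetGeometryCount(): Int = subGeometries.size
    def GetGeometryRef(i: Int): Geometry = subGeometries(i)
  }

  // Direct sub-geometries; 0 until n excludes n, like Iterator.range(0, n).
  def children(g: Geometry): List[Geometry] =
    (0 until g.GetGeometryCount()).map(i => g.GetGeometryRef(i)).toList

  // Collects every LINEARRING below `root`, depth-first and in order.
  // The pending list plays the role of the call stack, so nesting depth
  // does not consume JVM stack frames. As in the question's version,
  // `root` itself is never part of the result.
  def linearRings(root: Geometry): List[Geometry] = {
    @tailrec
    def loop(pending: List[Geometry], found: List[Geometry]): List[Geometry] =
      pending match {
        case Nil => found.reverse
        case g :: rest if g.GetGeometryName() == "LINEARRING" =>
          loop(rest, g :: found)
        case g :: rest =>
          loop(children(g) ::: rest, found)
      }
    loop(children(root), Nil)
  }
}
```

The @tailrec annotation makes the compiler reject the inner loop if it ever stops being tail recursive, which turns the doubt raised above into a compile-time check.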
I agree with the answer from @Adi Stadelmann. Here is my take on it.
I used List instead of Iterator since I believe it is a lot more common in Scala. I created "monkey patching" methods (implicit classes) to build a List of the sub-elements of both Layer and Geometry.
In the original implementation of getInnerPolygons, the Geometry passed as the argument can never itself be included in the returned List. I kept that behavior, but I am not sure if that is really what the OP wanted.
    object Gdal {

      /**
       * "Monkey patching" methods.
       */
      object CustomSubElementsFetchers {

        implicit class GeometryWithSubList(val geom: Geometry) extends AnyVal {
          def subGeometries: List[Geometry] = {
            for (i <- List.range(0, geom.GetGeometryCount))
              yield geom.GetGeometryRef(i)
          }
        }

        implicit class LayerWithSubList(val layer: Layer) extends AnyVal {
          // Drains the layer; features are prepended, then reversed to restore order.
          def features: List[Feature] = {
            var list: List[Feature] = Nil
            var feature = layer.GetNextFeature
            while (feature != null) {
              list = feature :: list
              feature = layer.GetNextFeature
            }
            list.reverse
          }
        }
      }

      import CustomSubElementsFetchers._

      // @tailrec // Not tail recursive: stack overflow might occur.
      def getInnerPolygons(geometry: Geometry): List[Geometry] = {
        geometry.subGeometries.flatMap(subGeom => subGeom.GetGeometryName match {
          case "LINEARRING" => List(subGeom)
          case _ => getInnerPolygons(subGeom)
        })
      }

      def main(args: Array[String]): Unit = {
        val linearGeometry = new Geometry("LINEARRING", Nil)
        val compoundGeometry = new Geometry("NONLINEARRING", List(linearGeometry, linearGeometry))
        val layer = new Layer(List(new Feature(linearGeometry), new Feature(compoundGeometry)))

        val baseGeometries = layer.features.map(feature => feature.GetGeometryRef)
        val polygons: List[Geometry] = baseGeometries.flatMap(getInnerPolygons)
        polygons.map(polygon => polygon.GetGeometryName).foreach(println)
      }
    }
Note: if the GDAL data structures are huge and all the data cannot fit in Lists, then it might be preferable to use Streams instead.
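As a rough sketch of what the Stream variant could look like: Feature and Layer below are hypothetical stand-ins (not the real GDAL bindings), and the fetched counter exists only to make the laziness observable. Newer Scala versions (2.13 and later) deprecate Stream in favor of LazyList, so treat this as illustrative for 2.10.

```scala
object LazyFeatures {
  // Hypothetical stand-ins for GDAL's Feature and Layer.
  class Feature(val id: Int)

  class Layer(features: List[Feature]) {
    private var remaining = features
    var fetched = 0 // number of features handed out so far
    def GetNextFeature(): Feature = remaining match {
      case head :: tail => remaining = tail; fetched += 1; head
      case Nil          => null
    }
  }

  // Wraps the layer lazily: features are pulled from the layer only
  // as far as the Stream is actually forced by the caller.
  def features(layer: Layer): Stream[Feature] =
    Stream.continually(layer.GetNextFeature()).takeWhile(_ != null)

  def main(args: Array[String]): Unit = {
    val layer = new Layer((1 to 5).map(i => new Feature(i)).toList)
    val firstTwo = features(layer).take(2).map(_.id).toList
    println(firstTwo)      // List(1, 2)
    println(layer.fetched) // fewer than 5: the rest of the layer was not read
  }
}
```

One caveat: a Stream memoizes the elements it has already produced, so keeping a reference to its head keeps every forced element in memory. For a strict one-pass traversal of very large data, the Iterator from the question avoids that.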
And here are the dummy GDAL classes I used to test it:
    /**
     * Emulation of Gdal's Geometry.
     */
    class Geometry(val name: String, val subGeometries: List[Geometry]) {
      def GetGeometryName: String = name
      def GetGeometryCount: Int = subGeometries.size
      def GetGeometryRef(index: Int): Geometry = subGeometries(index)
    }

    /**
     * Emulation of Gdal's Feature.
     */
    class Feature(val geometry: Geometry) {
      def GetGeometryRef: Geometry = geometry
    }

    /**
     * Emulation of Gdal's Layer.
     * (This would be a horrible Scala class.)
     */
    class Layer(val features: List[Feature]) {
      var currentIndex = -1

      def GetNextFeature: Feature = {
        currentIndex = currentIndex + 1
        if (currentIndex >= features.size)
          null
        else
          features(currentIndex)
      }
    }