I've been working on a data that is in .tif raster format. I need to convert that into a points shapefile (.shp) to proceed. I went through GDAL in Python but there's only gdal.Polygonize(). So how do I convert it into a points .shp? I've used this code so far:
from osgeo import gdal,ogr
import sys
import os
gdal.UseExceptions()
os.chdir(path)
src_ds = gdal.Open("IND_ppp_2015_v2.tif")
if src_ds is None:
print ("Unable to open Worldpop data")
sys.exit(1)
try:
srcband = src_ds.GetRasterBand(1)
drv = ogr.GetDriverByName("ESRI Shapefile")
dst_ds = drv.CreateDataSource(dst_layername + ".shp")
dst_layer = dst_ds.CreateLayer(dst_layername,srs = None)
gdal.Polygonize(srcband, None,dst_layer, -1,[],callback = None)
Can you suggest a way to convert into points .shp?
-
If GDAL/Python is not a requirement, the following link may help you. It's also an open source tool solution: gis.stackexchange.com/questions/171300/…Saleika– Saleika2018年01月18日 07:46:40 +00:00Commented Jan 18, 2018 at 7:46
-
If you want to use GDAL you could use the gdal2ogr tool: trac.osgeo.org/gdal/browser/trunk/gdal/apps/gdal2ogr.cSaleika– Saleika2018年01月18日 07:50:56 +00:00Commented Jan 18, 2018 at 7:50
-
I'm trying to implement that in PythonSrinath Radhakrishnan– Srinath Radhakrishnan2018年01月18日 09:13:21 +00:00Commented Jan 18, 2018 at 9:13
-
1In your comments you've wrote that your tif-file is about 1.5GB. It would be interesting: 1. Having more information about the raster itself (datatype, cellsize, etc.) , 2. What is your project about? ,3. is there the need to tile the "large" ("large" in the meaning of processing, not storing, visualizing ) raster-file and 4. why do you need a shapefile from it?Stefan– Stefan2018年01月19日 07:50:23 +00:00Commented Jan 19, 2018 at 7:50
-
1Given the size of your data, it's not a good idea to convert it to a point shapefile... this is the question! I'm agree with @Stefan ... if you explain us what is your final goal, I'm sure that this task is not necessary at all. In such case, I suggest you to post another question.Antonio Falciano– Antonio Falciano2018年01月19日 11:41:37 +00:00Commented Jan 19, 2018 at 11:41
2 Answers 2
I'd use GDAL and OGR utilities as library functions (if possible):
from osgeo import gdal
import os
filename='IND_ppp_2015_v2'
inDs = gdal.Open('{}.tif'.format(filename))
outDs = gdal.Translate('{}.xyz'.format(filename), inDs, format='XYZ', creationOptions=["ADD_HEADER_LINE=YES"])
outDs = None
try:
os.remove('{}.csv'.format(filename))
except OSError:
pass
os.rename('{}.xyz'.format(filename), '{}.csv'.format(filename))
os.system('ogr2ogr -f "ESRI Shapefile" -oo X_POSSIBLE_NAMES=X* -oo Y_POSSIBLE_NAMES=Y* -oo KEEP_GEOM_COLUMNS=NO {0}.shp {0}.csv'.format(filename))
First, I'd convert the GeoTIFF into the XYZ format (very close to CSV). Then I'd rename the .xyz file to .csv and finally convert the CSV file to ESRI Shapefile.
-
I tried this code. My raster size is 1.5GB and the xyz format goes beyond 15GB in my local disk and I had to terminate the processSrinath Radhakrishnan– Srinath Radhakrishnan2018年01月19日 05:58:49 +00:00Commented Jan 19, 2018 at 5:58
-
Ok, but why do you need to convert your data to a point shapefile? Maybe you'd like to vectorize only a tiny part of it.Antonio Falciano– Antonio Falciano2018年01月19日 11:45:11 +00:00Commented Jan 19, 2018 at 11:45
-
no. the file contains population data of India and I need to visualize it as points for further analysis. the method looks easy with arcpy but arcpy comes at a cost of installing arcgis (not free)Srinath Radhakrishnan– Srinath Radhakrishnan2018年01月19日 13:16:25 +00:00Commented Jan 19, 2018 at 13:16
A very good starting point to implement a raster to point script using Python and GDAL is the "Raster to vector line" recipe: https://pcjericks.github.io/py-gdalogr-cookbook/raster_layers.html#raster-to-vector-line
You just have to modify (simplify) the array2shp
method. I guess it's easier than to implement the gdal2org tool in Python.
EDIT:
I modified the "Raster to vector line" recipe to get a raster to point script. I tested the script with a 5000x5000 16bit gray scale Image and it's very slow. I would recommend the script only for small rasters.
import ogr, gdal, osr, os
import numpy as np
import itertools
def pixelOffset2coord(raster, xOffset,yOffset):
geotransform = raster.GetGeoTransform()
originX = geotransform[0]
originY = geotransform[3]
pixelWidth = geotransform[1]
pixelHeight = geotransform[5]
coordX = originX+pixelWidth*xOffset
coordY = originY+pixelHeight*yOffset
return coordX, coordY
def raster2array(rasterfn):
raster = gdal.Open(rasterfn)
band = raster.GetRasterBand(1)
array = band.ReadAsArray()
return array
def array2shp(array,outSHPfn,rasterfn):
# max distance between points
raster = gdal.Open(rasterfn)
geotransform = raster.GetGeoTransform()
pixelWidth = geotransform[1]
# wkbPoint
shpDriver = ogr.GetDriverByName("ESRI Shapefile")
if os.path.exists(outSHPfn):
shpDriver.DeleteDataSource(outSHPfn)
outDataSource = shpDriver.CreateDataSource(outSHPfn)
outLayer = outDataSource.CreateLayer(outSHPfn, geom_type=ogr.wkbPoint )
featureDefn = outLayer.GetLayerDefn()
outLayer.CreateField(ogr.FieldDefn("VALUE", ogr.OFTInteger))
# array2dict
point = ogr.Geometry(ogr.wkbPoint)
row_count = array.shape[0]
for ridx, row in enumerate(array):
if ridx % 100 == 0:
print "{0} of {1} rows processed".format(ridx, row_count)
for cidx, value in enumerate(row):
Xcoord, Ycoord = pixelOffset2coord(raster,cidx,ridx)
point.AddPoint(Xcoord, Ycoord)
outFeature = ogr.Feature(featureDefn)
outFeature.SetGeometry(point)
outFeature.SetField("VALUE", int(value))
outLayer.CreateFeature(outFeature)
outFeature.Destroy()
outDS.Destroy()
def main(rasterfn,outSHPfn):
array = raster2array(rasterfn)
array2shp(array,outSHPfn,rasterfn)
if __name__ == "__main__":
rasterfn = r'D:\test.tif'
outSHPfn = r'D:\test.shp'
main(rasterfn,outSHPfn)
-
the raster size is 1.5GBSrinath Radhakrishnan– Srinath Radhakrishnan2018年01月19日 05:57:21 +00:00Commented Jan 19, 2018 at 5:57
-
Assuming your raster is a one layer 8-bit Image, your output shape file will have about 1500 million Points. Which GIS software or library can handle such a large dataset?Saleika– Saleika2018年01月19日 08:59:58 +00:00Commented Jan 19, 2018 at 8:59