I'm trying to parse this string into a NumPy ndarray:
'POLYHEDRALSURFACE(((0 1 0,0 0 0,1 0 0,1 1 0,0 1 0)),((1 -0 1,0 0 1,0 1 1,1 1 1,1 -0 1)),((1 0 0,1 -0 1,1 1 1,1 1 0,1 0 0)),((0 0 0,0 0 1,1 -0 1,1 0 0,0 0 0)),((0 1 0,0 1 1,0 0 1,0 0 0,0 1 0)),((0 1 1,0 1 0,1 1 0,1 1 1,0 1 1)))'
I clean it up a little to make it easier to work with (geom holds the string above):
s = geom.replace('POLYHEDRALSURFACE', '')
s = s.replace("(((","|")
s = s.replace(")))","|")
s = s.replace("((","|")
s = s.replace("))","|")
s = s.replace("|,|","|")
Result
'|0 1 0,0 0 0,1 0 0,1 1 0,0 1 0|1 -0 1,0 0 1,0 1 1,1 1 1,1 -0 1|1 0 0,1 -0 1,1 1 1,1 1 0,1 0 0|0 0 0,0 0 1,1 -0 1,1 0 0,0 0 0|0 1 0,0 1 1,0 0 1,0 0 0,0 1 0|0 1 1,0 1 0,1 1 0,1 1 1,0 1 1|'
I would like it to be nested as shown below. Also, the 4th entry (index 4, the last point of each ring) should be removed, since it duplicates the first point.
[
[
[0 1 0],
[0 0 0],
[1 0 0],
[1 1 0],
[0 1 0],
]
]
...
asked Nov 4, 2019 at 22:10
1 Answer
You can use the same solution as in "Selecting geometry from PostGIS compatible with Open3D", but you can also use regular expressions:
pol = 'POLYHEDRALSURFACE(((0 1 0,0 0 0,1 0 0,1 1 0,0 1 0)),((1 -0 1,0 0 1,0 1 1,1 1 1,1 -0 1)),((1 0 0,1 -0 1,1 1 1,1 1 0,1 0 0)),((0 0 0,0 0 1,1 -0 1,1 0 0,0 0 0)),((0 1 0,0 1 1,0 0 1,0 0 0,0 1 0)),((0 1 1,0 1 0,1 1 0,1 1 1,0 1 1)))'
import re
# replace the blank spaces with commas
pol = re.sub(r"\s+", ",", pol.strip())
# collapse runs of brackets into a single bracket
pol = re.sub(r"\(+", "(", pol)
pol = re.sub(r"\)+", ")", pol)
# get the text between brackets (one match per ring)
coords = re.findall(r"\(.*?\)", pol)
print(coords)
['(0,1,0,0,0,0,1,0,0,1,1,0,0,1,0)', '(1,-0,1,0,0,1,0,1,1,1,1,1,1,-0,1)', '(1,0,0,1,-0,1,1,1,1,1,1,0,1,0,0)', '(0,0,0,0,0,1,1,-0,1,1,0,0,0,0,0)', '(0,1,0,0,1,1,0,0,1,0,0,0,0,1,0)', '(0,1,1,0,1,0,1,1,0,1,1,1,0,1,1)']
First NumPy array:
import numpy as np
# convert the string "(0,1,0,...)" to a flat tuple of numbers
first = eval(coords[0])
# group the values into (x, y, z) triples and drop the duplicated last point
dd = np.array([pt for pt in zip(*[iter(first)]*3)][:-1])
print(dd)
[[0 1 0]
 [0 0 0]
 [1 0 0]
 [1 1 0]]
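To get every face at once rather than only the first, the same idea can be wrapped in a loop. The following is a minimal sketch, not a general WKT parser: the function name `polyhedral_surface_to_array` and its `drop_closing_point` parameter are made up for illustration, it parses the numbers with `float` instead of `eval`, and it assumes every face has the same number of points (otherwise the result would not be a regular array).

```python
import re
import numpy as np

def polyhedral_surface_to_array(wkt, drop_closing_point=True):
    """Hypothetical helper: POLYHEDRALSURFACE WKT -> array of shape (faces, points, 3)."""
    # Each ring is the text inside the innermost pair of parentheses.
    rings = re.findall(r"\(([^()]+)\)", wkt)
    faces = []
    for ring in rings:
        # "0 1 0,0 0 0" -> [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
        pts = [[float(v) for v in pt.split()] for pt in ring.split(",")]
        # A closed ring repeats its first point at the end; drop that copy.
        if drop_closing_point and len(pts) > 1 and pts[0] == pts[-1]:
            pts = pts[:-1]
        faces.append(pts)
    # Adding 0.0 turns any -0.0 (from "-0" in the input) into 0.0.
    return np.array(faces) + 0.0

pol = 'POLYHEDRALSURFACE(((0 1 0,0 0 0,1 0 0,1 1 0,0 1 0)),((1 -0 1,0 0 1,0 1 1,1 1 1,1 -0 1)),((1 0 0,1 -0 1,1 1 1,1 1 0,1 0 0)),((0 0 0,0 0 1,1 -0 1,1 0 0,0 0 0)),((0 1 0,0 1 1,0 0 1,0 0 0,0 1 0)),((0 1 1,0 1 0,1 1 0,1 1 1,0 1 1)))'
arr = polyhedral_surface_to_array(pol)
print(arr.shape)
print(arr[0])
```

The result is a float array with one row of four points per face; call `.astype(int)` on it if integer coordinates are preferred.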
answered Nov 5, 2019 at 14:08