This script calculates points in reciprocal space for hexagonal 2D lattices, then uses the cartesian product from itertools to add each vector from one lattice to all of the vectors of the other in the line
np.array([a+b for a, b in list(itertools.product(p1.T, p2.T))])
It's slow right now because as written its instantiating millions of tiny numpy arrays.
I'm aware of:
- Cartesian product of x and y array points into single array of 2D points
- Using numpy to build an array of all combinations of two arrays
- cartesian products in numPy
and I suspect there's some way to do this, possibly using np.meshgrid
or np.mgrid
that's faster, uses less memory and looks cleaner, but I can not figure out how.
I will use the output in an optimization loop matching these positions to measured positions, so it needs to be callable several hundred times in a row, so reusing large array spaces rather than instantiating and garbage collecting them might have some advantages.
simulated diffraction spots click for larger
import numpy as np
import matplotlib.pyplot as plt
import itertools
def rotatem(xy, rot):
r3o2, twopi, to_degs, to_rads = np.sqrt(3)/2., 2*np.pi, 180/np.pi, np.pi/180
c, s = [f(to_rads*rot) for f in (np.cos, np.sin)]
x, y = xy
xr = c*x - s*y
yr = c*y + s*x
return np.vstack((xr, yr))
def get_points(a=1.0, nmax=5, rot=0):
r3o2, twopi, to_degs, to_rads = np.sqrt(3)/2., 2*np.pi, 180/np.pi, np.pi/180
g = twopi / (r3o2 * a)
i = np.arange(-nmax, nmax+1)
I, J = [thing.flatten() for thing in np.meshgrid(i, i)]
keep = np.abs(I + J) <= nmax
I, J = [thing[keep] for thing in (I, J)]
xy = np.vstack((I+0.5*J, r3o2*J))
return g * rotatem(xy, rot=rot)
r3o2, twopi, to_degs, to_rads = np.sqrt(3)/2., 2*np.pi, 180/np.pi, np.pi/180
a1, a2, rot = 1.0, 2**0.2, 22
p1 = get_points(a=a1, nmax=20)
p2 = get_points(a=a2, nmax=20, rot=rot)
p3 = get_points(a=a2, nmax=20, rot=-rot)
d12 = np.array([a+b for a, b in list(itertools.product(p1.T, p2.T))])
d13 = np.array([a+b for a, b in list(itertools.product(p1.T, p3.T))])
d12, d13 = [d[((d**2).sum(axis=1)<4.)] for d in (d12, d13)]
if True:
plt.figure()
for d in (d12, d13):
plt.plot(*d.T, 'o', ms=2)
plt.gca().set_aspect('equal')
plt.show()
2 Answers 2
You can replace:
d12 = np.array([a+b for a, b in list(itertools.product(p1.T, p2.T))])
with something like:
p1 = p1.T
p2 = p2.T
p3 = p3.T
d12 = p1[:,np.newaxis,:] + p2[np.newaxis,:,:]
d12 = my_d12.reshape((len(p1)*len(p2),2))
I find it most of the times easier to use the first index of an array for point_index and the second index for the dimensions, hence the .T
's
With the use of the magic index np.newaxis
at the right places you can create numpy array's of shape (M,N) with normal operators acting on arrays of shape (M) and (N).
With the reshape
method d12
changed again to the shape in your original solution.
-
\$\begingroup\$ Oh this is great! So simple; exactly what's needed. I've always used
None
in the place ofnp.newaxis
thinking they did the same thing but never checked. \$\endgroup\$uhoh– uhoh2020年09月02日 17:08:47 +00:00Commented Sep 2, 2020 at 17:08 -
1\$\begingroup\$ @uhoh, you can also use
None
becausenp.newaxis
is only an alias ofNone
. I prefer to usenp.newaxis
because it is a heads-up that there is something funny going on with indexes and broadcasting rules. \$\endgroup\$Jan Kuiken– Jan Kuiken2020年09月03日 16:40:36 +00:00Commented Sep 3, 2020 at 16:40 -
\$\begingroup\$ Okay got it! I think I will switch to that as well since I've been soundly reprimanded for poor readability ;-) \$\endgroup\$uhoh– uhoh2020年09月03日 17:18:50 +00:00Commented Sep 3, 2020 at 17:18
Your code is pretty cluttered and you seem fixated on a 'less lines of code equals better code' mindset.
Firstly lets move r3o2
and friends out into the global scope as constants - UPPER_SNAKE_CASE
variables in Python.
This gets rid of 2 lines of code. Additionally it and makes:
rotatem
less confusing as now you're not defining 3 things you never use.get_points
less confusing as now you're not defining 2 things you never use.
To get rotatem
to look cleaner I'd then go on to:
- Move the expression for defining
xr
andyr
into the return. - Get rid of the clunky tuple unpacked comprehension; just write it out.
- Use better names than
c
ands
;cos
andsin
could be far more readable.
I moved them to the latter side of multiplication so the equation doesn't look like \$\sin x - \cos y\$
This makes rotatem
pretty readable now.
def rotatem(xy, rot):
x, y = xy
cos = np.cos(TO_RADS * rot)
sin = np.sin(TO_RADS * rot)
return np.vstack((
x*cos - y*sin,
y*cos + x*sin,
))
To get get_points
to look cleaner I'd then go on to:
- Move the definition of
g
down to the bottom of the function. By defining it at the top of the function I'm having to read each and every line of code in the function to see ifg
is used. This is just a waste of time when you only use it in thereturn
. - Rename:
i
todomain
,I
toi
, andJ
toj
.
- Split the tuple in the expression for
xy
over multiple lines. - Change
i
toi[keep]
andj
toj[keep]
; remove the tuple comprehension before this. - Add some whitespace around operators.
This makes get_points
a bit more readable. But np.meshgrid
is hampering this a bit.
def get_points(a=1.0, nmax=5, rot=0):
domain = np.arange(-nmax, nmax + 1)
i, j = [thing.flatten() for thing in np.meshgrid(domain, domain)]
keep = np.abs(i + j) <= nmax
xy = np.vstack((
i[keep] + 0.5 * j[keep],
R3O2 * j[keep],
))
return (TWO_PI / (R3O2 * a)) * rotatem(xy, rot=rot)
I still don't really understand what your code is doing, what are rotatem
and get_points
doing? You can explain this at the top of each function by using a docstring.
It's taken me a while to understand your code this much, and I still don't see myself understanding it all any time soon. You should really try to improve the readability of your code to the best it can be in the future so others don't just get bored.
In case you think my changes will decrease performance by anything important, my changes have a negligible impact on performance.
$ python orig.py
0.0005879989994355128 4.213629218999813
$ python peil.py
0.0005172819992367295 4.236298889999489
import numpy as np
import matplotlib.pyplot as plt
import itertools
import timeit
R3O2 = np.sqrt(3) / 2.
TWO_PI = 2 * np.pi # You could just call this tau.
TO_DEGS = 180 / np.pi
TO_RADS = np.pi / 180
def rotatem(xy, rot):
"""Explain what rotatem does."""
x, y = xy
cos = np.cos(rot)
sin = np.sin(rot)
return np.vstack((
x*cos - y*sin,
y*cos + x*sin,
))
def get_points(a=1.0, nmax=5, rot=0):
"""Explain what get_points does."""
domain = np.arange(-nmax, nmax + 1)
i, j = [thing.flatten() for thing in np.meshgrid(domain, domain)]
keep = np.abs(i + j) <= nmax
xy = np.vstack((
i[keep] + 0.5 * j[keep],
R3O2 * j[keep],
))
return (TWO_PI / (R3O2 * a)) * rotatem(xy, rot=rot)
def main(a1, a2, rot):
timer = timeit.perfcounter()
start = timer()
p1 = get_points(a=a1, nmax=20)
p2 = get_points(a=a2, nmax=20, rot=TO_RADS * rot)
p3 = get_points(a=a2, nmax=20, rot=TO_RADS * -rot)
mid = timer()
d12 = np.array([a+b for a, b in list(itertools.product(p1.T, p2.T))])
d13 = np.array([a+b for a, b in list(itertools.product(p1.T, p3.T))])
d12, d13 = [d[((d**2).sum(axis=1)<4.)] for d in (d12, d13)]
stop = timer()
print(mid - start, stop - start)
if __name__ == '__main__':
main(1.0, 2**0.2, 22)
-
\$\begingroup\$ "I still don't really understand what your code is doing..." & "...my changes have a negligible impact on performance." Got it. \$\endgroup\$uhoh– uhoh2020年09月03日 12:36:10 +00:00Commented Sep 3, 2020 at 12:36
-
\$\begingroup\$ @uhoh If you'd have followed my answer before I'd posted it then you'd get a better answer. Instead due to you not caring about readability or explaining what your code does it's just a waste of our time. \$\endgroup\$2020年09月03日 12:52:43 +00:00Commented Sep 3, 2020 at 12:52
-
\$\begingroup\$ @uhoh You can also fix this by writing docstrings and improving the readability of your code so your code is accessible to anyone that reads it. Even more so when you put it up for critique. \$\endgroup\$2020年09月03日 13:12:41 +00:00Commented Sep 3, 2020 at 13:12
-
\$\begingroup\$ I'm revisiting this problem and rediscovered your answer, it's quite helpful now that I've had more "life experience", thanks! \$\endgroup\$uhoh– uhoh2023年05月18日 10:47:28 +00:00Commented May 18, 2023 at 10:47
-
1\$\begingroup\$ @uhoh Sorry I failed at making my answer helpful for the you in the past. I'm happy to hear my answer became more helpful now. \$\endgroup\$2023年05月19日 00:23:04 +00:00Commented May 19, 2023 at 0:23
Explore related questions
See similar questions with these tags.
np.array(a+b for a, b in itertools.product(p1.T, p2.T))
. The same point could be made about the creation ofd12
andd13
in your larger code example. \$\endgroup\$d12 = np.array(a+b for a, b in list(itertools.product(p1.T, p2.T)))
is accepted, the next stepd12 = np.array(d12[((d12**2).sum(axis=1)<4.)])
throws an exception "unsupported operand type(s) for ** or pow(): 'generator' and 'int" \$\endgroup\$