# Profile
The first step in improving the performance is to profile it. I recommend running this using Xcode's Profile option and seeing where the time is spent. I suspect (but don't know for sure) that it will be in the calls to `sin()` and `cos()`.
# Avoid Casts
One thing that can slow down calculations is frequent conversion between types. You're using `k`, `n`, and `N` as integers in most of the code, but need to cast them to `Double` to calculate `q`. You could keep parallel variables `dk`, `dn`, and `dN` that are floating-point copies of `k`, `n`, and `N` to avoid the casts. You'll need to increment them manually in the loops, though.
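
As a rough sketch of the parallel-counter idea: I'm assuming here that the code is a naive DFT over a real input and that `q` is computed as 2π·k·n/N, since the original loop isn't shown. The function name `dftAvoidingCasts` and the accumulator names are made up for illustration.

```swift
import Foundation

/// Naive DFT of a real signal, written with parallel Double counters
/// (`dk`, `dn`, `dN`) so the inner loop contains no Int-to-Double conversion.
/// The formula q = 2π·k·n/N is an assumption about the original code.
func dftAvoidingCasts(_ x: [Double]) -> (re: [Double], im: [Double]) {
    let N = x.count
    let dN = Double(N)              // converted once, outside the loops
    var re = [Double](repeating: 0, count: N)
    var im = [Double](repeating: 0, count: N)

    var dk = 0.0                    // floating-point copy of k
    for k in 0..<N {
        var dn = 0.0                // floating-point copy of n
        var sumRe = 0.0
        var sumIm = 0.0
        for n in 0..<N {
            let q = 2.0 * Double.pi * dk * dn / dN
            sumRe += x[n] * cos(q)
            sumIm -= x[n] * sin(q)
            dn += 1.0               // manual increment, in step with n
        }
        re[k] = sumRe
        im[k] = sumIm
        dk += 1.0                   // manual increment, in step with k
    }
    return (re, im)
}
```

The integer `k` and `n` are still used for indexing, while `dk` and `dn` feed the arithmetic. Whether this is measurably faster than `Double(k)` depends on what the compiler already does, so it's worth checking in the profiler before and after.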
# Do more at once
If you look in `<Accelerate/vfp.h>`, you'll find `vsinf()` and `vcosf()` and, more importantly, `vsincosf()`, which calculate sine, cosine, and both at once for a whole vector of `Float`s. The precision is lower than `Double`, so I don't know if it meets your precision needs, but I'd look into it. Since a `vFloat` holds four `Float`s, this should allow you to work on 4 elements at a time instead of only 1.
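
Here's a minimal sketch of the batched approach. Note that it swaps in `vvsincosf()` from Accelerate's vForce library rather than the `vfp.h` function named above, because the vForce version takes whole arrays and is easier to call from Swift. The wrapper name `sinCosBatch` is hypothetical, and the scalar fallback is only there so the sketch still compiles where Accelerate isn't available.

```swift
import Foundation
#if canImport(Accelerate)
import Accelerate
#endif

/// Returns the sine and cosine of every angle in `angles`.
/// With Accelerate present, both arrays are filled by a single vForce call.
func sinCosBatch(_ angles: [Float]) -> (sin: [Float], cos: [Float]) {
    var sines = [Float](repeating: 0, count: angles.count)
    var cosines = [Float](repeating: 0, count: angles.count)
    #if canImport(Accelerate)
    var count = Int32(angles.count)     // vForce takes the length by pointer
    vvsincosf(&sines, &cosines, angles, &count)
    #else
    // Scalar fallback (assumption: used only when Accelerate is missing)
    for i in angles.indices {
        sines[i] = sinf(angles[i])
        cosines[i] = cosf(angles[i])
    }
    #endif
    return (sines, cosines)
}
```

You'd fill `angles` with all the `q` values for one pass of the inner loop, call this once, and then do the multiply-and-sum over the results. If `Float` turns out not to be precise enough, vForce also has a `Double` variant, `vvsincos()`, with the same calling shape.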