I am not quite sure how to attach my data, but it is 5737 observations of 3 variables. Variables V and P are strictly negative, and variable R satisfies -2 < R < 2.
The last function, correcttot, will help me find the optimal constants, or ranges of constants, for this function. However, the function it loops over, percentcorrect, takes several seconds to run, so repeating it 10,000,000 times is not feasible (at even one second per call, that is well over 100 days of compute).
setwd("~/Desktop")
dat <- read.csv(file = "data.csv", sep = ",", header = TRUE)
attach(dat)
###################################################################################
voltoalph <- function(v, c) { # Where v is a data vector and c is a constant
  alpha <- c(rep(0, length(v)))
  for (i in 1:length(v)) {
    alpha[i + 1] <- abs(c * (v[i] / max(v, na.rm = TRUE)))
  }
  alpha
}
voltoenen <- function(v, c) { # Where v is a data vector and c is a constant
  enen <- c(rep(1, length(v)))
  for (i in 1:length(v)) {
    enen[i + 1] <- abs(v[i] / max(v, na.rm = TRUE))
  }
  enhelp <- 1 / enen
  enhelp2 <- c * (enhelp / max(enhelp))
  enhelp2
}
ema <- function(v, n, a) { # Where v is a data vector, n is a constant, and a is a vector of constants
  avevec <- c(rep(0, n))
  for (i in 1:n) {
    avevec[i] <- ((1 - a[i])^(n - i)) * v[i]
  }
  divvec <- c(rep(0, n))
  for (i in 1:n) {
    divvec[i] <- ((1 - a[i])^(n - i))
  }
  sum(avevec) / sum(divvec)
}
betaema <- function(v, n, a, l) { # Where v is a data vector, n and a are vectors, and l is a constant
  secondvec <- c(rep(0, length(v)))
  for (i in l:length(v)) {
    secondvec[i] <- ema(v[(i - n[i] + 1):i], n[i], a)
  }
  secondvec
}
#################################################################################################################
howright <- function(v, r, c, l) { # Where v and r are data vectors and c, l are constants
  rightvec <- 0
  for (i in l:(length(r) - c)) {
    if ((v[i] * mean(r[(i + 1):(i + c)])) > 0) { # mean of the next c values of r
      rightvec <- rightvec + 1
    }
  }
  rightvec / (length(r) - l - c)
}
#################################################################################################################
percentcorrect <- function(ca1, ca2, cn1, cn2, e, d, c, v, p, r) { ### V, P, and R are data vectors, rest constant
  vol1 <- voltoalph(v, ca1)
  vol2 <- voltoalph(v, ca2)
  ens1 <- voltoenen(v, cn1)
  ens2 <- voltoenen(v, cn2)
  als <- c(vol1)
  als2 <- c(vol2)
  n1 <- c(ens1)
  n2 <- c(ens2)
  anotherema1 <- betaema(p, n1, als, max(cn1, cn2))
  anotherema2 <- betaema(p, n2, als2, max(cn1, cn2))
  slope1 <- c(rep(0, length(p)))
  slope2 <- c(rep(0, length(p)))
  for (i in (max(cn1, cn2) + d):length(anotherema1)) {
    slope1[i] <- (anotherema1[i] - anotherema1[i - d]) / d
  }
  for (i in (max(cn1, cn2) + e):length(anotherema2)) {
    slope2[i] <- (anotherema2[i] - anotherema2[i - e]) / e
  }
  sig <- slope1 - slope2
  hvec <- howright(sig, r, c, max(cn1, cn2))
  hvec
}
##########################################################################################
correcttot <- function(v, p, r) { ### Where v, p, and r are data vectors
  correct3 <- array(0, dim = c(10, 10, 10, 10, 10, 10, 10))
  for (i in 1:10) {
    for (j in 1:10) {
      for (k in 1:10) {
        for (l in 1:10) {
          for (m in 2:10) {
            for (n in 2:10) {
              for (o in 1:10) {
                correct3[i, j, k, l, m, n, o] <- percentcorrect((i/10), (j/10), (20*k), (20*l), m, n, o, v, p, r)
              }
            }
          }
        }
      }
    }
  }
  print(correct3)
}
newvec <- correcttot(vl, p, rt) # run it on the vectors vl, p and rt
which(newvec == max(newvec), arr.ind = TRUE)
- Post some data too. Say 10-20 lines at random. Or if you can link to it over dropbox. (Mike Wise, Nov 22, 2015 at 17:21)
- Added the first 20-something lines of the data. Let me know if it is better to link the whole file. Thank you again Mike! (user3678028, Nov 22, 2015 at 17:28)
- filehosting.org/file/details/523562/newdata.csv Here is a link to the full dataset. (user3678028, Nov 22, 2015 at 17:35)
1 Answer
You have several loops that are completely unnecessary. Remove them and you will see dramatic speedups. Take voltoalph and voltoenen, for example:
voltoalph2 <- function(v, c) {
  # v: a data vector
  # c: a scalar constant
  # abs() kept from the original so results also match for negative v
  c(0, abs(c[1L] * (v / max(v, na.rm = TRUE))))
}
voltoenen2 <- function(v, c) {
  # v: a data vector
  # c: a scalar constant
  enhelp <- 1 / c(1, abs(v / max(v, na.rm = TRUE)))
  c[1L] * (enhelp / max(enhelp))
}
require(microbenchmark)
v <- 1:10000
microbenchmark(voltoalph(v, 2), voltoalph2(v, 2))
# Unit: microseconds
#              expr        min          lq        mean     median          uq        max neval cld
#   voltoalph(v, 2) 105004.652 106038.7245 108767.0230 106725.827 108326.7975 151482.035   100   b
#  voltoalph2(v, 2)     81.925     83.5825    141.3382     89.794     91.6065   5350.052   100  a
identical(voltoalph(v, 2), voltoalph2(v, 2))
# [1] TRUE
microbenchmark(voltoenen(v, 2), voltoenen2(v, 2))
# Unit: microseconds
#              expr        min         lq        mean     median        uq        max neval cld
#   voltoenen(v, 2) 103595.526 105433.803 108797.6308 106882.274 110160.65 148550.397   100   b
#  voltoenen2(v, 2)    200.275    203.182    227.6582    207.932    220.57   1007.375   100  a
identical(voltoenen(v, 2), voltoenen2(v, 2))
# [1] TRUE
The vectorized (non-loop) versions are over 1000 and 500 times faster (for voltoalph and voltoenen, respectively) and produce identical output. percentcorrect calls each of those functions twice, so the speedup counts double there.
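The same loop removal applies to the two slope loops inside percentcorrect. A minimal sketch of the idea (slope_lagged is a hypothetical helper name, not something from the post; it assumes start + d does not exceed length(x), which holds here since the original loops begin at start + d):
slope_lagged <- function(x, d, start) {
  # lagged difference quotient, mirroring the slope loops in percentcorrect:
  # out[i] <- (x[i] - x[i - d]) / d for i from (start + d) to length(x)
  out <- rep(0, length(x))
  idx <- (start + d):length(x)
  out[idx] <- (x[idx] - x[idx - d]) / d
  out
}
# inside percentcorrect, the two loops would then collapse to:
# slope1 <- slope_lagged(anotherema1, d, max(cn1, cn2))
# slope2 <- slope_lagged(anotherema2, e, max(cn1, cn2))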
- Welcome to Code Review! Good job on your first answer. (SirPython, Nov 22, 2015 at 18:45)
- That's really helpful. Can that same paradigm be used for a function where the references are non-constant? Say for betaema, where n and a change? (user3678028, Nov 22, 2015 at 18:53)
- Or more importantly for ema, because I imagine the 2 loops in that set the time back. (user3678028, Nov 22, 2015 at 19:09)
- @user3678028: ema can be vectorized. I don't think betaema can without a significant tradeoff in the amount of memory used. That said, it would be very simple to code in C/C++. (Joshua Ulrich, Nov 24, 2015 at 2:37)
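For reference, the ema vectorization mentioned in the last comment might look something like this (ema2 is a hypothetical name; like the original, it assumes a has at least n elements):
ema2 <- function(v, n, a) {
  # build the (1 - a[i])^(n - i) weights in one shot instead of two loops
  i <- seq_len(n)
  w <- (1 - a[i])^(n - i)
  sum(w * v[i]) / sum(w) # weighted average of the first n values of v
}
# quick sanity check against the original:
# v <- runif(50); a <- runif(50, 0, 0.5)
# all.equal(ema(v, 50, a), ema2(v, 50, a)) # TRUE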