I have a very big array which looks similar to this
var counts = ["gfdg 34243","jhfj 543554",....] //55268 elements long
this is my current loop
var replace = "";
var scored = 0;
var qgram = "";
var score1 = 0;
var len = counts.length;
function score(pplaintext1) {
qgram = pplaintext1;
for (var x = 0; x < qgram.length; x++) {
for (var a = 0, len = counts.length; a < len; a++) {
if (qgram.substring(x, x + 4) === counts[a].substring(0, 4)) {
replace = parseInt(counts[a].replace(/[^1-9]/g, ""));
scored += Math.log(replace / len) * Math.LOG10E;
} else {
scored += Math.log(1 / len) * Math.LOG10E;
}
}
}
score1 = scored;
scored = 0;
} //need to call the function 1000 times roughly
I have to loop through this array several times and my code is running slowly. My question is what the fastest way to loop through this array would be so I can save as much time as possible.
-
2Define "very big": 10? 100? 1M? 1P elements?Stefano Sanfilippo– Stefano Sanfilippo2014年07月26日 10:33:01 +00:00Commented Jul 26, 2014 at 10:33
-
@Qantas94Heavy is telling you that unless this portion of the code is the one that actually slows down the program as a whole, you should not care about it and go for the obvious way.Stefano Sanfilippo– Stefano Sanfilippo2014年07月26日 10:34:19 +00:00Commented Jul 26, 2014 at 10:34
-
Show use the code inside loop, there is probably a problem with performance. You could try with sorting array to get better results.Mateusz Nowak– Mateusz Nowak2014年07月26日 10:34:48 +00:00Commented Jul 26, 2014 at 10:34
-
@user3184807: a "bottleneck" means something that slows down a process or activity.Qantas 94 Heavy– Qantas 94 Heavy2014年07月26日 10:35:34 +00:00Commented Jul 26, 2014 at 10:35
-
so the first element seems like a key, and the number as an associated value. Are the keys unique ?GameAlchemist– GameAlchemist2014年07月26日 11:16:48 +00:00Commented Jul 26, 2014 at 11:16
1 Answer 1
Your counts array appears to be a list of unique strings and values associated with them. Use an object instead, keyed on the unique strings, e.g.:
var counts = { gfdg: 34243, jhfj: 543554, ... };
This will massively improve the performance by removing the need for the O(n) inner loop by replacing it with an O(1) object key lookup.
Also, avoid divisions - log(1 / n) = -log(n) - and move loop invariants outside the loops. Your log(1/len) * Math.LOG10E is actually a constant added in every pass, except that in the first if branch you also need to factor in Math.log(replace), which in log math means adding it.
p.s. avoid using the outer scoped state variables for the score, too! I think the below replicates your scoring algorithm correctly:
var len = Object.keys(counts).length;
function score(text) {
var result = 0;
var factor = -Math.log(len) * Math.LOG10E;
for (var x = 0, n = text.length - 4; x < n; ++x) {
var qgram = text.substring(x, x + 4);
var replace = counts[qgram];
if (replace) {
result += Math.log(replace) + factor;
} else {
result += len * factor; // once for each ngram
}
}
return result;
}
4 Comments
O(1) instead of O(n), and because it avoids the need to use substr on the count values.counts list. In mine I'm only adding len * 1/n if the qgram isn't found in the counts list at all, so if the qgram is found it's missing factor * (len - 1) / n