I'm starting with a large array containing nested objects (~34,000 items), and I'm polling every 5 seconds, receiving updates to the array (generally 2-3 updates at a time) and the new array only contains items that have changed (so for example, if only 3 items changed, I only get those 3 items in my updated array.)
I tried the following using underscore, but it takes up to 9 seconds, I need like a second or less.
getUniqueUnion(new, old) {
return uniq(union(new, old), false, function(item) {
return item.info.id;
});
}
Anyone have a suggestion for a fast approach? If it helps, the unique key would be info.id, and the only change I'm really looking to detect (reason I need to update the item) is if the inStock value changes. In other words, if the inStock value hasn't changed, I dont need to make any update to that array item.
Example arrays:
const originalData = [
{
info: {
id: 1
},
moreInfo: {
name: 'hamburger',
inStock: true
}
},
... // 33,999 more
]
const updatedData = [
{
info: {
id: 1
},
moreInfo: {
name: 'hamburger',
inStock: false
}
},
... // maybe 2 more
]
So the final array would include the first item to have moreInfo.inStock === false
2 Answers 2
After reading your situation, here is my suggestions:
- Sort your large array. You only have to do this once, so don't worry about it. Finding a value in a sorted array is always faster and easier than in an unsorted one.
- Loop through your
updatedData
, take an item, look for it in your sorted array (originalData
), using Binary Search. Update it as you want.
To see how Binary Search is implemented in javascript, you can refer to this post, or this one.
I was definitely over-complicating this. Here's a solution I came up with, but if anyone knows of something faster I'd appreciate it.
function updateArray(latest, current) {
for (let i = 0, len = current.length; i < len; i++) {
for (let g = 0, len = latest.length; g < len; g++) {
if (latest[g].module.id === current[i].module.id) {
current[i] = latest[g];
break;
}
}
}
return current;
}
-
if the length of
latest
andcurrent
are similar, this could still be pretty slowO(n^2)
(assuming n is close to m. If you keep thecurrent
array sorted byid
you could use binary search to find the element that you need to update inO(log n)
time instead ofO(n)
Hunter McMillen– Hunter McMillen2015年10月05日 20:45:51 +00:00Commented Oct 5, 2015 at 20:45 -
Since info.id is unique, why not use a hash instead. Then all you need to do is to go through the latest array and simply overwrite the matching entry in the current hash.subdigit– subdigit2015年10月06日 03:54:07 +00:00Commented Oct 6, 2015 at 3:54
-
latest
will be very small, say and array of 1-3 itemsBen– Ben2015年10月06日 21:37:49 +00:00Commented Oct 6, 2015 at 21:37
id
of the entry you need to update you should look for thatid
then update only that entry instead of re-writing the entire arrayoriginalData
. You get theupdatedData
by polling. Now, you want to update the data inoriginalData
base on new data fromupdatedData
. Is that right?