2

I have an array of arrays containing time-series data as follows:

var timeseries = [
 [Tue Dec 31 2013 00:00:00 GMT+0000 (GMT Standard Time), 3, 2],
 [Tue Dec 31 2013 01:00:00 GMT+0000 (GMT Standard Time), 1.2, 3],
 [Tue Dec 31 2013 02:00:00 GMT+0000 (GMT Standard Time), 4, 2]
]

The first item in the nested array is a javascript date.

I have a second array which contains a list of unique javascript dates, but that may or may not already exist in the timeseries array:

var newTimes = [
 Tue Dec 31 2013 00:13:00 GMT+0000 (GMT Standard Time), 
 Tue Dec 31 2013 01:00:00 GMT+0000 (GMT Standard Time), 
 Tue Dec 31 2013 01:40:00 GMT+0000 (GMT Standard Time)
]

For each item in the newTimes array, I need to check whether that date exists in the arrays stored with the timeseries array. If it does not, I would like to insert a new array in the correct place (chronological order) into the timeseries array, and - crucially - copy the other array values from the directly preceding array. For example, combining the two above array as described would yeild:

var newTimeseries = [
 [Tue Dec 31 2013 00:00:00 GMT+0000 (GMT Standard Time), 3, 2],
 [Tue Dec 31 2013 00:13:00 GMT+0000 (GMT Standard Time), 3, 2],
 [Tue Dec 31 2013 01:00:00 GMT+0000 (GMT Standard Time), 1.2, 3],
 [Tue Dec 31 2013 01:40:00 GMT+0000 (GMT Standard Time), 1.2, 3],
 [Tue Dec 31 2013 02:00:00 GMT+0000 (GMT Standard Time), 4, 2]
]

I should note both arrays are already sorted. Also, items in newTimes will exist within the range of timeseries. Finally, the original timeseries array has an entry every hour, on the hour.

I have tried a couple of different methods but I am interested in the computationally most efficient approach. I am happy to use any appropriate well adopted third party libraries such as underscore.js if helpful.

asked Jan 6, 2014 at 8:12
15
  • 1
    A linked list comes to mind... Commented Jan 6, 2014 at 8:16
  • 1
    Well, a linked list would give you O(1) complexity for insertions and O(n) for searching. I think you'll find that "most efficient" is going to depend on how often you get a hit/miss. Commented Jan 6, 2014 at 8:16
  • Depending on the list size, you could probably insert everything, then transverse the list once to either copy the extra data or remove the double entries. Just brainstorming here :) Commented Jan 6, 2014 at 8:21
  • Storing unix timestamps instead of a date objects in your array i guess will help speed it up (when it comes to looking up the correct position to insert) Commented Jan 6, 2014 at 8:22
  • The list will vary in size. Usual use case range might be 150 entries, up to about 4000 Commented Jan 6, 2014 at 8:23

1 Answer 1

1

Having considered a Binary Search (suggested by techfoobar in the comments), and the Bubble Sort (friendzis), the most optimal solution I have come up with so far is following this logic.

As opposed to using a Binary Search, consider the following.

Because we know that both arrays are already sorted; the dates within newTimes fall within the range of timeseries dates; and that the initial timeseries entries are on the hour, every hour, we can find the difference between timeseries[0][0] date and newTimes[0] in hours. This is the initial offset required to start the bubble sort approach (more like just a standard loop and slice as discussed in the comments above). This uses the Moment.js library.

Code presented below, please feel free to suggest changes:

var firstEvent = moment(newTimes[0]);
var firstTime = moment(timeseries[0][0]) ;
var offset = firstEvent.diff(firstTime, 'hours') - 1; 
for (var i = offset; i < timeseries.length;i++) { 
 while (newTimes.length > 0 && typeof moment(timeseries[i+1]) !== "undefined" && moment(newTimes[0]) > moment(timeseries[i][0]) && moment(newTimes[0]) < moment(timeseries[i+1][0])) {
 var newElement = timeseries[i].slice(0);
 newElement[0] = moment(newTimes[0]).toDate();
 timeseries.splice(i+1, 0, newElement);
 newTimes.splice(0,1);
 i++;
 } 
}

Importantly, this copes with the edge case when there are two or more entries in the newTimes array that all fall within within just one timeseries entry, for example:

var timeseries = [
 [Tue Dec 31 2013 00:00:00 GMT+0000 (GMT Standard Time), 3, 2],
 [Tue Dec 31 2013 01:00:00 GMT+0000 (GMT Standard Time), 1.2, 3]
]
var newTimes = [
 Tue Dec 31 2013 00:13:00 GMT+0000 (GMT Standard Time), 
 Tue Dec 31 2013 00:16:00 GMT+0000 (GMT Standard Time)
]
answered Jan 6, 2014 at 16:32

Comments

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.