I'm working with multiple ordered lists, each containing tuples of the form (key, value), where each list is sorted in ascending order by key. I want to merge these lists into a single ordered list that includes every distinct key from the input lists. The merged output should be a list of tuples in the form:
(key, value_from_list1, value_from_list2, ..., value_from_listK)
with the following behavior:
- For each key present in the union of all keys, the output tuple should contain the "current" value from each list.
- If an input list does not have an entry at that key, it should carry forward the most recent value from that list (i.e., use the value from the latest key less than or equal to the current key).
For example, given two lists:
A = [(1, a1), (2, a2), (10, a3)]
B = [(1, b1), (8, b2), (10, b3)]
the desired output is:
C = [
(1, a1, b1), // At key 1, both lists provide values.
(2, a2, b1), // At key 2, list A updates to a2; list B still uses b1.
(8, a2, b2), // At key 8, list B updates to b2; list A remains at a2.
(10, a3, b3) // At key 10, both lists update.
]
This merging behavior should extend to more than two lists as well (e.g., merging 5 lists into tuples with 6 elements, where the first element is the key).
All input lists are assumed to implicitly contain the element (0, null), and the corresponding output tuple (0, null, null, ..., null) is omitted from the final result. This ensures that, when merging, a list that has no entry yet for a given key carries forward its most recent value (initially null). For example, merging A = [(1, a1)] and B = [(2, b1)] yields [(1, a1, null), (2, a1, b1)].
My Questions:
- Algorithm Efficiency: What is the most efficient algorithm for performing this merge?
I've considered a solution using a sweep-line approach with multiple pointers, one per list, to keep track of the most recent value from each list as I iterate through the union of keys. However, I'm looking for insights or more efficient methods, especially in the general case of k lists.
Edit 2025-03-13
Pseudocode
I've created pseudocode based on the suggestion from @Smylic.
function mergeLists(lists):
    k = number of lists
    pointers = array of size k, all initialized to 0
    lastValues = array of size k, initially set to None (or some default)
    output = empty list
    while true:
        minKey = None
        // Find the minimum key among the current pointers in all lists.
        for i from 0 to k-1:
            if pointers[i] < length(lists[i]):
                currentKey = lists[i][pointers[i]].key
                if minKey is None or currentKey < minKey:
                    minKey = currentKey
        // If no minimum key was found, all lists are exhausted.
        if minKey is None:
            break
        // Update lastValues for each list that has the current minKey.
        for i from 0 to k-1:
            if pointers[i] < length(lists[i]) and lists[i][pointers[i]].key == minKey:
                lastValues[i] = lists[i][pointers[i]].value
                pointers[i] = pointers[i] + 1
        // Append the merged tuple (minKey, lastValues[0], lastValues[1], ..., lastValues[k-1]).
        output.append( (minKey, lastValues[0], lastValues[1], ..., lastValues[k-1]) )
    return output
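For reference, below is a direct Python translation of this pseudocode, followed by the examples from the question. It is a sketch under the assumption that each input list is a Python list of (key, value) tuples sorted by key, with null represented as None and the (0, null) sentinel left implicit.

    def merge_lists(lists):
        # lists: k lists of (key, value) tuples, each sorted by key.
        k = len(lists)
        pointers = [0] * k        # next unread position in each list
        last_values = [None] * k  # most recent value seen per list (the implicit (0, null))
        output = []
        while True:
            # Find the minimum key among the current heads of all lists.
            min_key = None
            for i in range(k):
                if pointers[i] < len(lists[i]):
                    current_key = lists[i][pointers[i]][0]
                    if min_key is None or current_key < min_key:
                        min_key = current_key
            if min_key is None:   # every list is exhausted
                break
            # Advance each list whose head carries min_key and remember its value.
            for i in range(k):
                if pointers[i] < len(lists[i]) and lists[i][pointers[i]][0] == min_key:
                    last_values[i] = lists[i][pointers[i]][1]
                    pointers[i] += 1
            output.append((min_key,) + tuple(last_values))
        return output

    A = [(1, "a1"), (2, "a2"), (10, "a3")]
    B = [(1, "b1"), (8, "b2"), (10, "b3")]
    print(merge_lists([A, B]))
    # [(1, 'a1', 'b1'), (2, 'a2', 'b1'), (8, 'a2', 'b2'), (10, 'a3', 'b3')]
    print(merge_lists([[(1, "a1")], [(2, "b1")]]))
    # [(1, 'a1', None), (2, 'a1', 'b1')]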
- I do not see a significant difference from a "standard $k$-way merge" (en.wikipedia). – greybeard, Mar 12 at 18:39
- @greybeard This is similar to a k-way merge but not quite the same. A standard k-way merge only merges existing elements, while here I need to carry forward the most recent value from each list when a key is missing. This adds a state-tracking requirement, making it a bit more complex. – Isak, Mar 12 at 21:10
- Keeping $k$ values, one for each list? Insignificant difference! A more useful approach may characterise the operations required of the union of lists, and find a way to support that without needing up to $k$ times the memory. And possibly save time ($o(kn)$). – greybeard, Mar 13 at 7:56
- @greybeard Thanks, I've updated the post with some pseudocode. – Isak, Mar 13 at 12:11
- Please don't edit your post to include an answer (or implementation of an answer) in the post. We're not a forum, and we work differently from other sites you might be used to. For example, we're not looking for you to edit your post and add "answered: ..." with the answer. Instead, you could edit the answer to add your proposed pseudocode, so it is organized together with the description of the algorithm, in the answer "zone". – D.W. ♦, Mar 14 at 20:54
1 Answer
You can't do it faster than linear in the size of the output. That is, if you need to output $k$ values for every key, the most efficient algorithm takes $\Theta(k \cdot u)$ time, where $u$ is the number of unique keys. Assuming key comparison takes $\mathrm O(1)$ time, you can even avoid using a heap to get the next key: look at every pointer once to find the minimum key, then look at every pointer a second time, either to take the current value and advance the pointer or to reuse the previous value. A heap of keys may, however, decrease the underlying constant.
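For illustration, here is a sketch of the heap variant (my own illustration, not part of the answer): the heap keeps one entry per list, keyed on that list's next unread key, so finding the next minimum costs $\mathrm O(\log k)$ per pop/push instead of a full $\mathrm O(k)$ scan; assembling each output tuple still costs $\Theta(k)$, so the overall bound remains $\Theta(k \cdot u)$.

    import heapq

    def merge_lists_heap(lists):
        # lists: k lists of (key, value) tuples, each sorted by key; null is None.
        k = len(lists)
        last_values = [None] * k  # most recent value seen per list (the implicit (0, null))
        output = []
        # Heap entries are (key, list_index, position) for the next unread element of each list.
        heap = [(lst[0][0], i, 0) for i, lst in enumerate(lists) if lst]
        heapq.heapify(heap)
        while heap:
            min_key = heap[0][0]
            # Consume every list whose next element carries the current minimum key.
            while heap and heap[0][0] == min_key:
                _, i, pos = heapq.heappop(heap)
                last_values[i] = lists[i][pos][1]
                if pos + 1 < len(lists[i]):
                    heapq.heappush(heap, (lists[i][pos + 1][0], i, pos + 1))
            output.append((min_key,) + tuple(last_values))
        return output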
- Thanks a lot for your insights! I made pseudocode; please let me know if it's what you had in mind. It's added to my original post. I am not using a heap for it. – Isak, Mar 13 at 11:29
- @Isak Yes, it is what I'm talking about. – Smylic, Mar 14 at 4:37