6
$\begingroup$

I came up with the following data structure and I would like to know whether this data structure has been described in computer science literature and what it is called.

The data structure has a fixed size $N$ and it stores information about a set of $N$ elements which are identified by an index $i \in \{1,...,N\}$, specifically it stores whether each element is active or inactive. It supports the following operations in $O(1)$:

  • set an element as active
  • set an element as inactive
  • query the state of an element (active/inactive)
  • set all elements as active
  • set all elements as inactive

Additionally, the data structure allows you to iterate over the active elements in $O(\text{number of active elements})$ and iterate over the inactive elements in $O(\text{number of inactive elements})$. This data structure seems quite useful for a range of use cases.

The way this is implemented is by storing two arrays arr1 and arr2 of size $N$ and the number of active elements. Each array represents the elements of the set but it only stores the index of the element in the other array, so arr2[arr1[i]] == i and arr1[arr2[i]] == i. The difference between the arrays is that arr1 stores the elements in order and arr2 stores them such that active elements are ordered before inactive elements. In order to check if an element i is active we just have to read arr1[i] which gives us the index in the second sorted array. The element is inactive if and only if that index is greater than the number of active elements.

Let me give you an example to show how activation and iteration works. Let's start with a set of 5 inactive elements:

index (not stored): 1 2 3 4 5
──────────────────────────────────
 arr1: 1 2 3 4 5
 arr2: 1 2 3 4 5
 active_elements: 0

Say we now want to activate element 2, so we swap element 2 with the first inactive element (which in this case is element 1) and increment active_elements:

index (not stored): 1 2 3 4 5
──────────────────────────────────
 arr1: 2 1 3 4 5
 arr2: 2 1 3 4 5
 active_elements: 1

Since active_elements is now 1 and we know that arr2 is sorted, we know that the first element of arr2 is active and all the other elements are inactive. The first inactive element is element 1. Let's say we now want to activate element 4, so we need to swap it with the first inactive element (element 1). So in arr1 we just swap the elements at the indices 1 and 4, then we look at arr1 to find the indices of arr2 (in this case 2 and 4) and swap those in arr2. Finally, we increment active_elements.

index (not stored): 1 2 3 4 5
──────────────────────────────────
 arr1: 4 1 3 2 5
 arr2: 2 4 3 1 5
 active_elements: 2

Since active_elements is now 2 and we know that arr2 is sorted, we know that the elements 2 and 4 are active and the elements 3, 1, and 5 are inactive.

Deactivating an element works similarly to activating, by just swapping it with the last active element and decrementing active_elements.

Setting all elements to (in)active is just setting active_elements to 0 or $N$.

I can describe the algorithms in more detail but I'm assuming this is already a known data structure so I'll just refer to the papers that will hopefully be mentioned in the answers to this question.

asked Sep 28, 2024 at 8:43
$\endgroup$
3
  • 1
    $\begingroup$ you can refer this site - en.wikipedia.org/wiki/List_of_data_structures , if it isn't there then i doubt if this data-structure is well-known $\endgroup$ Commented Sep 29, 2024 at 18:09
  • $\begingroup$ @AdityaMishra I'm quite sure it is not a well-known data structure. The question is whether it is known at all in computer science. Or maybe there is a closely related data structure. $\endgroup$ Commented Sep 29, 2024 at 18:57
  • 1
    $\begingroup$ For many applications it's not necessary to set or unset all elements in constant time, for which we can simply use a bitset. $\endgroup$ Commented Sep 30, 2024 at 1:23

2 Answers 2

2
$\begingroup$

This is essentially a set (or two, of active/inactive elements). If the number of elements in the universe is small, a bit array is an efficient implementation, giving all operations (except setting all active/inactive) in $O(1)$ time. Setting all is $O(N)$, where $N$ is the size of the universe (constant in a particular use).

answered Oct 1, 2024 at 12:11
$\endgroup$
1
  • $\begingroup$ What if the number of elements in the universe isn't small? Also iteration is always $O(N)$ with a bit array, even if only a small number of elements are active. $\endgroup$ Commented Oct 2, 2024 at 7:29
0
$\begingroup$

Would a Roaring Bitmap work for your use case? They're essentially a very space-efficient, but also high-performance, sparse bit vector (i.e. a SortedSet<int>). Very widely used in big data circles.

answered Jan 9 at 21:22
$\endgroup$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.