Does this data structure have a name?

Question 1

I came up with the following data structure and I would like to know whether this data structure has been described in computer science literature and what it is called.

The data structure has a fixed size $N$ and it stores information about a set of $N$ elements which are identified by an index $i \in \{1,...,N\}$, specifically it stores whether each element is active or inactive. It supports the following operations in $O(1)$:

set an element as active
set an element as inactive
query the state of an element (active/inactive)
set all elements as active
set all elements as inactive

Additionally, the data structure allows you to iterate over the active elements in $O(\text{number of active elements})$ and iterate over the inactive elements in $O(\text{number of inactive elements})$. This data structure seems quite useful for a range of use cases.

The way this is implemented is by storing two arrays arr1 and arr2 of size $N$ and the number of active elements. Each array represents the elements of the set but it only stores the index of the element in the other array, so arr2[arr1[i]] == i and arr1[arr2[i]] == i. The difference between the arrays is that arr1 stores the elements in order and arr2 stores them such that active elements are ordered before inactive elements. In order to check if an element i is active we just have to read arr1[i] which gives us the index in the second sorted array. The element is inactive if and only if that index is greater than the number of active elements.

Let me give you an example to show how activation and iteration works. Let's start with a set of 5 inactive elements:

index (not stored): 1 2 3 4 5
──────────────────────────────────
 arr1: 1 2 3 4 5
 arr2: 1 2 3 4 5
 active_elements: 0

Say we now want to activate element 2, so we swap element 2 with the first inactive element (which in this case is element 1) and increment active_elements:

index (not stored): 1 2 3 4 5
──────────────────────────────────
 arr1: 2 1 3 4 5
 arr2: 2 1 3 4 5
 active_elements: 1

Since active_elements is now 1 and we know that arr2 is sorted, we know that the first element of arr2 is active and all the other elements are inactive. The first inactive element is element 1. Let's say we now want to activate element 4, so we need to swap it with the first inactive element (element 1). So in arr1 we just swap the elements at the indices 1 and 4, then we look at arr1 to find the indices of arr2 (in this case 2 and 4) and swap those in arr2. Finally, we increment active_elements.

index (not stored): 1 2 3 4 5
──────────────────────────────────
 arr1: 4 1 3 2 5
 arr2: 2 4 3 1 5
 active_elements: 2

Since active_elements is now 2 and we know that arr2 is sorted, we know that the elements 2 and 4 are active and the elements 3, 1, and 5 are inactive.

Deactivating an element works similarly to activating, by just swapping it with the last active element and decrementing active_elements.

Setting all elements to (in)active is just setting active_elements to 0 or $N$.

I can describe the algorithms in more detail but I'm assuming this is already a known data structure so I'll just refer to the papers that will hopefully be mentioned in the answers to this question.

Question 2

you can refer this site - en.wikipedia.org/wiki/List_of_data_structures , if it isn't there then i doubt if this data-structure is well-known

Question 3

@AdityaMishra I'm quite sure it is not a well-known data structure. The question is whether it is known at all in computer science. Or maybe there is a closely related data structure.

Question 4

For many applications it's not necessary to set or unset all elements in constant time, for which we can simply use a bitset.

Question 5

This is essentially a set (or two, of active/inactive elements). If the number of elements in the universe is small, a bit array is an efficient implementation, giving all operations (except setting all active/inactive) in $O(1)$ time. Setting all is $O(N)$, where $N$ is the size of the universe (constant in a particular use).

Question 6

What if the number of elements in the universe isn't small? Also iteration is always $O(N)$ with a bit array, even if only a small number of elements are active.

Question 7

Would a Roaring Bitmap work for your use case? They're essentially a very space-efficient, but also high-performance, sparse bit vector (i.e. a SortedSet<int>). Very widely used in big data circles.

vonbrand 14.3k3 gold badges42 silver badges52 bronze badges · Answer 1 · 2024-10-01 12:11:20Z

This is essentially a set (or two, of active/inactive elements). If the number of elements in the universe is small, a bit array is an efficient implementation, giving all operations (except setting all active/inactive) in $O(1)$ time. Setting all is $O(N)$, where $N$ is the size of the universe (constant in a particular use).

What if the number of elements in the universe isn't small? Also iteration is always $O(N)$ with a bit array, even if only a small number of elements are active.

Alex Cruise 1011 bronze badge · Answer 2 · 2025-01-09 21:22:43Z

Would a Roaring Bitmap work for your use case? They're essentially a very space-efficient, but also high-performance, sparse bit vector (i.e. a SortedSet<int>). Very widely used in big data circles.

Stack Exchange Network

Does this data structure have a name?

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Does this data structure have a name?

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions