Subset Component Task

Question 1

Problem

You are given an array with n 64-bit integers: d[0],d[1],....,d[n-1].

BIT(x, i) = (x >> i) & 1. (where B(x,i) is the i-th lower bit of x in binary form.)

If we regard every bit as a vertex of a graph G, there exists one undirected edge between vertex i and vertex j if there exists at least one k such that BIT(d[k], i) == 1 && BIT(d[k], j) == 1.

For every subset of the input array, how many connected-components are there in that graph? The number of connected components in a graph are the sets of nodes, which are accessible to each other, but not to/from the nodes in any other set.

For example if a graph has 6 nodes, labelled {1,2,3,4,5,6}. And contains the edges (1,2),(2,4),(3,5). There are 3 connected components: {1,2,4}, {3,5}, {6}. Because {1,2,4} can be accessed from each other through one or more edges, {3,5} can access each other and {6} is isolated from everyone else.

You only need to output the sum of the number of connected components in every graph.

Input Format
n
d[0] d[1] ... d[n - 1]
Constraints
1 <= n <= 20
0 <= d[i] <= 2^63 - 1
Output Format

Print the value of S.

Sample Input
3
2 5 9
Sample Output
504
Explanation

There are 8 subsets of {2,5,9}.

{} => We don't have any number in this subset => no edge in the graph => Every node is a component by itself => Number of connected-components = 64.

{2} => The Binary Representation of 2 is 10. There is a bit at only one position. => So there is no edge in the graph, every node is a connected-component by itself => Number of connected-components = 64.

{5} => The Binary Representation of 5 is 101. There is a bit at the 0th and 2nd position. => So there is an edge: (0, 2) in the graph => There is one component with a pair of nodes (0,2) in the graph. Apart from that, all remaining 62 vertices are indepenent components of one node each (1,3,4,5,6...63) => Number of connected-components = 63.

{9} => The Binary Representation of 9 is 1001. => There is a 1-bit at the 0th and 3rd position in this binary representation. => edge: (0, 3) in the graph => Number of components = 63

{2, 5} => This will contain the edge (0, 2) in the graph which will form one component => Other nodes are all independent components => Number of connected-component = 63

{2, 9} => This has edge (0,3) in the graph => Similar to examples above, this has 63 connected components

{5, 9} => This has edges (0, 2) and (0, 3) in the graph => Similar to examples above, this has 62 connected components

{2, 5, 9} => This has edges(0, 2) (0, 3) in the graph. All three vertices (0,2,3) make one component => Other 61 vertices are all independent components => Number of connected-components = 62

S = 64 +たす 64 +たす 63 +たす 63 +たす 63 +たす 63 +たす 62 +たす 62 =わ 504

Basic Idea

I've come up with a O(N*(2^N)) solution that meets the constraint requirements fine and passes all the test cases when submitted. The summary of the main idea here:

Go through each subset configuration {d1,...,dk} and calculate subsetConcat - the number of 1’s in d1|...|dk
65 - ones(subsetConcat) is the number of connected components for the subset.
Add it to the total number of connected components.

Code

#include <bitset>
#include <vector>
#include <unordered_map>
#include <iostream>
int main() {
 std::bitset<64> bitSet;
 int n;
 std::cin >> n;
 std::vector<uint64_t> d;
 for (uint8_t i = 0; i < n; i++) {
 uint64_t value;
 std::cin >> value;
 d.push_back(value);
 }
 std::unordered_map<uint64_t, uint8_t> dMap;
 for (uint8_t i = 0; i < n; i++) {
 bitSet = d[i];
 dMap[d[i]] = bitSet.count();
 }
 uint32_t subsetNumber = (1U << n);
 uint32_t connectedComponents = 64U;
 for (uint32_t i = 1; i < subsetNumber; i++) {
 uint64_t subsetConcat = 0;
 for (uint8_t j = 0; j < n; j++) {
 bool isDjPresent = (i >> j) & 1;
 if (isDjPresent && dMap[d[j]] > 1) {
 subsetConcat |= d[j];
 }
 }
 if (!subsetConcat) {
 connectedComponents += 64U; 
 } else {
 bitSet = subsetConcat;
 connectedComponents += (65U - bitSet.count());
 }
 }
 std::cout << connectedComponents << std::endl;
 return 0;
}

Questions

How readable is the code?
Can it be improved by using some nice features from C++14 or C++17?

Question 2

The code is semi readable, it would be more readable if there was some blank lines to separate the different parts of the logic. It would be even more readable if the code was broken up into functions, one function to get the input, another function to setup the bitset and the unordered map and a third function to calculate connectedComponents. The code seems to be misusing the bitset, and it would be interesting to know if the tests are passing or failing.

Question 3

@pacmaninbw Thanks for your feedback!

I've come up with a O(N*(2^N)) solution that meets the constraint requirements fine and passes all the test cases when submitted

.

Question 4

Clarify your code's structure

As pacmaninbw already mentioned, adding blank lines and moving bits into their own functions would help make the structure of your code more clear.

Avoid using `std::endl`

Just use '\n'; std::endl is equivalent to '\n' plus a flush of the output, which is usually not needed and will only decrease the performance of your program. See: https://stackoverflow.com/questions/213907/c-stdendl-vs-n

Create a function to count bits in a set

As pacmaninbw already mentioned, you are misusing std::bitset<> to count the number of bits. Instead of using the pattern bitSet = value; foo = bitSet.count(), create a function to count the number of bits set of a given value. You can start by just moving the std::bitset approach to this function:

static inline auto count_bits(uint64_t value) {
 return std::bitset<64>(value).count();
}

And then use it in other places in the code where appropriate. Then later you could change this function to something better. However, an optimizing compiler will turn the std::bitset approach into something nice and efficient anyway: https://godbolt.org/z/HwvqMb

C++20 will bring you the function std::popcount(), which is exactly what you want to use here. The main advantage is that it's constepxr, which std::bitset::count() unfortunately isn't.

Don't store what can be trivially (re)computed

Your dMap basically is there just to convert from a given value to the number of bits set in that value. Since CPUs are really fast at this kind of computation, it's not worth storing this into a map. The map approach has a huge overhead compared to the single popcount CPU instruction: it dynamically allocates memory to hold pairs of uint64_t and uint8_t, and it has to compute the hash of the input every time you access the map. So don't use it, and instead anywhere you wrote dMap[foo], write count_bits(foo).

Using nice features

There are lots of features that you can choose from, even some from before C++11, that would make your code a bit shorter. In particular, a problem like this screams "algorithms", so maybe there are some STL algorithms to help you out. Also, some C++11 features could be used in your code as well. The question is whether it makes the code more readable: some features are not used commonly, and so they are harder to understand for most people. So look for things that foremost improve the readability of the code.

Below is a selection of things I think you could have reasonably used.

Range-based for

Whenever you are looping over the items of a container, range-based for is usually something you should use. For example, you could read in the array from std::cin this way:

int n;
std::cin >> n;
std::vector<uint64_t> d(n);
for (auto &value: d)
 std::cin >> value;

Question 5

There is no single convention for C++ code. You can choose, just be consistent. You can also call it popcount() to match what you will get in C++20.

G. Sliepen G. Sliepen 68.7k3 gold badges74 silver badges179 bronze badges · Accepted Answer · 2019-12-20 22:28:32Z

Clarify your code's structure

As pacmaninbw already mentioned, adding blank lines and moving bits into their own functions would help make the structure of your code more clear.

Avoid using `std::endl`

Just use '\n'; std::endl is equivalent to '\n' plus a flush of the output, which is usually not needed and will only decrease the performance of your program. See: https://stackoverflow.com/questions/213907/c-stdendl-vs-n

Create a function to count bits in a set

As pacmaninbw already mentioned, you are misusing std::bitset<> to count the number of bits. Instead of using the pattern bitSet = value; foo = bitSet.count(), create a function to count the number of bits set of a given value. You can start by just moving the std::bitset approach to this function:

static inline auto count_bits(uint64_t value) {
 return std::bitset<64>(value).count();
}

And then use it in other places in the code where appropriate. Then later you could change this function to something better. However, an optimizing compiler will turn the std::bitset approach into something nice and efficient anyway: https://godbolt.org/z/HwvqMb

C++20 will bring you the function std::popcount(), which is exactly what you want to use here. The main advantage is that it's constepxr, which std::bitset::count() unfortunately isn't.

Don't store what can be trivially (re)computed

Your dMap basically is there just to convert from a given value to the number of bits set in that value. Since CPUs are really fast at this kind of computation, it's not worth storing this into a map. The map approach has a huge overhead compared to the single popcount CPU instruction: it dynamically allocates memory to hold pairs of uint64_t and uint8_t, and it has to compute the hash of the input every time you access the map. So don't use it, and instead anywhere you wrote dMap[foo], write count_bits(foo).

Using nice features

There are lots of features that you can choose from, even some from before C++11, that would make your code a bit shorter. In particular, a problem like this screams "algorithms", so maybe there are some STL algorithms to help you out. Also, some C++11 features could be used in your code as well. The question is whether it makes the code more readable: some features are not used commonly, and so they are harder to understand for most people. So look for things that foremost improve the readability of the code.

Below is a selection of things I think you could have reasonably used.

Range-based for

Whenever you are looping over the items of a container, range-based for is usually something you should use. For example, you could read in the array from std::cin this way:

int n;
std::cin >> n;
std::vector<uint64_t> d(n);
for (auto &value: d)
 std::cin >> value;

There is no single convention for C++ code. You can choose, just be consistent. You can also call it popcount() to match what you will get in C++20.

Stack Exchange Network

Subset Component Task

1 Answer 1

Clarify your code's structure

Avoid using `std::endl`

Create a function to count bits in a set

Don't store what can be trivially (re)computed

Using nice features

Range-based for

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Subset Component Task

1 Answer 1

Clarify your code's structure

Avoid using std::endl

Create a function to count bits in a set

Don't store what can be trivially (re)computed

Using nice features

Range-based for

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

Avoid using `std::endl`