I have a list of binary strings where each binary string max length is 15. I need to find list of integers which is count of similar (should differ by max 1 bit position) binary strings present in the list for each of the binary strings present in the list. This is current solution with time complexity of O(N2) but getting TLE for larger inputs.
#include <iostream>
#include <string>
#include <vector>
bool is_similar(const long long num) {
return num == 0 || (num & (num - 1)) == 0;
}
int main() {
const std::vector<std::string> str_nums {
"100", "110", "010", "011", "100"
};
std::vector<long long> nums;
for (const auto& str_num : str_nums)
nums.push_back(std::stoll(str_num, nullptr, 2));
const int n = nums.size();
std::vector<int> answers(n);
for (int i = 0; i < n - 1; ++i) {
for (int j = i + 1; j < n; ++j) {
if (is_similar(nums[i] ^ nums[j])) {
++answers[i];
++answers[j];
}
}
}
for (const auto answer : answers)
std::cout << answer << ", ";
std::cout << std::endl;
}
Is there an efficient way to do this?
-
3\$\begingroup\$ I have rolled back Rev 3 → 2. Please see What to do when someone answers. \$\endgroup\$Sᴀᴍ Onᴇᴌᴀ– Sᴀᴍ Onᴇᴌᴀ ♦2022年09月14日 16:17:33 +00:00Commented Sep 14, 2022 at 16:17
2 Answers 2
Toby Speight's answer already addresses improvements in your code style. This answer presents an improved algorithm that makes use of the fact that the strings are short.
Consider, instead of iterating over all existing strings to check if they are similar, we can iterate over all similar strings to check if they exist.
To do this, we need to store a mapping from each string to the number of times it appears (in my solution an unordered_map
).
Where n
is the number of strings and l
is the maximum length of each string, this solution has complexity O(n*l)
, compared to your original O(n*n+n*l)
.
#include <iostream>
#include <string>
#include <vector>
#include <unordered_map>
int main() {
const std::vector<std::string> str_nums {
"100", "110", "010", "011", "100"
};
std::vector<int> nums;
std::unordered_map<int, int> nums_cnt;
for (const auto& str_num : str_nums) {
const int encoded_value = std::stoll(str_num, nullptr, 2);
nums.push_back(encoded_value);
nums_cnt[encoded_value] += 1;
}
const int n = nums.size();
std::vector<int> answers(n);
for (int i = 0; i < n; i++) {
// add strings that are exactly the same (but don't include this one)
answers[i] += nums_cnt[nums[i]] - 1;
for (int bit = 0; bit <= 15; bit++) {
// add strings which differ exactly in this bit
answers[i] += nums_cnt[nums[i] ^ (1 << bit)];
}
}
for (const auto answer : answers)
std::cout << answer << ", ";
std::cout << std::endl;
}
-
\$\begingroup\$ On my system, replacing
std::unordered_map
withstd::map
improves speed by about 20%. \$\endgroup\$Toby Speight– Toby Speight2022年09月17日 15:01:43 +00:00Commented Sep 17, 2022 at 15:01
We're told that the maximum string length is 15, yet we're storing values as long long
. We could safely use a std::uint_fast16_t
instead. Alternatively, consider using a std::bitset<15>
, which handily comes with a constructor that converts from string.
The is_similar()
test is misleadingly named. I expected it to take two arguments, but it actually takes one argument and tests whether its bit count is one or less. It can be made both shorter and clearer using std::popcount()
:
static constexpr is_similar(unsigned difference) {
return std::popcount(difference) <= 1;
}
We're using signed types in many places where unsigned types are more suitable (e.g. I'd expect n
to be a std::size_t
, and consequently i
and j
too).
If we have repeated values in the input, we might be able to get a speed-up by using a Bag type (a histogram of values), and use that to reduce the pairings we need to consider.
Beyond that, you'll need to think of a better algorithm. One algorithm (suggested in comment by Deduplicator - thank you!) that's more likely to reduce the search space is to divide into buckets according to population count (i.e. number of set bits). If we use a std::bitset
as the storage type, that information is readily accessible as .count()
. Then we only need to compare adjacent buckets.
Another way that might work is to divide the input into buckets according to the top N bits (for some good choice of N). Then, we only have to consider pairs of buckets that are 1 step apart, and count how many values are common to both buckets, and within each bucket count how many values are 1 step apart (we could make this recursive, using multiple levels of bit-slicing).
I'm going to develop this idea, but first we need to split the program so that we have an independently testable (and measurable) function:
#include <cstdint>
#include <string>
#include <vector>
using Number = std::uint_fast16_t;
bool is_similar(const long long num) {
return num == 0 || (num & (num - 1)) == 0;
}
auto count_similar_pairs(const std::vector<Number>& nums)
{
const std::size_t n = nums.size();
std::vector<std::size_t> answers(n);
for (std::size_t i = 0; i < n - 1; ++i) {
for (std::size_t j = i + 1; j < n; ++j) {
if (is_similar(nums[i] ^ nums[j])) {
++answers[i];
++answers[j];
}
}
}
return answers;
}
auto count_similar_pairs(const std::vector<std::string>& str_nums)
{
std::vector<Number> nums;
for (const auto& str_num : str_nums)
nums.push_back(std::stoll(str_num, nullptr, 2));
return count_similar_pairs(nums);
}
#include <algorithm>
#include <iostream>
#include <iterator>
int main()
{
const std::vector<std::string> str_nums {
"100", "110", "010", "011", "100"
};
auto answers = count_similar_pairs(str_nums);
std::ranges::copy(answers, std::ostream_iterator<std::size_t>(std::cout, ", "));
std::cout << '\n';
}
Now we can replace main()
with some tests (I'll use Google Test because it's convenient for me; other test frameworks will do equally well):
#include <gtest/gtest.h>
#include <numeric>
TEST(CountPairs, FromQuestion)
{
const std::vector<std::string> str_nums {
"100", "110", "010", "011", "100"
};
const std::vector<std::size_t> expected {
2, 3, 2, 1, 2
};
EXPECT_EQ(count_similar_pairs(str_nums), expected);
}
TEST(CountPairs, BigInput)
{
std::vector<Number> nums(20000);
std::iota(nums.begin(), nums.end(), 20);
// No assertion here - we're just interested in execution time
count_similar_pairs(nums);
}
Now we can start implementing the algorithm. How many bits should we consider at a time? Taking things to the limit, I suggest one bit at a time. Then the algorithm looks like this:
- Divide the N-bit inputs into two sets those that begin with 0 and those that begin with 1, and mask them down to N-1 bits.
- Count numbers that are present in both sets (we can do this in a single pass if both sets are sorted, just like how we merge two sorted lists)
- For each set, count how many pairs in that set (sharing the same prefix) are near-neighbours. This is a recursive call into our algorithm. If the bit-width reaches zero, then we just count each element.
If we sort the numbers into ascending order, then they are naturally arranged how we want them, and we can divide into sets using subranges. So let's sort them, remembering which answer count is associated with each number:
using count_iter = std::vector<std::size_t>::iterator;
struct num_pair {
Number value;
count_iter output;
};
using num_vector = std::vector<num_pair>;
auto count_similar_pairs(std::vector<Number> const& nums)
{
auto const n = nums.size();
std::vector<Number> counts(n);
num_vector nv;
nv.reserve(n);
auto it = counts.begin();
for (auto n: nums) {
nv.emplace_back(n, it++);
}
std::ranges::sort(nv, {}, &num_pair::value);
/// TODO: call our recursive function here
return counts;
}
With this structure in place, we can now implement our recursive function.
void count_similar_pairs(std::ranges::random_access_range auto& nums,
Number mask)
{
if (!mask) {
// We've reached the end of the recursion.
// We just need to account for the identical pairs in input.
auto it = nums.begin();
while ((it = std::ranges::adjacent_find(it, nums.end(), {}, &num_pair::value)) != nums.end()) {
auto const last = std::ranges::adjacent_find(it, nums.end(), std::not_equal_to<>{}, &num_pair::value);
// each of the N values in this range is similar to the other N-1
auto dup_count = std::distance(it, last) - 1;
for (auto i = it; i != last; ++i) {
*i->output += dup_count;
}
it = last;
}
return;
}
assert(std::has_single_bit(mask));
// Create two views, one each for the mask bit unset and set
// i.e. n < mask and n >= mask
auto split_point = std::ranges::lower_bound(nums, mask, {}, &num_pair::value);
auto view_0 = std::ranges::subrange(nums.begin(), split_point);
auto view_1 = std::ranges::subrange(split_point, nums.end());
// strip top bit off each number
std::ranges::for_each(nums, [mask](auto& e) { e.value &= ~mask; });
// find matching values
// We can't use std::ranges::set_intersection() for this, because
// it undercounts when a value exists multiple times in either set.
for (auto const& e: view_0) {
auto values_1 = std::ranges::equal_range(view_1, e.value, {}, &num_pair::value);
*e.output += values_1.size();
std::ranges::for_each(values_1, [](auto const& e) { ++*e.output; });
}
count_similar_pairs(view_0, mask >> 1);
count_similar_pairs(view_1, mask >> 1);
return;
}
And invoke it:
std::ranges::sort(nv, {}, &num_pair::value);
count_similar_pairs(nv, 1u << 14); // magic constant from question
return counts;
This ought to scale as O(n log n), albeit with higher constant factor than the original O(n2) code, since we call equal_range
(O(log n)) for each element of the view (O(n)). We can add some more tests to demonstrate this. The results I get are:
Input size | Original code | apilat code | Improved code |
---|---|---|---|
5 | 0 ms | 0 ms | 0 ms |
20000 | 157 ms | 7 ms | 3 ms |
200000 | 16,050 ms | 50 ms | 15 ms |
2000000 | timed out | 521 ms | 124 ms |
20000000 | timed out | bad_alloc |
1421 ms |
Additionally, the divide-and-conquer nature of the improved algorithm lends itself to parallelisation (though this answer is already long enough, so I won't demonstrate that).
Here's the full code I ended up with, including tests and main()
function:
#include <algorithm>
#include <bit>
#include <cassert>
#include <cstdint>
#include <ranges>
#include <string>
#include <vector>
using Number = std::uint_fast16_t;
using count_iter = std::vector<std::size_t>::iterator;
struct num_pair {
Number value;
count_iter output;
};
using num_vector = std::vector<num_pair>;
void count_similar_pairs(std::ranges::random_access_range auto& nums,
Number mask)
{
if (!mask) {
// We've reached the end of the recursion.
// We just need to account for the identical pairs in input.
auto it = nums.begin();
while ((it = std::ranges::adjacent_find(it, nums.end(), {}, &num_pair::value)) != nums.end()) {
auto const last = std::ranges::adjacent_find(it, nums.end(), std::not_equal_to<>{}, &num_pair::value);
// each of the N values in this range is similar to the other N-1
auto dup_count = std::distance(it, last) - 1;
for (auto i = it; i != last; ++i) {
*i->output += dup_count;
}
it = last;
}
return;
}
assert(std::has_single_bit(mask));
// Create two views, one each for the mask bit unset and set
// i.e. n < mask and n >= mask
auto split_point = std::ranges::lower_bound(nums, mask, {}, &num_pair::value);
auto view_0 = std::ranges::subrange(nums.begin(), split_point);
auto view_1 = std::ranges::subrange(split_point, nums.end());
// strip top bit off each number
std::ranges::for_each(nums, [mask](auto& e) { e.value &= ~mask; });
// find matching values
// We can't use std::ranges::set_intersection() for this, because
// it undercounts when a value exists multiple times in either set.
if (view_0.size() > view_1.size()) {
// Reduce number of calls to equal_range()
std::swap(view_0, view_1);
}
for (auto const& e: view_0) {
auto values_1 = std::ranges::equal_range(view_1, e.value, {}, &num_pair::value);
*e.output += values_1.size();
std::ranges::for_each(values_1, [](auto const& e) { ++*e.output; });
}
count_similar_pairs(view_0, mask >> 1);
count_similar_pairs(view_1, mask >> 1);
return;
}
auto count_similar_pairs(std::ranges::random_access_range auto const& nums)
{
auto const n = nums.size();
std::vector<std::size_t> counts(n);
num_vector nv;
nv.reserve(n);
auto it = counts.begin();
for (auto n: nums) {
nv.emplace_back(n, it++);
}
std::ranges::sort(nv, {}, &num_pair::value);
count_similar_pairs(nv, 1u << 14); // magic constant from question
return counts;
}
auto count_similar_pairs(const std::vector<std::string>& str_nums)
{
std::vector<Number> nums;
for (const auto& str_num : str_nums)
nums.push_back(std::stoll(str_num, nullptr, 2));
return count_similar_pairs(nums);
}
#define UNIT_TEST
#ifdef UNIT_TEST
static bool is_similar(const long long num) {
return num == 0 || (num & (num - 1)) == 0;
}
auto original(const std::vector<Number>& nums)
{
const std::size_t n = nums.size();
std::vector<Number> answers(n);
for (std::size_t i = 0; i < n - 1; ++i) {
for (std::size_t j = i + 1; j < n; ++j) {
if (is_similar(nums[i] ^ nums[j])) {
++answers[i];
++answers[j];
}
}
}
return answers;
}
#include <gtest/gtest.h>
#include <numeric>
TEST(CountPairs, FromQuestion)
{
const std::vector<std::string> str_nums {
"100", "110", "010", "011", "100"
};
const std::vector<Number> expected {
2, 3, 2, 1, 2
};
EXPECT_EQ(count_similar_pairs(str_nums), expected);
}
TEST(CountPairs, BigInput_Values)
{
std::vector<Number> nums(20000);
std::iota(nums.begin(), nums.end(), 20);
EXPECT_EQ(count_similar_pairs(nums), original(nums));
}
TEST(CountPairs, BigInput_Slow)
{
std::vector<Number> nums(20000);
std::iota(nums.begin(), nums.end(), 20);
// No assertion here - we're just interested in execution time
original(nums);
}
TEST(CountPairs, BigInput_Fast)
{
std::vector<Number> nums(20000);
std::iota(nums.begin(), nums.end(), 20);
count_similar_pairs(nums);
}
TEST(CountPairs, VeryBigInput_Slow)
{
std::vector<Number> nums(200000);
std::iota(nums.begin(), nums.end(), 20);
original(nums);
}
TEST(CountPairs, VeryBigInput_Fast)
{
std::vector<Number> nums(200000);
std::iota(nums.begin(), nums.end(), 20);
count_similar_pairs(nums);
}
TEST(CountPairs, HugeInput_Fast)
{
std::vector<Number> nums(2000000);
std::iota(nums.begin(), nums.end(), 20);
count_similar_pairs(nums);
}
#else
#include <iostream>
#include <iterator>
int main()
{
const std::vector<std::string> str_nums {
"100", "110", "010", "011", "100"
};
auto answers = count_similar_pairs(str_nums);
std::ranges::copy(answers, std::ostream_iterator<std::size_t>(std::cout, ", "));
std::cout << '\n';
}
#endif
I came back to this question, and refined the implementation further to make it more readable and performant.
The changes I made:
- Provide conversions from the
num_pair
type, and a comparison operator, so it appears as aNumber
in most contexts (this means we no longer need to pass a projection to standard algorithms). - Replace the
equal_range()
calls with linearfind_adjacent()
, giving better memory locality. - Recognised that in the final recursive step, all the values are equal, so removed a loop.
- Terminate recursion early when we have fewer than two elements. (this gives no benefit on the larger test-cases, where every single 15-bit value is present many times over).
struct num_pair {
Number value;
std::reference_wrapper<Number> count;
constexpr operator Number const&() const { return value; };
constexpr operator Number&() { return value; };
constexpr auto operator<=>(const num_pair& other) const
{
return value <=> other.value;
}
};
// Return the longest initial subrange whose elements all have the same value
auto equal_prefix(std::ranges::forward_range auto range)
{
auto it = std::ranges::adjacent_find(range, std::not_equal_to<Number>{});
if (it != range.end()) {
++it;
}
return std::ranges::subrange(range.begin(), it);
}
void count_similar_pairs(std::ranges::random_access_range auto nums,
Number mask)
{
namespace r = std::ranges;
if (nums.size() <= 1) {
// not enough elements to have any duplicates
return;
}
if (!mask) {
// We've reached the end of the recursion.
// each of the N values in this range is similar to the other N-1
auto const dup_count = nums.size() - 1;
for (auto i: nums) {
i.count += dup_count;
}
return;
}
assert(std::has_single_bit(mask));
// Create two views, one each for the mask bit unset and set
// i.e. n < mask and n >= mask
auto split_point = r::lower_bound(nums, mask);
auto const view_a = r::subrange(nums.begin(), split_point);
auto const view_b = r::subrange(split_point, nums.end());
// strip top bit off each number
r::for_each(nums, [mask](auto& e) { e &= ~mask; });
// find matching values
// We can't use r::set_intersection() for this, because
// it undercounts when a value exists multiple times in either set.
{
// From C++23, we can use chunk_by view to walk the distinct
// values in views a and b.
auto a = view_a;
auto b = view_b;
while (a && b) {
if (b.front() < a.front()) {
std::swap(a, b);
}
if (a.front() < b.front()) {
a = r::subrange(r::lower_bound(a, b.front().value), a.end());
} else {
// values are equal - count them
auto ca = equal_prefix(a);
auto cb = equal_prefix(b);
for (auto i: ca) {
i.count += cb.size();
}
for (auto i: cb) {
i.count += ca.size();
}
// advance both views to next element
a = r::subrange(ca.end(), a.end());
b = r::subrange(cb.end(), b.end());
}
}
}
// recurse to find any similar pairs in each sub set
mask >>= 1;
count_similar_pairs(view_a, mask);
count_similar_pairs(view_b, mask);
}
Input size | Original code | First version | Second Version |
---|---|---|---|
5 | 0 ms | 0 ms | 0 ms |
20000 | 157 ms | 3 ms | 1 ms |
200000 | 16,050 ms | 15 ms | 10 ms |
2000000 | timed out | 124 ms | 134 ms |
20000000 | 1421 ms | 1417 ms |
-
\$\begingroup\$ That's a prefix-tree., or suffix-tree Or a trie, if you prefer. All those cache-misses due to jumping around though... \$\endgroup\$Deduplicator– Deduplicator2022年09月16日 10:00:04 +00:00Commented Sep 16, 2022 at 10:00
-
\$\begingroup\$ So it is! Answering this one has been educational for me. :) I hope to get time to compare this to sorting by popcount. \$\endgroup\$Toby Speight– Toby Speight2022年09月16日 10:52:17 +00:00Commented Sep 16, 2022 at 10:52
-
\$\begingroup\$ Thank you so much for this detailed analysis and solution. \$\endgroup\$Harry– Harry2022年09月18日 10:39:08 +00:00Commented Sep 18, 2022 at 10:39
Explore related questions
See similar questions with these tags.