I made a solution to this problem from HackerRank :
You are given a list of N people who are attending ACM-ICPC World Finals. Each of them are either well versed in a topic or they are not. Find out the maximum number of topics a 2-person team can know. And also find out how many teams can know that maximum number of topics.
Note Suppose a, b, and c are three different people, then (a,b) and (b,c) are counted as two different teams.
Input Format
The first line contains two integers, N and M, separated by a single space, where N represents the number of people, and M represents the number of topics. N lines follow. Each line contains a binary string of length M. If the ith line's jth character is 1, then the ith person knows the jth topic; otherwise, he doesn't know the topic.
My code is considered too slow (I think I am allowed 10 seconds, the code below takes more than 20 seconds when N and M are both 500):
import random
import itertools
import time
# n number of people
# m number of topics
n = 500
m = 500
"""
binary_string(n) and random_list(n_people, n_topic) just help
to generate test cases, irrelevant otherwise.
"""
def binary_string(n):
my_string = "".join(str(random.randint(0, 1)) for _ in range(n))
return my_string
def random_list(n_people, n_topic):
my_list = [binary_string(n_topic) for _ in range(n_people)]
return my_list
"""
the essential part is item_value(e1, e2)
and find_couples(comb_list)
"""
def item_value(e1, e2):
c = bin(int(e1, 2) | int(e2, 2))
return sum(int(i) for i in c[2:])
def find_couples(comb_list):
max_value = 0
counter = 0
for pair in itertools.combinations(comb_list, 2):
value = item_value(pair[0], pair[1])
if value == max_value:
counter += 1
elif value > max_value:
max_value = value
counter = 1
print(max_value)
print(counter)
return
topic = random_list(n, m)
print(topic)
start = time.time()
find_couples(topic)
end = time.time()
print(end - start)
In the context of the above algorithm, in what ways can I make it more efficient?
Is there a better algorithm?
-
\$\begingroup\$ Your code is considered too slow you say. How fast is it and how fast does it need to be? \$\endgroup\$Mast– Mast ♦2015年12月01日 10:09:43 +00:00Commented Dec 1, 2015 at 10:09
-
1\$\begingroup\$ It would help reviewers to include the sample inputs and outputs here \$\endgroup\$janos– janos2015年12月01日 10:27:31 +00:00Commented Dec 1, 2015 at 10:27
-
\$\begingroup\$ @janos On each run, the above code will generate highest order test case. \$\endgroup\$blackened– blackened2015年12月01日 10:28:56 +00:00Commented Dec 1, 2015 at 10:28
2 Answers 2
You're spending a lot of time in item_value
converting numbers to a binary string back to an int to get the sum. It would be a lot more expedient to just use str.count
to get the number of 1s in the string. That saves you a lot of type conversions as well as a call to sum
with a generator.
def item_value(e1, e2):
c = bin(int(e1, 2) | int(e2, 2))
return c[2:].count('1')
My results from this caused a reduction from 58.9s to 0.91s.
-
\$\begingroup\$ Thank you. Just out of curiosity, is there a one-liner equivalent of your code? \$\endgroup\$blackened– blackened2015年12月01日 10:46:09 +00:00Commented Dec 1, 2015 at 10:46
-
2\$\begingroup\$ @blackened If you really want you could
return bin(int(e1, 2) | int(e2, 2))[2:].count('1')
but that feels cluttered to me. \$\endgroup\$SuperBiasedMan– SuperBiasedMan2015年12月01日 10:47:04 +00:00Commented Dec 1, 2015 at 10:47
If you even have to add a comment:
# n number of people
Just rename n
to NUMBER_OF_PEOPLE
, prefer meaningful names over comments, comments may get obsolete as there is no automatic check on them.
Explore related questions
See similar questions with these tags.