I have built a program that handles the basic scheduling requests for CodinGame's Super Computer
challenge.
Specifications (taken from CodinGame)
The Goal
In the Computer2000 data center, you are responsible for planning the usage of a supercomputer for scientists. Therefore you've decided to organize things a bit by planning everybody’s tasks. The logic is simple: the higher the number of calculations which can be performed, the more people you can satisfy.
Rules
Scientists give you the starting day of their calculation and the number of consecutive days they need to reserve the calculator.
Example
Calculation Starting Day Duration
A 2 5
B 9 7
C 15 6
D 9 3
Calculation A starts on day 2 and ends on day 6
Calculation B starts on day 9 and ends on day 15
Calculation C starts on day 15 and ends on day 20
Calculation D starts on day 9 and ends on day 11
In this example, it’s not possible to carry out all the calculations because the periods for B and C overlap. 3 calculations maximum can be carried out: A, D and C.
Input
Line 1: The number N of calculations.
The N following lines: on each line, the starting day J and the duration D of reservation, separated by a blank space.
Output
The maximum number of calculations that can be carried out.
Example
Input:
4
2 5
9 7
15 6
9 3
Output: 3
Thing's I am looking for in this review
- Optimizations (memory/speed)
- Bugs
A couple more things
Which calculations removed (duped out) is irrelevant, I only care about # possible calculations
Regenerating of the list for
requested_days
may be slow, but having that list stored on the object becomes a memory issue.The tests written really weren't meant to be comprehensive, just something to get by for big mistakes. My real testing was done on CodinGame.
This code doesn't entirely pass, it says I need more optimizations to pass it's basic test cases. Plus I think there are a couple more that don't pass upon submitting. I will need to figure those out later once I figure out optimizing it more for speed.
sys.stderr
is a CodinGame thing to allow debugging statements without it counting as an answer.
Code
import sys
class Calculation(object):
def __init__(self, id, start_day, running_days):
self.id = id
self.start_day = start_day
self.running_days = running_days
self.conflicts = 0
@property
def requested_days(self):
return list(range(self.start_day, self.start_day + self.running_days))
def __repr__(self):
return "ID: {}, Conflicts {}, Requested Days {}".format(self.id, self.conflicts, self.requested_days)
def conflicts_with(self, iterable):
"""
An attempt to reduce iterations for cross checking two Calculation dates
"""
requested_days = self.requested_days
for x in iterable:
if x in requested_days:
return True
return False
def find_conflicts(calculations):
conflicts = list()
conflict_ids = set()
for calc in calculations:
for calc_2 in calculations:
if calc.id != calc_2.id:
if calc.conflicts_with(calc_2.requested_days):
calc.conflicts += 1
# To reduce iterations keeping track of ids and only adding the one conflict to the conflicts.
# If either are already in there already one will be removed solving the conflict.
# Since the end result is not which calculations can run but how many this works perfectly.
if calc_2.id not in conflict_ids and calc.id not in conflict_ids:
conflicts.append((calc, calc_2))
conflict_ids.add(calc.id)
conflict_ids.add(calc_2.id)
return conflicts
def remove_conflicts(conflicts):
removed = set()
for conflict in conflicts:
calc_1_removed = conflict[0].id in removed
calc_2_removed = conflict[1].id in removed
print("Checking conflict between {} and {}".format(conflict[0], conflict[1]), file=sys.stderr)
if not calc_1_removed and not calc_2_removed:
if conflict[0].conflicts == conflict[1].conflicts:
# Removing the one that was checked later because it may have another conflict and that conflict may
# have less causing both to removed.
remove_id = max([conflict[0], conflict[1]], key=lambda x: x.id).id
else:
remove_id = max([conflict[0], conflict[1]], key=lambda x: x.conflicts).id
removed.add(remove_id)
print("Removing {}".format(remove_id), file=sys.stderr)
return removed
def main():
calculations = list()
total_calculations = int(input())
for i in range(total_calculations):
start, duration = [int(j) for j in input().split()]
calculations.append(Calculation(i, start, duration))
conflicts = find_conflicts(calculations)
removed = remove_conflicts(conflicts)
print("Removed {}".format(removed), file=sys.stderr)
print(total_calculations - len(removed))
if __name__ == '__main__':
main()
# Tests
# import down here for easy of copying into program
from unittest import TestCase
class ScheduleTester(TestCase):
"""
Basic tests to verify that the conflict generation is working correctly
"""
def test_main(self):
"""
Input:
4
2 5
9 7
15 6
9 3
:return: 3
"""
calculations = list()
total_calculations = 4
calculations.append(Calculation(0, 2, 5))
calculations.append(Calculation(1, 9, 7))
calculations.append(Calculation(2, 15, 6))
calculations.append(Calculation(3, 9, 3))
conflicts = find_conflicts(calculations)
removed = remove_conflicts(conflicts)
self.assertEqual(3, total_calculations - len(removed))
def gen_reusable_calculations(self):
"""
input:
4
9 5
8 5
2 3
15 5
1 1
:return:
"""
calculations = list()
calculations.append(Calculation(0, 9, 5))
calculations.append(Calculation(1, 8, 5))
calculations.append(Calculation(2, 2, 3))
calculations.append(Calculation(3, 15, 5))
calculations.append(Calculation(4, 1, 1))
calculations.append(Calculation(5, 14, 17))
return calculations
def test_find_conflicts(self):
calculations = self.gen_reusable_calculations()
conflicts = find_conflicts(calculations)
self.assertEqual([(calculations[0], calculations[1]), (calculations[3], calculations[5])], conflicts)
def test_remove_conflicts(self):
calculations = self.gen_reusable_calculations()
conflicts = find_conflicts(calculations)
removed = remove_conflicts(conflicts)
self.assertEqual({1, 5}, removed)
2 Answers 2
Your conflicts_with
in the Calculation
class could be more Pythonic. First of all, there's no real reason to set a temporary requested_days
value, when you can just iterate over it directly:
def conflicts_with(self, iterable):
"""
An attempt to reduce iterations for cross checking two Calculation dates
"""
for x in iterable:
if x in self.requested_days:
return True
return False
Testing for value in iterable
is quite slow for a list. Sets are faster at running that test. It might be quicker to create a set like this:
requested = set(self.requested_days)
for x in iterable:
if x in requested:
But that's highly dependent on the data, it would be best to test if this works any better for you, and then apply it if it helps.
Secondly, since you're iterating over a list just to get a boolean result from a simple test. You have a perfect opportunity to use the any
function. It takes what's known as a generator expression. Generators are like a for loop collapsed into a single expression. In your case, it's easy to collapse down like this:
any(item in self.requested_days for item in iterable)
This returns a boolean that's True
if item in self.requested_days
is True
for any
of the values of item for item in iterable
. This matches your original case exactly, and is optimised to run faster than constructing a loop in Python.
Calling list()
in find_conflicts
to create an empty list is strange syntax, people usually just use []
. If you wanted to match your set()
call, then it's fine but just be aware that you don't need to do it for list
although you do for set
since {}
will default to an empty dictionary, not a set.
-
\$\begingroup\$ My concern with not doing a temp variable is that it would have to go recalculate and rebuild the list every iteration. Is that not true? Is there some caching going on behind the scenes? I originally had it as an any with a generator but thought that reducing the iterations this way would be faster, I will need to test and see which is. \$\endgroup\$Jared Mackey– Jared Mackey2016年01月18日 15:00:46 +00:00Commented Jan 18, 2016 at 15:00
-
\$\begingroup\$ Also, you were totally right about using
list()
I just wanted it to matchset()
lol \$\endgroup\$Jared Mackey– Jared Mackey2016年01月18日 15:04:10 +00:00Commented Jan 18, 2016 at 15:04 -
\$\begingroup\$ Ah, no it will only calculate once before iterating. Otherwise if the length of the result changed between iterations Python would be confused. What do you mean by reducing iterations? If you meant returning as soon as you hit
return True
, thenany
does that too. It's called short circuiting, and I should have noted it. \$\endgroup\$SuperBiasedMan– SuperBiasedMan2016年01月18日 15:08:02 +00:00Commented Jan 18, 2016 at 15:08 -
\$\begingroup\$ That is exactly what I meant by reducing iterations. Didn't know there was a name for it. Hehe. \$\endgroup\$Jared Mackey– Jared Mackey2016年01月18日 15:23:14 +00:00Commented Jan 18, 2016 at 15:23
-
\$\begingroup\$ My other massive concern is simply with the design of the algorithm. The double
for
loop over the same data plus anotherfor
loop over the conflicts is A TON of iterations. Is there a way to reduce this or optimize it more? I was thinking instead of the doublefor
loop I could useitertools.combinations
then just iterate over each and see if they conflict. I am sure it is doing essentially the same thing but mine is more of a permutation than combination. \$\endgroup\$Jared Mackey– Jared Mackey2016年01月18日 15:26:27 +00:00Commented Jan 18, 2016 at 15:26
Repetition
The docstring and the code contain the same information.
"""
input:
4
9 5
8 5
2 3
15 5
1 1
:return:
"""
calculations = list()
calculations.append(Calculation(0, 9, 5))
calculations.append(Calculation(1, 8, 5))
calculations.append(Calculation(2, 2, 3))
calculations.append(Calculation(3, 15, 5))
calculations.append(Calculation(4, 1, 1))
calculations.append(Calculation(5, 14, 17))
Either remove the docstring or the code (and make the docstring runnable).
I also suggest a list comprehension to avoid repeating calculations.append(Calculation
Explore related questions
See similar questions with these tags.