Basic Scheduler, codingame 'Super Computer' Challenge

Question 1

I have built a program that handles the basic scheduling requests for CodinGame's Super Computer challenge.

Specifications (taken from CodinGame)

The Goal

In the Computer2000 data center, you are responsible for planning the usage of a supercomputer for scientists. Therefore you've decided to organize things a bit by planning everybody’s tasks. The logic is simple: the higher the number of calculations which can be performed, the more people you can satisfy.

Rules

Scientists give you the starting day of their calculation and the number of consecutive days they need to reserve the calculator.

Example

Calculation Starting Day Duration
A 2 5
B 9 7
C 15 6
D 9 3 
Calculation A starts on day 2 and ends on day 6
Calculation B starts on day 9 and ends on day 15
Calculation C starts on day 15 and ends on day 20
Calculation D starts on day 9 and ends on day 11

In this example, it’s not possible to carry out all the calculations because the periods for B and C overlap. 3 calculations maximum can be carried out: A, D and C.

Input

Line 1: The number N of calculations.

The N following lines: on each line, the starting day J and the duration D of reservation, separated by a blank space.

Output

The maximum number of calculations that can be carried out.

Example

Input:

Output: 3

Thing's I am looking for in this review

Optimizations (memory/speed)
Bugs

A couple more things

Which calculations removed (duped out) is irrelevant, I only care about # possible calculations
Regenerating of the list for requested_days may be slow, but having that list stored on the object becomes a memory issue.
The tests written really weren't meant to be comprehensive, just something to get by for big mistakes. My real testing was done on CodinGame.
This code doesn't entirely pass, it says I need more optimizations to pass it's basic test cases. Plus I think there are a couple more that don't pass upon submitting. I will need to figure those out later once I figure out optimizing it more for speed.
sys.stderr is a CodinGame thing to allow debugging statements without it counting as an answer.

Code

import sys
class Calculation(object):
 def __init__(self, id, start_day, running_days):
 self.id = id
 self.start_day = start_day
 self.running_days = running_days
 self.conflicts = 0
 @property
 def requested_days(self):
 return list(range(self.start_day, self.start_day + self.running_days))
 def __repr__(self):
 return "ID: {}, Conflicts {}, Requested Days {}".format(self.id, self.conflicts, self.requested_days)
 def conflicts_with(self, iterable):
 """
 An attempt to reduce iterations for cross checking two Calculation dates
 """
 requested_days = self.requested_days
 for x in iterable:
 if x in requested_days:
 return True
 return False
def find_conflicts(calculations):
 conflicts = list()
 conflict_ids = set()
 for calc in calculations:
 for calc_2 in calculations:
 if calc.id != calc_2.id:
 if calc.conflicts_with(calc_2.requested_days):
 calc.conflicts += 1
 # To reduce iterations keeping track of ids and only adding the one conflict to the conflicts.
 # If either are already in there already one will be removed solving the conflict.
 # Since the end result is not which calculations can run but how many this works perfectly.
 if calc_2.id not in conflict_ids and calc.id not in conflict_ids:
 conflicts.append((calc, calc_2))
 conflict_ids.add(calc.id)
 conflict_ids.add(calc_2.id)
 return conflicts
def remove_conflicts(conflicts):
 removed = set()
 for conflict in conflicts:
 calc_1_removed = conflict[0].id in removed
 calc_2_removed = conflict[1].id in removed
 print("Checking conflict between {} and {}".format(conflict[0], conflict[1]), file=sys.stderr)
 if not calc_1_removed and not calc_2_removed:
 if conflict[0].conflicts == conflict[1].conflicts:
 # Removing the one that was checked later because it may have another conflict and that conflict may
 # have less causing both to removed.
 remove_id = max([conflict[0], conflict[1]], key=lambda x: x.id).id
 else:
 remove_id = max([conflict[0], conflict[1]], key=lambda x: x.conflicts).id
 removed.add(remove_id)
 print("Removing {}".format(remove_id), file=sys.stderr)
 return removed
def main():
 calculations = list()
 total_calculations = int(input())
 for i in range(total_calculations):
 start, duration = [int(j) for j in input().split()]
 calculations.append(Calculation(i, start, duration))
 conflicts = find_conflicts(calculations)
 removed = remove_conflicts(conflicts)
 print("Removed {}".format(removed), file=sys.stderr)
 print(total_calculations - len(removed))
if __name__ == '__main__':
 main()
# Tests
# import down here for easy of copying into program
from unittest import TestCase
class ScheduleTester(TestCase):
 """
 Basic tests to verify that the conflict generation is working correctly
 """
 def test_main(self):
 """
 Input:
 4
 2 5
 9 7
 15 6
 9 3
 :return: 3
 """
 calculations = list()
 total_calculations = 4
 calculations.append(Calculation(0, 2, 5))
 calculations.append(Calculation(1, 9, 7))
 calculations.append(Calculation(2, 15, 6))
 calculations.append(Calculation(3, 9, 3))
 conflicts = find_conflicts(calculations)
 removed = remove_conflicts(conflicts)
 self.assertEqual(3, total_calculations - len(removed))
 def gen_reusable_calculations(self):
 """
 input:
 4
 9 5
 8 5
 2 3
 15 5
 1 1
 :return:
 """
 calculations = list()
 calculations.append(Calculation(0, 9, 5))
 calculations.append(Calculation(1, 8, 5))
 calculations.append(Calculation(2, 2, 3))
 calculations.append(Calculation(3, 15, 5))
 calculations.append(Calculation(4, 1, 1))
 calculations.append(Calculation(5, 14, 17))
 return calculations
 def test_find_conflicts(self):
 calculations = self.gen_reusable_calculations()
 conflicts = find_conflicts(calculations)
 self.assertEqual([(calculations[0], calculations[1]), (calculations[3], calculations[5])], conflicts)
 def test_remove_conflicts(self):
 calculations = self.gen_reusable_calculations()
 conflicts = find_conflicts(calculations)
 removed = remove_conflicts(conflicts)
 self.assertEqual({1, 5}, removed)

Question 2

Your conflicts_with in the Calculation class could be more Pythonic. First of all, there's no real reason to set a temporary requested_days value, when you can just iterate over it directly:

def conflicts_with(self, iterable):
 """
 An attempt to reduce iterations for cross checking two Calculation dates
 """
 for x in iterable:
 if x in self.requested_days:
 return True
 return False

Testing for value in iterable is quite slow for a list. Sets are faster at running that test. It might be quicker to create a set like this:

 requested = set(self.requested_days)
 for x in iterable:
 if x in requested:

But that's highly dependent on the data, it would be best to test if this works any better for you, and then apply it if it helps.

Secondly, since you're iterating over a list just to get a boolean result from a simple test. You have a perfect opportunity to use the any function. It takes what's known as a generator expression. Generators are like a for loop collapsed into a single expression. In your case, it's easy to collapse down like this:

 any(item in self.requested_days for item in iterable)

This returns a boolean that's True if item in self.requested_days is True for any of the values of item for item in iterable. This matches your original case exactly, and is optimised to run faster than constructing a loop in Python.

Calling list() in find_conflicts to create an empty list is strange syntax, people usually just use []. If you wanted to match your set() call, then it's fine but just be aware that you don't need to do it for list although you do for set since {} will default to an empty dictionary, not a set.

Question 3

My concern with not doing a temp variable is that it would have to go recalculate and rebuild the list every iteration. Is that not true? Is there some caching going on behind the scenes? I originally had it as an any with a generator but thought that reducing the iterations this way would be faster, I will need to test and see which is.

Question 4

Also, you were totally right about using list() I just wanted it to match set() lol

Question 5

Ah, no it will only calculate once before iterating. Otherwise if the length of the result changed between iterations Python would be confused. What do you mean by reducing iterations? If you meant returning as soon as you hit return True, then any does that too. It's called short circuiting, and I should have noted it.

Question 6

That is exactly what I meant by reducing iterations. Didn't know there was a name for it. Hehe.

Question 7

My other massive concern is simply with the design of the algorithm. The double for loop over the same data plus another for loop over the conflicts is A TON of iterations. Is there a way to reduce this or optimize it more? I was thinking instead of the double for loop I could use itertools.combinations then just iterate over each and see if they conflict. I am sure it is doing essentially the same thing but mine is more of a permutation than combination.

Question 8

Repetition

The docstring and the code contain the same information.

 """
 input:
 4
 9 5
 8 5
 2 3
 15 5
 1 1
 :return:
 """
 calculations = list()
 calculations.append(Calculation(0, 9, 5))
 calculations.append(Calculation(1, 8, 5))
 calculations.append(Calculation(2, 2, 3))
 calculations.append(Calculation(3, 15, 5))
 calculations.append(Calculation(4, 1, 1))
 calculations.append(Calculation(5, 14, 17))

Either remove the docstring or the code (and make the docstring runnable).

I also suggest a list comprehension to avoid repeating calculations.append(Calculation

score 2 · Accepted Answer · 2016-01-18 10:18:59Z

Your conflicts_with in the Calculation class could be more Pythonic. First of all, there's no real reason to set a temporary requested_days value, when you can just iterate over it directly:

def conflicts_with(self, iterable):
 """
 An attempt to reduce iterations for cross checking two Calculation dates
 """
 for x in iterable:
 if x in self.requested_days:
 return True
 return False

Testing for value in iterable is quite slow for a list. Sets are faster at running that test. It might be quicker to create a set like this:

 requested = set(self.requested_days)
 for x in iterable:
 if x in requested:

But that's highly dependent on the data, it would be best to test if this works any better for you, and then apply it if it helps.

Secondly, since you're iterating over a list just to get a boolean result from a simple test. You have a perfect opportunity to use the any function. It takes what's known as a generator expression. Generators are like a for loop collapsed into a single expression. In your case, it's easy to collapse down like this:

 any(item in self.requested_days for item in iterable)

This returns a boolean that's True if item in self.requested_days is True for any of the values of item for item in iterable. This matches your original case exactly, and is optimised to run faster than constructing a loop in Python.

Calling list() in find_conflicts to create an empty list is strange syntax, people usually just use []. If you wanted to match your set() call, then it's fine but just be aware that you don't need to do it for list although you do for set since {} will default to an empty dictionary, not a set.

My concern with not doing a temp variable is that it would have to go recalculate and rebuild the list every iteration. Is that not true? Is there some caching going on behind the scenes? I originally had it as an any with a generator but thought that reducing the iterations this way would be faster, I will need to test and see which is.
Also, you were totally right about using list() I just wanted it to match set() lol
Ah, no it will only calculate once before iterating. Otherwise if the length of the result changed between iterations Python would be confused. What do you mean by reducing iterations? If you meant returning as soon as you hit return True, then any does that too. It's called short circuiting, and I should have noted it.
That is exactly what I meant by reducing iterations. Didn't know there was a name for it. Hehe.
My other massive concern is simply with the design of the algorithm. The double for loop over the same data plus another for loop over the conflicts is A TON of iterations. Is there a way to reduce this or optimize it more? I was thinking instead of the double for loop I could use itertools.combinations then just iterate over each and see if they conflict. I am sure it is doing essentially the same thing but mine is more of a permutation than combination.

Stack Exchange Network

Basic Scheduler, codingame 'Super Computer' Challenge

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Basic Scheduler, codingame 'Super Computer' Challenge

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions