Assigning Tasks to Entities (or some other entity) based on Priority

Question 1

Main Questions:

Is my runtime analysis correct?
Is my approach optimal or close to it?
Is there a better way and/or improvements?

Objective:

Write an algorithm to assign n jobs to k entities (servers, cpus, whatever) in a round robin fashion. The job that takes longest will be assign first, the shortest last. In the case of a tie, the job with lowest id will be assigned first.
return the job ids assigned to each entity, where an entity can be represented as a list

sample input:

jobs= [40, 80,40, 4, 2]
entities = 2
where jobs[i] is the length of time the task, and i is its identifier.

sample output:

[[1, 2, 4], [0, 3]]

I came up with two algorithms and wanted to see if my runtime analysis is correct, and if there is a better way to approach this

Implementation 1: Linear Search

Pseudocode:

linear search for the longest job, from left to right, saving its job id (index)
only update the index if we find a larger job in list
once we find it assign to another list, which will contain the jobs in order
iterate over the ordered list, assigning to each server in round robin fashion

Hypothesis for runtime:

time proportional to n ^ 2, where n are the number of jobs

Code


def findMax(jobs):
 ind_of_max = 0
 ### iterate over jobs
 for k in range(len(jobs)):
 # if current job we are processing is greater than the previous lomgest job
 if jobs[k] > jobs[ind_of_max]:
 ind_of_max = k
 # sinonimouns to removing the job from the PQ
 jobs[ind_of_max] = -1
 return ind_of_max
def determineJobOrder(jobs):
 
 # list of jobIds
 order_of_jobs = []
 num_of_jobs = len(jobs)
 
 #### TOTAL RUNTIME ANALYSIS:
 # - runs in time proportional to N ^ 2
 ####
 # runs in time proportional to N
 while num_of_jobs > 0:
 # removing jobs from list (aka performing dequeue operations until we run out of jobs)
 # runs in time proportional to N
 ind_of_max = findMax(jobs)
 
 order_of_jobs.append(ind_of_max) 
 num_of_jobs -= 1
 
 return order_of_jobs
def assignJobsToServers(servers, jobs):
 
 jobs = determineJobOrder(jobs)
 
 # create a list of lists, where the number of inner lists = number of servers
 assignment = [[] for i in range(0, servers)]
 
 
 num_jobs = len(jobs)
 job_ind = 0
 server_ind = 0
 
 ###
 ### RUNTIME ANALYSIS:
 # - runs in time proportional to n
 #while there are jobs left to assign
 while num_jobs > 0:
 assignment[server_ind % servers].append(jobs[job_ind])
 # update server index, job index, and number of jobs left to assign
 num_jobs -= 1
 job_ind += 1
 server_ind += 1
 return assignment

Implementation 2: Using Heap

Pseudocode:

Iterate over all jobs, creating a Job object, Job(job_id, time)
Since python's heapq class is a min heap, I use heapq.nlargest, passing a key for comparison, the list of Jobs

Hypothesis of runtime:

time proportional to n log n, where n are the number of jobs

Code:


## To enncapsulate the jobs and be able to compare by multiple attributes
class Job:
 def __init__(self, job_id, time):
 self.job_id = job_id
 self.time = time
 def __repr__(self):
 return 'Job(id: {}, time: {})'.format(self.job_id, self.time)
import heapq
def assignJobsToServers(servers, jobs):
 
 for k, exec_time in enumerate(jobs):
 jobs[k] = Job(k, exec_time)
 
 # resource used to figure out how to sort by two attributes, one in natural order one in reverse:
 # https://www.techiedelight.com/sort-list-of-objects-by-multiple-attributes-python/
 
 #### Runtime Analysis:
 # - I am assuming that what this will do is build the max heap, and then remove the n items to get them in max order.
 # - Inserts into heap take log (n).
 # - removals take log(n)
 # - if we are enqueueing n items, this will take n log n
 # - if we are removing n items, this will take n log n for removals
 # - hence: runtime should be proportional n log n, ignoring constants
 ####
 # key defines that that the larger the time the larger the, and on a tie, the smallest the job id the largest the item 
 jobs_ordered = heapq.nlargest(len(jobs), jobs, key = lambda job: (job.time, -job.job_id))
 # create a list of lists, where the number of inner lists = number of servers
 assignments = [[] for i in range(0, servers)]
 
 
 num_jobs = len(jobs)
 server_ind = 0
 job_ind = 0
 
 ###### RUNTIME ANALYSIS:
 # - at this point we already did all the difficult computation. We now just iterate over all jobs which will be in order and assign
 # - this takes time proportional to n, the number of jobs
 while num_jobs > 0:
 
 assignments[server_ind % servers].append(jobs_ordered[job_ind].job_id)
 
 # update server index, and number of jobs left to assign
 num_jobs -= 1
 server_ind += 1
 job_ind += 1
 return assignments

Question 2

Your runtime analysis seems fine and your heap-based approach is optimal in the narrow sense that it achieves O(n log n) performance. However, the software itself is needlessly complex.

The path to simplicity is found by remembering the most famous O(n log n) operation: sorting. You have a list of job durations that you want to sort from largest to smallest, breaking ties via their ID/index. There are a variety of reasonable ways to achieve that. Here's one illustration:

# Some job durations and a generator to produce (DURATION, ID) pairs.
durations = [40, 80, 40, 4, 2]
pairs = enumerate(durations)
# A function to take a pair and return a modified pair for sorting purposes.
sort_key = lambda pair: (-pair[1], pair[0])
# Sort the pairs and extract job IDs from them.
job_ids = [jid for jid, dur in sorted(pairs, key = sort_key)]
# Allocate the job IDs to N workers/entities in round-robin fashion.
n = 2
allocated_job_ids = [job_ids[i::n] for i in range(n)]
print(allocated_job_ids) # [[1, 2, 4], [0, 3]]

Question 3

got it. not sure why that did not come to mind but that would definitely be simpler in this case, Thank you. Would a PQ implemented with Heap be preferred in the case our list of jobs is not static , meaning, let's say, two jobs can be added to the initial list, then 1 remove, followed by more additions and removals interleaved? it seems like in that case we couldn't achieve n log n for both removals and additions. What do you think?

Question 4

@MPC Correct. The superpower of a heap is its log-n ability to maintain sorted order in the face of inserts/removals.

FMc FMc 13.1k2 gold badges29 silver badges40 bronze badges · Answer 1 · 2021-11-14 20:50:52Z

Your runtime analysis seems fine and your heap-based approach is optimal in the narrow sense that it achieves O(n log n) performance. However, the software itself is needlessly complex.

The path to simplicity is found by remembering the most famous O(n log n) operation: sorting. You have a list of job durations that you want to sort from largest to smallest, breaking ties via their ID/index. There are a variety of reasonable ways to achieve that. Here's one illustration:

# Some job durations and a generator to produce (DURATION, ID) pairs.
durations = [40, 80, 40, 4, 2]
pairs = enumerate(durations)
# A function to take a pair and return a modified pair for sorting purposes.
sort_key = lambda pair: (-pair[1], pair[0])
# Sort the pairs and extract job IDs from them.
job_ids = [jid for jid, dur in sorted(pairs, key = sort_key)]
# Allocate the job IDs to N workers/entities in round-robin fashion.
n = 2
allocated_job_ids = [job_ids[i::n] for i in range(n)]
print(allocated_job_ids) # [[1, 2, 4], [0, 3]]

got it. not sure why that did not come to mind but that would definitely be simpler in this case, Thank you. Would a PQ implemented with Heap be preferred in the case our list of jobs is not static , meaning, let's say, two jobs can be added to the initial list, then 1 remove, followed by more additions and removals interleaved? it seems like in that case we couldn't achieve n log n for both removals and additions. What do you think?
@MPC Correct. The superpower of a heap is its log-n ability to maintain sorted order in the face of inserts/removals.

Stack Exchange Network

Assigning Tasks to Entities (or some other entity) based on Priority

Implementation 1: Linear Search

Implementation 2: Using Heap

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Assigning Tasks to Entities (or some other entity) based on Priority

Implementation 1: Linear Search

Implementation 2: Using Heap

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions