\$\begingroup\$

I have a multilayered dictionary that contains information about classes. I am using this to code an automatic schedule builder that I will eventually add to a separate Tkinter application I made that contains similar programs.

I spent a lot of time streamlining the code and improving its performance. When I started writing it, loading a few sections took over an hour; now I can load 5 whole classes in a little under 5 minutes. That was a relief, since a scheduler that takes hours for a few sections is not useful at all. The main reason it still takes this long is image generation: saving an image with PIL takes about 0.7 seconds per save, and the process I use to build the image, which used to take 10 seconds, now takes about 0.5 seconds. The whole process comes to about 1.4 seconds per image, which really isn't that bad, but making 100 images still takes an uncomfortably long time. I can't think of any way to lower this time any further.
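As a rough sketch of that arithmetic (Python 3 syntax; the function name `estimate_render_time` is hypothetical and not part of the program), the total rendering time can be estimated from the per-image figure quoted above:

```python
def estimate_render_time(n_images, seconds_per_image=1.4):
    """Return (minutes, seconds) needed to render n_images.

    The default of 1.4 s per image is the measured figure quoted above;
    it comes from one machine and is an assumption, not a constant.
    """
    total_seconds = n_images * seconds_per_image
    # divmod splits the total into whole minutes and leftover seconds.
    minutes, seconds = divmod(total_seconds, 60)
    return int(minutes), seconds
```

Calling `estimate_render_time(100)` gives roughly 2 minutes and 20 seconds, which matches the "uncomfortably long" feeling for 100 images.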

The area that takes the most time (not including the image saving, which, again, takes 0.7 seconds per image):

I used multiprocessing and itertools to speed up the process, cutting the time needed to find overlaps from an hour to around 30 seconds, but it is still a bit slow. When working with a lot of possibilities, usually in the hundreds of thousands, the following code takes 20-40 seconds to run on a good day. Is there any way to make this run faster or more efficiently? (Side note: I have found that the more frequently I run the program, the quicker it gets through this code, for example from 35 seconds down to 24 seconds. Could this have something to do with the cores being more malleable the more frequently they are used?)
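As a standalone illustration of the split, merge, and de-duplicate pattern used in the code that follows, here is a minimal sketch in Python 3 syntax. It deliberately leaves out the `pool.map` step so that it runs without worker processes, and `merge_and_dedupe` is a hypothetical name that does not appear in the program.

```python
import itertools


def chunkify(items, n):
    """Split items into n interleaved slices (round-robin)."""
    return [items[i::n] for i in range(n)]


def merge_and_dedupe(chunks):
    """Concatenate per-chunk results, sort them, and drop duplicates."""
    merged = list(itertools.chain.from_iterable(chunks))
    merged.sort()
    # groupby yields one key per run of equal neighbours, so after
    # sorting each distinct item is kept exactly once.
    return [key for key, _ in itertools.groupby(merged)]
```

For example, `chunkify([3, 1, 2, 3, 1], 2)` produces two slices, and passing those slices to `merge_and_dedupe` returns the distinct values in sorted order. The sort is what makes `groupby` usable for de-duplication, since it only collapses adjacent equal items.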

 cores = mp.cpu_count()
 splitSchedules = chunkify(PossibleSchedules, cores)
 pool = mp.Pool(processes=cores)
 result = pool.map(removeOverlaps, splitSchedules)
 TruePossibleSchedules = []
 for x in range(cores):
     TruePossibleSchedules = TruePossibleSchedules + result[x]
 TruePossibleSchedules.sort()
 sortedTruePossibleSchedules = list(TruePossibleSchedules for TruePossibleSchedules,_ in itertools.groupby(TruePossibleSchedules))

 def chunkify(lst,n):
     return [ lst[i::n] for i in xrange(n) ]

 def removeOverlaps(PossibleSchedules):
     first = False
     if PossibleSchedules[-1] == "First":
         cores = mp.cpu_count()
         print "Commandeering your %s cores..."%(cores)
         del PossibleSchedules[-1]
         first = True
     TruePossibleSchedules = []
     if first:
         for schedule in range(0, len(PossibleSchedules)):
             overlapping = [[s,e] for s in PossibleSchedules[schedule] for x in s for e in PossibleSchedules[schedule] for y in e if s is not e and x[2]==y[2] and (int(x[0])<=int(y[1]) and int(x[1])>=int(y[0]))]
             good = True
             if overlapping:
                 good = False
             if good:
                 TruePossibleSchedules.append(PossibleSchedules[schedule])
             sys.stdout.write("\rCalculating real schedules: " + str( float("{0:.2f}".format(( float(schedule+1)/float(len(PossibleSchedules))) *100) )) + "% ")
             sys.stdout.flush()
         sys.stdout.write("\rThanks for letting me borrow those ")
         sys.stdout.flush()
     else:
         for schedule in range(0, len(PossibleSchedules)):
             overlapping = [[s,e] for s in PossibleSchedules[schedule] for x in s for e in PossibleSchedules[schedule] for y in e if s is not e and x[2]==y[2] and (int(x[0])<=int(y[1]) and int(x[1])>=int(y[0]))]
             good = True
             if overlapping:
                 good = False
             if good:
                 TruePossibleSchedules.append(PossibleSchedules[schedule])
     return TruePossibleSchedules
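
For illustration, the overlap test inside `removeOverlaps` can be written as a standalone check that stops at the first clash instead of building the full list of overlapping pairs. This is a minimal sketch in Python 3 syntax, not a measured improvement. The names `meetings_overlap`, `has_overlap`, and `remove_overlaps` are hypothetical, and it assumes each meeting entry has the form `[starttime, endtime, day, title, section]` and that every section in a schedule is a distinct list object.

```python
from itertools import combinations


def meetings_overlap(a, b):
    """True if two [start, end, day, ...] entries clash on the same day."""
    return a[2] == b[2] and int(a[0]) <= int(b[1]) and int(a[1]) >= int(b[0])


def has_overlap(schedule):
    """True as soon as any two different sections in the schedule clash."""
    # combinations() visits each unordered pair of sections once (the
    # overlap test is symmetric), and any() stops at the first clash.
    return any(
        meetings_overlap(x, y)
        for s, e in combinations(schedule, 2)
        for x in s
        for y in e
    )


def remove_overlaps(schedules):
    """Keep only the schedules in which no two sections clash."""
    return [schedule for schedule in schedules if not has_overlap(schedule)]
```

With made-up data, a schedule holding a Monday 0900-0950 section and a Monday 0930-1020 section is rejected, while the same pair on different days is kept. Whether this is actually faster on hundreds of thousands of schedules would need to be timed on real data.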

Full code:

Referenced picture used: Schedule Grid.png

Database used to pull information

# coding: utf-8
'''
Created on Jul 31, 2017
@author: Jake
This is a bit sloppy and unorganized, I am still working on it and it is not going to be stand alone, it will be put into a Tkinter application I made.
'''
from bs4 import BeautifulSoup
from HTMLParser import HTMLParser
import urllib
import shlex
import re
import time
from PIL import Image, ImageDraw, ImageFont
import itertools
import os
import shutil
import colorsys
import copy
import random
import multiprocessing as mp
import sys
 class Vars():
     global vari
     vari = {}
     def GetVars(self, var):
         return vari.get(str(var))
     def SendVars(self, var, val):
         vari[str(var)] = val
runStart = time.time()
# Fluid Dynamics is a little ducked, it has multiple different section numbers, so if you want CHE, CE, or ME Fluid Mechanics then do not add it as this will not produce a correct result.
designators = {
 "CC": "Co-Req required ",
 "CS": "Freshman quiz/ Next Class ",
 "CA": "Activity needed ",
 "RQ": "Pre-Req required ",
 "R&": "Pre-Req required ",
 "RQM": "Pre-Req course reqd w/ min grade ",
 "RM&": "(cont.) Pre-Req reqd w/ min grade ",
 "RQT": "Pre-Req test required ",
 "RT&": "(cont.) Pre-Req test required ",
 "NQ": "Pre-Req course required ",
 "N&": "Pre-Req course required ",
 "NQM": "Concur Pre-Req reqd w/ min grade ",
 "NM&": "(cont.) Concur Pre-Req w/ min grade ",
 "MB": "By Application Only ",
 "MP": "Pre-Req Required ",
 "MC": "Co-Req Required ",
 "ML": "Lab Fee Required ",
 "MA": "Permission of Advisor Required ",
 "MI": "Permission of Instructor Required ",
 "MH": "Department Head Approval Required ",
 "MN": "No Credit Course for Departmental Majors ",
 "MS": "Studio course; No general Humanities credit ",
 "PAU": "Auditors need instructor permission ",
 "PCG": "Permission needed from Continuing ED ",
 "PDP": "Permission needed from department ",
 "PIN": "Permission needed from instructor ",
 "PUN": "Undergrads need instructor permission ",
 "PUA": "UGs need permission of Dean of UG Academics ", 
 "LEC": "lecture",
 "L/L": "lecture/lab",
 "LAB": "laboratory",
 "PSI": "personalized self-paced instruction",
 "QUZ": "quiz",
 "RCT": "recitation",
 "SEM": "seminar",
 "PRA": "practicum",
 "HSG": "housing (dorm)",
 "MCE": "Multiple Course Entry base course",
 "WSP": "Work Shop"
}
 if os.path.exists((os.path.dirname(os.path.realpath(__file__)) + "/Schedules")):
     shutil.rmtree((os.path.dirname(os.path.realpath(__file__)) + "/Schedules"))
 if not os.path.exists((os.path.dirname(os.path.realpath(__file__)) + "/Schedules")):
     os.makedirs(os.path.dirname(os.path.realpath(__file__)) + "/Schedules")
ScheduleGrid = Image.open('Schedule Grid.png').convert('RGBA')
ClassBlocks = Image.new('RGBA', ScheduleGrid.size, (255,255,255,0))
out = Image.alpha_composite(ScheduleGrid, ClassBlocks)
out.save("Schedule.png")
h = HTMLParser()
page = urllib.urlopen('https://web.stevens.edu/scheduler/core/2017F/2017F.xml').read() # Get to database
soup = BeautifulSoup(page, "lxml")
 while True:
     try:
         RawClassData = soup.contents[10].contents[0].contents[0].contents
         break
     except:
         print 'Trying again'
classes = {}
backupClasses = {}
selectedClasses = {}
var = Vars()
var.SendVars("color", 30)
def makeDatabase():
 for i in range(0, len(RawClassData)): # Parse through each class
 sys.stdout.write("\rLoading classes: " + str( float("{0:.2f}".format(( float(i)/float(len(RawClassData))) *100) )) + "% ")
 sys.stdout.flush()
 try:
 ClassDict = {}
 MeetingsDict = {}
 RequirementsDict = {}
 #For meetings
 numMeetings = str(RawClassData[i]).split().count("<meeting")
 seper = str(RawClassData[i]).split("meeting") # Split string by meeting to get subject name and value
 try:
 for line in range(0, len(seper)):
 if seper[line] == ">\n<":
 del seper[line]
 except:
 pass
 for x in range(0, numMeetings):
 subMeetingsDict = {}
 MeetingInfo = shlex.split(h.unescape(str(seper[x+1]).replace(">", " "))) # sort into a list grouping string in quotes and getting rid of unnecessary symbols 
 for item in MeetingInfo: # Go through list of meeting info
 try:
 thing = item.split("=") # Split string by = to get subject name and value
 name = thing[0]
 if any(char.isdigit() for char in thing[1]): # Get rid of annoying Z at the end of numbers
 for char in thing[1]:
 if "-" == char:
 thing[1] = re.sub("[Z]","",thing[1])
 break
 value = re.sub(' +',' ', thing[1])
 if value: # If subject has a value, store it
 try:
 subMeetingsDict[str(name)] = str(designators[str(value)]) # Store value converted to designator in a dictionary with the subject as the key
 except KeyError:
 subMeetingsDict[str(name)] = str(value) # Store value in a dictionary with the subject as the key
 except:
 pass
 MeetingsDict["meeting" + str(x)] = subMeetingsDict
 ClassDict["meetings"] = MeetingsDict
 #For requirements
 numRequirements = str(RawClassData[i]).split().count("<requirement")
 seper = str(RawClassData[i]).split("requirement") # Split string by requirements to get subject name and value
 try:
 for line in range(0, len(seper) - 1):
 if seper[line] == ">\n<":
 del seper[line]
 except:
 pass
 for x in range(0, numRequirements):
 subRequirementsDict = {}
 RequirementsInfo = shlex.split(h.unescape(str(seper[-2 - x]).replace(">", " "))) # sort into a list grouping string in quotes and getting rid of unnecessary symbols 
 for item in RequirementsInfo: # Go through list of meeting info
 try:
 thing = item.split("=") # Split string by = to get subject name and value
 name = thing[0]
 if any(char.isdigit() for char in thing[1]): # Get rid of annoying Z at the end of numbers
 for char in thing[1]:
 if "-" == char:
 thing[1] = re.sub("[Z]","",thing[1])
 break
 value = re.sub(' +',' ', thing[1])
 if value: # If subject has a value, store it
 try:
 subRequirementsDict[str(name)] = str(designators[str(value)]) # Store value converted to designator in a dictionary with the subject as the key
 except KeyError:
 subRequirementsDict[str(name)] = str(value) # Store value in a dictionary with the subject as the key
 except:
 pass
 RequirementsDict["requirement" + str(x)] = subRequirementsDict
 ClassDict["requirements"] = RequirementsDict
 AllCourseInfo = shlex.split(h.unescape(str(RawClassData[i]).replace(">", " "))) # sort into a list grouping string in quotes and getting rid of unnecessary symbols 
 for item in AllCourseInfo: # Go through list of class info
 try:
 thing = item.split("=") # Split string by = to get subject name and value
 name = thing[0]
 if any(char.isdigit() for char in thing[1]): # Get rid of annoying Z at the end of numbers
 for char in thing[1]:
 if "-" == char:
 thing[1] = re.sub("[Z]","",thing[1])
 break
 value = re.sub(' +',' ', thing[1])
 if value: # If subject has a value, store it
 try:
 ClassDict[str(name)] = str(designators[str(value)]) # Store value converted to designator in a dictionary with the subject as the key
 except KeyError:
 ClassDict[str(name)] = str(value) # Store value in a dictionary with the subject as the key
 except:
 pass
 classes[str(ClassDict["section"])] = ClassDict
 except Exception:
 #logging.exception("message")
 pass
 sys.stdout.write("\rLoading classes: Done ")
 sys.stdout.flush()
def printDic():
 with open("Classes", "w") as f:
 for key in classes:
 f.write("\n-------------%s------------" %key)
 for classkey in classes[key]:
 f.write( "\n%s : %s" %(classkey, classes[key][classkey]))
 f.write("\n")
 def printSection(selection):
     print "\n-------------%s------------" %selection
     for classkey in classes[selection]:
         print "%s : %s" %(classkey, classes[selection][classkey])
def printClass(selection):
 prntSel = True
 for key in classes:
 if classes[key]["title"] == selection:
 prntSel = False
 print "\n-------------%s------------" %key
 for classkey in classes[key]:
 print "%s : %s" %(classkey, classes[key][classkey])
 if prntSel:
 print "\n-------------%s------------" %selection
 for classkey in classes[selection]:
 print "%s : %s" %(classkey, classes[selection][classkey])
 #Backup classes if section closed
 for key in classes: 
 if (classes[key]["title"] == classes[selection]["title"]) and (classes[key] != classes[selection]):
 print "\n-----Backup--------%s------------" %key
 for classkey in classes[key]:
 print "%s : %s" %(classkey, classes[key][classkey])
 def printSelectedClasses():
     for key in selectedClasses:
         print "\n-------------%s------------" %key
         for classkey in selectedClasses[key]:
             print "%s : %s" %(classkey, selectedClasses[key][classkey])
def pickClass(selection):
 oneSel = True
 classToSort = {}
 var = Vars()
 colorStep = var.GetVars("color")
 for key in classes:
 ClassDict = {}
 if classes[key]["title"] == selection:
 repeat = False
 oneSel = False
 for classkey in classes[key]:
 ClassDict[str(classkey)] = classes[key][classkey]
 for selectedClass in selectedClasses:
 for section in selectedClasses[selectedClass]:
 if ClassDict["activity"] == selectedClasses[selectedClass][section]["activity"] and ClassDict["title"] == selectedClasses[selectedClass][section]["title"]:
 repeat = True
 if repeat == False:
 ClassDict["variable"] = "True"
 h, l, s = colorStep, 50, 100
 r, g, b = colorsys.hls_to_rgb(h/360.0, l/100.0, s/100.0)
 r, g, b = [x*255 for x in r, g, b]
 ClassDict["color"] = int(r),int(g),int(b) # Changing color
 classToSort[str(ClassDict["section"])] = ClassDict #Put selected class in a dictionary
 classes[str(ClassDict["section"])] = ClassDict
 if oneSel:
 classToSort[str(classes[selection]["section"])] = classes[selection] #Put selected section in a dictionary
 classToSort[str(classes[selection]["section"])]["variable"] = "False" #Not changing
 # Add activities
 activityHeads = ["LEC", "PRA", "L/L", "SEM", "PSI", "WSP"]
 for activityType in activityHeads:
 if str(classes[selection]["activity"]) == designators[str(activityType)]:
 Quiz = False
 Activity = False
 for requirement in classes[selection]["requirements"]:
 for requirementInfo in classes[selection]["requirements"][requirement]:
 # Add required activities
 if str(classes[selection]["requirements"][requirement][requirementInfo]) == "Activity needed ":
 Activity = True
 # Add Recitation
 if Activity == True and ("recitation" in str(classes[selection]["requirements"][requirement][requirementInfo])):
 isRecIn = False
 RecDic = {}
 for recitSection in classes:
 if classes[recitSection]["title"] == classes[selection]["title"]:
 if classes[recitSection]["activity"] == "recitation":
 RecDic[str(classes[recitSection]["section"])] = classes[recitSection]
 RecDic[str(classes[recitSection]["section"])]["variable"] = "True" # Changing
 h, l, s = colorStep, 50, 100
 r, g, b = colorsys.hls_to_rgb(h/360.0, l/100.0, s/100.0)
 r, g, b = [x*255 for x in r, g, b]
 RecDic[str(classes[recitSection]["section"])]["color"] = int(r),int(g),int(b) # Changing color
 for selectedClassTitle in selectedClasses:
 for selectedClass in selectedClasses[selectedClassTitle]:
 for selectedRec in RecDic:
 if selectedClasses[selectedClassTitle][selectedClass] == RecDic[selectedRec]:
 isRecIn = True
 if isRecIn == False: # Only adds recitation if a recitation not is already given.
 classToSort.update(RecDic) 
 ''' Add this functionality for when a title is given'''
 # Add Lab
 if Activity == True and ("laboratory" in str(classes[selection]["requirements"][requirement][requirementInfo])):
 isLabIn = False
 LabDic = {}
 for labSection in classes:
 if classes[labSection]["title"] == classes[selection]["title"]:
 if classes[labSection]["activity"] == "laboratory":
 LabDic[str(classes[labSection]["section"])] = classes[labSection]
 LabDic[str(classes[labSection]["section"])]["variable"] = "True" # Changing
 h, l, s = colorStep, 50, 100
 r, g, b = colorsys.hls_to_rgb(h/360.0, l/100.0, s/100.0)
 r, g, b = [x*255 for x in r, g, b]
 LabDic[str(classes[labSection]["section"])]["color"] = int(r),int(g),int(b) # Changing color
 for selectedClassTitle in selectedClasses:
 for selectedClass in selectedClasses[selectedClassTitle]:
 for selectedRec in LabDic:
 if selectedClasses[selectedClassTitle][selectedClass] == LabDic[selectedRec]:
 isLabIn = True
 if isLabIn == False: # Only adds recitation if a recitation not is already given.
 classToSort.update(LabDic) # Add this functionality for when a title is given
 #Backup classes if section closed
 for key in classes: 
 ClassDict = {}
 if (classes[key]["title"] == classes[selection]["title"]) and (classes[key] != classes[selection]):
 for classkey in classes[key]:
 ClassDict[str(classkey)] = classes[key][classkey]
 backupClasses[str(ClassDict["section"])] = ClassDict #Put extra sections with the same title in a dictionary
 if classToSort:
 var.SendVars("color", colorStep + 30)
 activities = ["LEC", "L/L", "LAB", "PSI", "QUZ", "RCT", "SEM", "PRA", "HSG", "MCE", "WSP"]
 activitiesDict = {"LEC": {}, "L/L": {}, "LAB": {}, "PSI": {}, "QUZ": {}, "RCT": {}, "SEM": {}, "PRA": {}, "HSG": {}, "MCE": {}, "WSP": {}}
 for activity in activities:
 for key in classToSort:
 ClassDict = {}
 if classToSort[key]["activity"] == designators[str(activity)]:
 for classkey in classToSort[key]:
 ClassDict[str(classkey)] = classToSort[key][classkey]
 activitiesDict[activity][str(ClassDict["section"])] = ClassDict #Put selected class section in a dictionary
 #"CS": "Freshman quiz/ Next Class "
 #"CA": "Activity needed ", 
 # LEC, PRA, L/L, SEM, PSI, WSP are the only ones that need to look for CS and CA
 activityHeads = ["LEC", "PRA", "L/L", "SEM", "PSI", "WSP"]
 # Build dictionary to add to selectedClasses
 for actClass in activitiesDict:
 if actClass:
 for classSec in activitiesDict[actClass]:
 selectedClasses[ str(activitiesDict[actClass][classSec]["title"]) + " " + str(activitiesDict[actClass][classSec]["activity"])] = activitiesDict[actClass] # Add all activities of each class
 # Add Freshman Quiz's
 for key in activityHeads:
 Quiz = False
 for requirement in activitiesDict[actClass][classSec]["requirements"]:
 for requirementInfo in activitiesDict[actClass][classSec]["requirements"][requirement]:
 if str(activitiesDict[actClass][classSec]["requirements"][requirement][requirementInfo]) == "Freshman quiz/ Next Class ":
 Quiz = True
 if Quiz == True and ("D 110" in str(activitiesDict[actClass][classSec]["requirements"][requirement][requirementInfo])):
 quiz = {}
 quiz[ str(activitiesDict[actClass][classSec]["requirements"][requirement][requirementInfo]) ] = classes[str(activitiesDict[actClass][classSec]["requirements"][requirement][requirementInfo])]
 quiz[ str(activitiesDict[actClass][classSec]["requirements"][requirement][requirementInfo]) ]["variable"] = "False" #Not changing
 selectedClasses[ str(activitiesDict[actClass][classSec]["title"]) + " Quiz " + str(activitiesDict[actClass][classSec]["requirements"][requirement][requirementInfo])[-1] ] = quiz # Add freshman quiz
def CreateScheduleImage(possibleSchedules):
 try:
 startTest = time.time() # Start timeing the test
 scheduleNum = 0
 if len(possibleSchedules) > 3:
 for x in range(2):
 schedule = possibleSchedules[0]
 ScheduleGrid = Image.open('Schedule.png').convert('RGBA')
 ClassBlocks = Image.new('RGBA', ScheduleGrid.size, (255,255,255,0))
 fnt = ImageFont.truetype('Library/Fonts/Tahoma.ttf', 8*2)
 fnt2 = ImageFont.truetype('Library/Fonts/Tahoma.ttf', 7*2)
 d = ImageDraw.Draw(ClassBlocks)
 for section in schedule:
 meetings = schedule[section]["meetings"]
 for meeting in meetings:
 days = schedule[str(section)]["meetings"][str(meeting)]["day"]
 for day in days:
 cltimeS = schedule[section]["meetings"][meeting]["starttime"]
 cltimeF = schedule[section]["meetings"][meeting]["endtime"]
 classStart = (cltimeS.split(":"))
 del classStart[-1]
 starttime = ( (int(classStart[0]) - 8)*60 + int(classStart[1]))/15 *19
 classEnd = (cltimeF.split(":"))
 del classEnd[-1]
 endtime = ( (int(classEnd[0]) - 8)*60 + int(classEnd[1]))/15 *19 - starttime
 if day == "M":
 dayNum = 0
 elif day == "T":
 dayNum = 1
 elif day == "W":
 dayNum = 2
 elif day == "R":
 dayNum = 3
 elif day == "F":
 dayNum = 4
 x1 = 80 + (190 + 1)*dayNum
 y1 = 32 + starttime + (16*19) #Add 4 hours because weird bug
 x2 = x1 + 190
 y2 = y1 + endtime
 BoxPosition = [((x1 +2)*2, (y1 +2)*2), ((x2)*2), ((y2 -1)*2)]
 BoxOutlinePosition1 = [((x1 +1.5)*2, (y1 +1.5)*2), ((x2+0.5)*2), ((y2 - 0.5)*2)]
 BoxOutlinePosition2 = [((x1 +1)*2, (y1 +1)*2), ((x2+1)*2), ((y2)*2)]
 d.rectangle(BoxOutlinePosition2, fill=(90,190,120,0), outline="darkred")
 d.rectangle(BoxOutlinePosition1, fill=(90,190,120,0), outline="grey")
 if schedule[section]["variable"] == "False":
 d.rectangle(BoxPosition, fill=(90,190,120,180), outline="darkred")
 else:
 d.rectangle(BoxPosition, fill=(schedule[section]["color"] + (180,)), outline="darkred")
 d.text([(x1 + 5)*2, (y1 + 1 +9*1)*2], schedule[section]["title"], font=fnt, fill=(0,0,0,255))
 d.text([(x1 + 5)*2, (y1 + 1)*2], schedule[section]["section"], font=fnt, fill=(0,0,0,255))
 d.text([(x1 + 5)*2, (y1 + 1 +9*2)*2], schedule[section]["instructor1"], font=fnt, fill=(0,0,0,255))
 d.text([(x1 + 5)*2, (y1 + 1 +9*3)*2], schedule[section]["callnumber"], font=fnt, fill=(0,0,0,255))
 requirements = schedule[section]["requirements"]
 count = 1
 for requirement in requirements:
 control = str(schedule[section]["requirements"][requirement]["control"])
 values = []
 for x in range(0, str(schedule[section]["requirements"][requirement]).count("value")):
 values.append(str(schedule[section]["requirements"][requirement]["value" + str(x + 1)]))
 if values:
 msg = control + ": " + str(values)
 else:
 msg = control
 width, height = d.textsize(msg)
 y2 = y2 -5
 d.text([(x2)*2 - width-10, (y2 -(height-5)*count)*2], msg, font=fnt2, fill=(200,0,0,255))
 count = count + 0.5
 out = Image.alpha_composite(ScheduleGrid, ClassBlocks)
 out.save((os.path.dirname(os.path.realpath(__file__)) + "/Schedules/Schedule" + str(scheduleNum) + ".png") )
 print "Preparing..."
 scheduleNum = scheduleNum + 1 #
 endTest = time.time() # End timing the test
 if os.path.exists((os.path.dirname(os.path.realpath(__file__)) + "/Schedules")):
 shutil.rmtree((os.path.dirname(os.path.realpath(__file__)) + "/Schedules"))
 if not os.path.exists((os.path.dirname(os.path.realpath(__file__)) + "/Schedules")): 
 os.makedirs(os.path.dirname(os.path.realpath(__file__)) + "/Schedules")
 photoTime = (endTest - startTest)/2
 else:
 photoTime = 1.4
 scheduleNum = 0
 estimate = str( (len(possibleSchedules)*photoTime) / 60).split(".")
 print "\n\nEstimated time to load %s images: %s minutes and %s seconds"%(len(possibleSchedules), int(estimate[0]), float("." + estimate[1])*60 )
 sys.stdout.write("\rTime left " + str( float("{0:.2f}".format((len(possibleSchedules))*photoTime - scheduleNum*photoTime)) ) + " seconds ")
 sys.stdout.flush()
 startPhotos = time.time()
 for schedule in possibleSchedules:
 ScheduleGrid = Image.open('Schedule.png').convert('RGBA')
 ClassBlocks = Image.new('RGBA', ScheduleGrid.size, (255,255,255,0))
 fnt = ImageFont.truetype('Library/Fonts/Tahoma.ttf', 8*2)
 fnt2 = ImageFont.truetype('Library/Fonts/Tahoma.ttf', 7*2)
 d = ImageDraw.Draw(ClassBlocks)
 for section in schedule:
 meetings = schedule[section]["meetings"]
 for meeting in meetings:
 days = schedule[str(section)]["meetings"][str(meeting)]["day"]
 for day in days:
 cltimeS = schedule[section]["meetings"][meeting]["starttime"]
 cltimeF = schedule[section]["meetings"][meeting]["endtime"]
 classStart = (cltimeS.split(":"))
 del classStart[-1]
 starttime = ( (int(classStart[0]) - 8)*60 + int(classStart[1]))/15 *19
 classEnd = (cltimeF.split(":"))
 del classEnd[-1]
 endtime = ( (int(classEnd[0]) - 8)*60 + int(classEnd[1]))/15 *19 - starttime
 if day == "M":
 dayNum = 0
 elif day == "T":
 dayNum = 1
 elif day == "W":
 dayNum = 2
 elif day == "R":
 dayNum = 3
 elif day == "F":
 dayNum = 4
 x1 = 80 + (190 + 1)*dayNum
 y1 = 32 + starttime + (16*19) #Add 4 hours because weird bug
 x2 = x1 + 190
 y2 = y1 + endtime
 BoxPosition = [((x1 +2)*2, (y1 +2)*2), ((x2)*2), ((y2 -1)*2)]
 BoxOutlinePosition1 = [((x1 +1.5)*2, (y1 +1.5)*2), ((x2+0.5)*2), ((y2 - 0.5)*2)]
 BoxOutlinePosition2 = [((x1 +1)*2, (y1 +1)*2), ((x2+1)*2), ((y2)*2)]
 # draw text, half opacity
 d.rectangle(BoxOutlinePosition2, fill=(90,190,120,0), outline="darkred")
 d.rectangle(BoxOutlinePosition1, fill=(90,190,120,0), outline="grey")
 if schedule[section]["variable"] == "False":
 d.rectangle(BoxPosition, fill=(90,190,120,180), outline="darkred")
 else:
 d.rectangle(BoxPosition, fill=(schedule[section]["color"] + (180,)), outline="darkred")
 # draw text, full opacity
 d.text([(x1 + 5)*2, (y1 + 1 +9*1)*2], schedule[section]["title"], font=fnt, fill=(0,0,0,255))
 d.text([(x1 + 5)*2, (y1 + 1)*2], schedule[section]["section"], font=fnt, fill=(0,0,0,255))
 d.text([(x1 + 5)*2, (y1 + 1 +9*2)*2], schedule[section]["instructor1"], font=fnt, fill=(0,0,0,255))
 d.text([(x1 + 5)*2, (y1 + 1 +9*3)*2], schedule[section]["callnumber"], font=fnt, fill=(0,0,0,255))
 #Print out required classes to bottom right corner in red
 requirements = schedule[section]["requirements"]
 count = 1
 for requirement in requirements:
 control = str(schedule[section]["requirements"][requirement]["control"])
 values = []
 for x in range(0, str(schedule[section]["requirements"][requirement]).count("value")):
 values.append(str(schedule[section]["requirements"][requirement]["value" + str(x + 1)]))
 if values:
 msg = control + ": " + str(values)
 else:
 msg = control
 width, height = d.textsize(msg)
 y2 = y2 -5
 d.text([(x2)*2 - width-10, (y2 -(height-5)*count)*2], msg, font=fnt2, fill=(200,0,0,255))
 count = count + 0.5
 out = Image.alpha_composite(ScheduleGrid, ClassBlocks)
 #timeToSave = time.time() 
 out.save((os.path.dirname(os.path.realpath(__file__)) + "/Schedules/Schedule" + str(scheduleNum) + ".png") ) # Takes about 0.75 sec to save
 #print "Time to save photo" + str(time.time() - timeToSave) 
 sys.stdout.write("\rLoading schedules: " + str( float("{0:.2f}".format(( float(scheduleNum+1)/float(len(possibleSchedules))) *100) )) + "% ")
 sys.stdout.flush()
 '''sys.stdout.write("\rTime left " + str( float("{0:.2f}".format((len(possibleSchedules)-1)*photoTime - scheduleNum*photoTime)) ) + " seconds")
 sys.stdout.flush()'''
 scheduleNum = scheduleNum + 1 # Takes about 1.4 sec per photo
 print "\n\nEstimated time to load %s images: %s minutes and %s seconds"%(len(possibleSchedules), int(estimate[0]), float("." + estimate[1])*60 )
 actual = str( (time.time() - startPhotos) / 60).split(".")
 print "Actual time to load %s images: %s minutes and %s seconds"%(len(possibleSchedules), int(actual[0]), float("." + actual[1])*60 )
 print "Diff = " + str( abs((time.time() - startPhotos) - (len(possibleSchedules)*photoTime)) ) + " seconds"
 print "Error in guess = " + str( float("{0:.2f}".format(((abs((time.time() - startPhotos) - (len(possibleSchedules)*photoTime))) / (time.time() - startPhotos)) * 100 )) ) + "%" + "\n\n"
 except KeyboardInterrupt:
 print "Bye"
def possibleCombos(ClassDic):
 try:
 AllClasses = []
 SectionMeetingTimes = []
 startSchedules = time.time()
 condencedClassTimeDic = {}
 ignore = {}
 for classType in ClassDic:
 condenceSectionDic = {}
 for section in ClassDic[classType]:
 for otherMeeting in ClassDic[classType][section]["meetings"]:
 for meeting in ClassDic[classType][section]["meetings"]:
 notIn = True
 if (ClassDic[classType][section]["meetings"][meeting] != ClassDic[classType][section]["meetings"][otherMeeting]) and (ClassDic[classType][section]["meetings"][meeting]["starttime"] == ClassDic[classType][section]["meetings"][otherMeeting]["starttime"]) and (ClassDic[classType][section]["meetings"][meeting]["endtime"] == ClassDic[classType][section]["meetings"][otherMeeting]["endtime"]):
 for alreadyIn in ignore:
 if ignore[alreadyIn] == ClassDic[classType][section]["meetings"][otherMeeting]:
 notIn = False
 if notIn:
 ignore[str(ClassDic[classType][section]["meetings"])] = copy.deepcopy(ClassDic[classType][section]["meetings"][meeting])
 ignore[str(ClassDic[classType][section]["section"])] = copy.deepcopy(ClassDic[classType][section])
 sectionDay = str(ClassDic[classType][section]["meetings"][meeting]["day"]) + str(str(ClassDic[classType][section]["meetings"][otherMeeting]["day"]))
 sectionName = str(ClassDic[classType][section]["section"])
 condenceSectionDic[sectionName] = copy.deepcopy(ClassDic[classType][section])
 meetingsDic = {}
 meetingsDic[str(meeting)] = copy.deepcopy(ClassDic[classType][section]["meetings"][meeting])
 meetingsDic[meeting]["day"] = sectionDay
 condenceSectionDic[sectionName]["meetings"] = meetingsDic
 for section in ClassDic[classType]:
 notIn = True
 for alreadyIn in ignore:
 if ignore[alreadyIn] == ClassDic[classType][section]:
 notIn = False
 if notIn:
 condenceSectionDic[str(ClassDic[classType][section]["section"])] = copy.deepcopy(ClassDic[classType][section])
 condencedClassTimeDic[str(classType)] = copy.deepcopy(condenceSectionDic)
 condencedClassDic = {}
 condencedSectionListDic = {}
 ignore = {}
 for classType in condencedClassTimeDic:
 condenceSectionDic = {}
 for section in condencedClassTimeDic[classType]:
 for otherSection in condencedClassTimeDic[classType]:
 notIn = True
 if (condencedClassTimeDic[classType][section] != condencedClassTimeDic[classType][otherSection]) and (condencedClassTimeDic[classType][section]["meetings"]["meeting0"]["starttime"] == condencedClassTimeDic[classType][otherSection]["meetings"]["meeting0"]["starttime"]) and (condencedClassTimeDic[classType][section]["meetings"]["meeting0"]["endtime"] == condencedClassTimeDic[classType][otherSection]["meetings"]["meeting0"]["endtime"]) and (condencedClassTimeDic[classType][section]["meetings"]["meeting0"]["day"] == condencedClassTimeDic[classType][otherSection]["meetings"]["meeting0"]["day"]) and (condencedClassTimeDic[classType][section]["activity"] == condencedClassTimeDic[classType][otherSection]["activity"]):
 for alreadyIn in ignore:
 if ignore[alreadyIn] == condencedClassTimeDic[classType][section]:
 notIn = False
 if notIn:
 ignore[str(condencedClassTimeDic[classType][section])] = condencedClassTimeDic[classType][otherSection]
 sectionName = str(condencedClassTimeDic[classType][section]["section"]) + "/" + str(((str(condencedClassTimeDic[classType][otherSection]["section"]).split(" "))[1][3:]))
 sectionProf = str(condencedClassTimeDic[classType][section]["instructor1"]) + "/" + str(str(condencedClassTimeDic[classType][otherSection]["instructor1"]))
 sectionNum = str(condencedClassTimeDic[classType][section]["callnumber"]) + "/" + str(str(condencedClassTimeDic[classType][otherSection]["callnumber"]))
 condenceSectionDic[sectionName] = condencedClassTimeDic[classType][section]
 condenceSectionDic[sectionName]["section"] = sectionName
 condenceSectionDic[sectionName]["instructor1"] = sectionProf
 condenceSectionDic[sectionName]["callnumber"] = sectionNum
 condencedSectionListDic[sectionName] = condencedClassTimeDic[classType][section]
 condencedSectionListDic[sectionName]["section"] = sectionName
 condencedSectionListDic[sectionName]["instructor1"] = sectionProf
 condencedSectionListDic[sectionName]["callnumber"] = sectionNum
 for section in condencedClassTimeDic[classType]:
 notIn = True
 for alreadyIn in ignore:
 if ignore[alreadyIn] == condencedClassTimeDic[classType][section]:
 notIn = False
 if notIn:
 condenceSectionDic[str(condencedClassTimeDic[classType][section]["section"])] = condencedClassTimeDic[classType][section]
 condencedSectionListDic[str(condencedClassTimeDic[classType][section]["section"])] = condencedClassTimeDic[classType][section]
 condencedClassDic[str(classType)] = condenceSectionDic
 # Create list of All classes
 for classToAdd in condencedClassDic:
 ClassTimes = []
 for classSection in condencedClassDic[classToAdd]:
 meetings = condencedClassDic[classToAdd][classSection]["meetings"]
 SectionMeetingTimes = []
 overlap = False
 for meeting in meetings:
 days = condencedClassDic[classToAdd][classSection]["meetings"][str(meeting)]["day"]
 for day in days:
 cltimeS = condencedClassDic[classToAdd][classSection]["meetings"][meeting]["starttime"]
 cltimeF = condencedClassDic[classToAdd][classSection]["meetings"][meeting]["endtime"]
 classStart = (cltimeS.split(":"))
 del classStart[-1]
 starttime = ( str(classStart[0]) + str(classStart[1]) ) 
 classEnd = (cltimeF.split(":"))
 del classEnd[-1]
 endtime = ( str(classEnd[0]) + str(classEnd[1]) )
 for times in timeConstraint:
 if times[2] == day:
 if ((int(starttime) + 400) < (int(times[0])) or (int(endtime) + 400) > (int(times[1])+1200)):
 overlap = True
 if not overlap: 
 SectionMeetingTimes.append([starttime,endtime,day,condencedClassDic[classToAdd][classSection]["title"],condencedClassDic[classToAdd][classSection]["section"]])
 if SectionMeetingTimes:
 ClassTimes.append(SectionMeetingTimes)
 AllClasses.append(ClassTimes)
 start = time.time()
 # Save time and space by getting rid of all duplicates from the list of classes.
 sortedAllClasses = []
 for section in AllClasses:
 section.sort()
 sortedAllClasses.append( list(section for section,_ in itertools.groupby(section)) )
 sortedAllClassList = []
 for section in sortedAllClasses:
 sortedAllClassTimes = []
 for times in section:
 times.sort()
 sortedAllClassTimes.append( list(times for times,_ in itertools.groupby(times)) )
 sortedAllClassList.append(sortedAllClassTimes)
 # Calculate how many possible schedules there are.
 possibilities = 1
 for title in sortedAllClassList:
 possibilities = possibilities* len(title)
 # Make sure there aren't too many schedules to go through, set limit to about how long it takes to go through 6 minutes of possible schedules.
 if possibilities <= 1959552:
 PossibleSchedules = list((list(tup) for tup in itertools.product(*sortedAllClassList))) # List of all possible schedules generates a lot of schedules.
 '''
 startTime1 = time.time() 
 TruePossibleSchedules = []
 for schedule in range(0, len(PossibleSchedules)): # Goes through the massive list of schedules at about 10000 per second.
 overlapping = [[s,e] for s in PossibleSchedules[schedule] for x in s for e in PossibleSchedules[schedule] for y in e if s is not e and x[2]==y[2] and (int(x[0])<=int(y[1]) and int(x[1])>=int(y[0]))]
 good = True
 if overlapping:
 good = False
 if good:
 TruePossibleSchedules.append(PossibleSchedules[schedule])
 print "Iterations left: " + str(len(PossibleSchedules) - schedule - 1)
 TruePossibleSchedules.sort()
 sortedTruePossibleSchedules = list(TruePossibleSchedules for TruePossibleSchedules,_ in itertools.groupby(TruePossibleSchedules)) 
 end1 = time.time() - startTime1
 startTime1 = time.time()
 '''
 #Takes a while:
 cores = mp.cpu_count()
 splitSchedules = chunkify(PossibleSchedules, cores)
 splitSchedules[0].append("First")
 pool = mp.Pool(processes=cores)
 result = pool.map(removeOverlaps, splitSchedules)
 TruePossibleSchedules = []
 for x in range(len(result)):
 TruePossibleSchedules = TruePossibleSchedules + result[x]
 TruePossibleSchedules.sort()
 sortedTruePossibleSchedules = list(TruePossibleSchedules for TruePossibleSchedules,_ in itertools.groupby(TruePossibleSchedules))
 '''
 end2 = time.time() - startTime1
 print "Origional: " + str(end1)
 print "MultiPross: " + str(end2)
 print "DIff: " + str(abs(end1 - end2))
 print "MultiPross is faster by: " + str( float("{0:.2f}".format(( (end1 - end2)/end2) *100) )) + "%" + "\n\n"
 '''
 end = time.time()
 if len(sortedTruePossibleSchedules) <= 600:
 # Turn into a list of dicts of the class sections 
 selectList = []
 for schedule in sortedTruePossibleSchedules:
 selectDict = {}
 for classSection in schedule:
 selectDict[str(classSection[0][-1])] = condencedSectionListDic[str(classSection[0][-1])]
 selectList.append(selectDict)
 print "\n\nTime to calculate and store all possible true schedules: " + str(time.time() - startSchedules)
 print "True Schedules: " + str(len(sortedTruePossibleSchedules))
 print "Possibilities: " + str(possibilities)
 return selectList
 else:
 print "That is too many ducking possibilities, it will take over 10 minutes to load the schedules, use less variable classes"
 print "Schedules: " + str(len(sortedTruePossibleSchedules))
 return "Bad"
 print "Time taken to process Possible True Schedules: " + str(end - start)
 else:
 print "That is too many ducking possibilities, it will take over 10 minutes just to run the calculations, use less variable classes"
 print "Possibilities: " + str(possibilities)
 return "Bad"
 except KeyboardInterrupt:
 print "Bye"
def chunkify(lst,n):
    return [ lst[i::n] for i in xrange(n) ]
def removeOverlaps(PossibleSchedules):
    first = False
    if PossibleSchedules[-1] == "First":
        cores = mp.cpu_count()
        print "Commandeering your %s cores..."%(cores)
        del PossibleSchedules[-1]
        first = True
    TruePossibleSchedules = []
    if first:
        for schedule in range(0, len(PossibleSchedules)):
            overlapping = [[s,e] for s in PossibleSchedules[schedule] for x in s for e in PossibleSchedules[schedule] for y in e if s is not e and x[2]==y[2] and (int(x[0])<=int(y[1]) and int(x[1])>=int(y[0]))]
            good = True
            if overlapping:
                good = False
            if good:
                TruePossibleSchedules.append(PossibleSchedules[schedule])
            sys.stdout.write("\rCalculating real schedules: " + str( float("{0:.2f}".format(( float(schedule+1)/float(len(PossibleSchedules))) *100) )) + "% ")
            sys.stdout.flush()
        sys.stdout.write("\rThanks for letting me borrow those ")
        sys.stdout.flush()
    else:
        for schedule in range(0, len(PossibleSchedules)):
            overlapping = [[s,e] for s in PossibleSchedules[schedule] for x in s for e in PossibleSchedules[schedule] for y in e if s is not e and x[2]==y[2] and (int(x[0])<=int(y[1]) and int(x[1])>=int(y[0]))]
            good = True
            if overlapping:
                good = False
            if good:
                TruePossibleSchedules.append(PossibleSchedules[schedule])
    return TruePossibleSchedules
try:
    start = time.time()
    makeDatabase()
    end = time.time()
    print "\nTime to create database of every section of every class offered: " + str(end - start)
    #printClass("MA 123A")
    #printClass("Differential Equations")
    pickClass("Electricity & Magnetism")
    pickClass("Differential Equations")
    pickClass("CAL 103B")
    pickClass("Mechanics of Solids")
    pickClass("Engineering Design III")
    pickClass("Circuits and Systems")
    startMon = "9:00"
    endMon = "6:00"
    startTus = "9:00"
    endTus = "6:00"
    startWen = "9:00"
    endWen = "9:00"
    startThu = "9:00"
    endThu = "6:00"
    startFri = "8:00"
    endFri = "6:00"
    lucky = raw_input("\n\n\n\nAre you feeling lucky??? (Do you want to only create one schedule) ")
    if lucky.lower() == "yes" or lucky.lower() == "y" or lucky.lower() == "ya":
        isLucky = True
    else:
        isLucky = False
    mult = raw_input("\n\n\n\nWould you like to limit the number of schedules made? ")
    if mult.lower() == "yes" or mult.lower() == "y" or mult.lower() == "ya":
        isMult = True
        multNum = raw_input("\n\n\n\nHow many? ")
        try:
            int(multNum)
        except:
            print "Ummm... That's not a number, so I'll set it to 6."
            multNum = 6
            time.sleep(3)
    elif any(char.isdigit() for char in mult):
        isMult = True
        multNum = mult
        try:
            int(multNum)
        except:
            print "Ummm... That's not a number, so I'll set it to 6."
            multNum = 6
            time.sleep(3)
    else:
        isMult = False
    daytimes = [startMon,endMon,startTus,endTus,startWen,endWen,startThu,endThu,startFri,endFri]
    timeConstraint = []
    for x in range(0,10, 2):
        blah = ["M","M","T","T","W","W","R","R","F","F"]
        broken1 = daytimes[x].split(":")
        startD = broken1[0] + broken1[1]
        broken2 = daytimes[x+1].split(":")
        endD = broken2[0] + broken2[1]
        timeConstraint.append([startD,endD, blah[x]])
    '''
    pickClass("PEP 112RF")
    pickClass("Electricity & Magnetism")
    pickClass("E 126C")
    pickClass("MA 221E")
    pickClass("CAL 103B")
    pickClass("E 231J")
    pickClass("Circuits and Systems")
    '''
    '''
    start1 = time.time()
    #printSelectedClasses()
    #printDic()
    end1 = time.time()
    print "Time to create database of every section of every class offered: " + str(end - start)
    print "Time to pick classes: " + str(end1 - start1)
    '''
    combos = possibleCombos(selectedClasses)
    if combos == "Bad":
        print "\nTry giving less range for classes or pick a section definitely want to be in instead of a whole class to lower the amount of possibilities\n"
    elif isLucky and combos:
        randSchedule = []
        rando = random.randint(0, len(combos)-1)
        print "Random number: " + str(rando)
        randSchedule.append(combos[rando])
        CreateScheduleImage(randSchedule)
    elif isMult and combos:
        multSchedule = []
        randSchedule = []
        randNums = []
        while True:
            if len(randNums) == int(multNum) or len(randNums) == len(combos):
                break
            repeat = False
            rando = random.randint(0, len(combos)-1)
            for num in randNums:
                if rando == num:
                    repeat = True
            if repeat == False:
                randNums.append(rando)
        print "Random number: " + str(randNums)
        for x in randNums:
            randSchedule.append(combos[x])
        CreateScheduleImage(randSchedule)
    elif combos:
        CreateScheduleImage(combos)
    else:
        print "\nNo combinations available\n"
    runEnd = time.time()
    print "Total run time: " + str(runEnd - runStart)
except KeyboardInterrupt:
    print "Bye"

Output: [image: screenshot of the program output]

Related question here

asked Aug 9, 2017 at 21:29
\$\endgroup\$
11
  • 1
    \$\begingroup\$ Thanks for providing the context. This is now a much more interesting Code Review question. \$\endgroup\$ Commented Aug 10, 2017 at 18:00
  • \$\begingroup\$ @Vogel612 Where did I add information from the answers? Have you actually read the answers or my question? \$\endgroup\$ Commented Aug 11, 2017 at 19:44
  • 2
    \$\begingroup\$ Our site policy generally prohibits modifying the code in the question after an answer has been posted, because in our experience, it leads to very messy Q&A when there are multiple versions of the code. That said, your changes are largely independent of the recommendations in the two existing answers, so we could probably arrive at a compromise. Please edit the question so that there is one version of the code to be reviewed — choose either your old code or the new code. Until then, I'm putting this question on hold as "Unclear what you are asking". \$\endgroup\$ Commented Aug 11, 2017 at 20:18
  • 2
    \$\begingroup\$ Just remove all mentions of any changes or performance comparisons. If you want us to look at the multiprocessing version, then just go with that. \$\endgroup\$ Commented Aug 11, 2017 at 20:25
  • 1
    \$\begingroup\$ The follow-up question looks good! \$\endgroup\$ Commented Aug 15, 2017 at 20:22

3 Answers

6
\$\begingroup\$

Oh! My first impression is that you fell in love with dictionaries: your multilevel dictionary ClassDic (which should be named class_dic according to PEP 8 - Style Guide for Python Code) is something horrible!

Why?

Mainly because it contradicts the DRY principle (Don't Repeat Yourself). The same set of keys is repeated in full again and again, so are they really necessary? Don't they hurt both the readability and the logic?

It would probably be useful to first read PEP 20 - The Zen of Python (or type

import this

in your Python interpreter), particularly these pieces of advice:

Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Readability counts.
There should be one - and preferably only one - obvious way to do it.
If the implementation is hard to explain, it's a bad idea.

So my tip is to first refactor your multi-level ClassDic dictionary; the solution may then emerge as if by a miracle.
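As a rough illustration of what such a refactoring could look like, the sketch below flattens a nested dictionary into a flat list of records. The nested layout and the sample data are assumptions modeled on the keys the question's code reads ("meetings", "day", "starttime", "endtime", "title", "section"); `Meeting`, `to_minutes` and `flatten_sections` are hypothetical names, not part of the original program. It is written to run on Python 3.

```python
from collections import namedtuple

# Hypothetical flat record: one row per (section, day) meeting.
Meeting = namedtuple("Meeting", "title section day start end")


def to_minutes(clock):
    """Convert an 'H:MM' or 'H:MM:SS' string to minutes since midnight."""
    parts = clock.split(":")
    return int(parts[0]) * 60 + int(parts[1])


def flatten_sections(class_dic):
    """Turn {class: {section: {...}}} into a flat list of Meeting rows."""
    rows = []
    for sections in class_dic.values():
        for info in sections.values():
            for meeting in info["meetings"].values():
                # Assumption: "day" is a string of day letters such as "MW".
                for day in meeting["day"]:
                    rows.append(Meeting(
                        title=info["title"],
                        section=info["section"],
                        day=day,
                        start=to_minutes(meeting["starttime"]),
                        end=to_minutes(meeting["endtime"]),
                    ))
    return rows


# Made-up sample data in the assumed nested shape.
sample = {
    "Differential Equations": {
        "MA 221A": {
            "title": "Differential Equations",
            "section": "MA 221A",
            "meetings": {
                "1": {"day": "MW", "starttime": "9:00:00", "endtime": "9:50:00"},
            },
        },
    },
}

rows = flatten_sections(sample)
```

With rows like these, later steps can filter and compare plain tuples (for example `row.day` and `row.start`) instead of repeating long chains of dictionary keys.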

answered Aug 9, 2017 at 23:11
\$\endgroup\$
5
  • \$\begingroup\$ Thank you for the advice, I will make sure to read those articles. The way I structured the dictionary is due to the fact that I am pulling the data off a school's database. Would it be helpful if I showed all my code? (If I do, I know there will be a cringefest because it's not entirely "Zen" lol) \$\endgroup\$ Commented Aug 9, 2017 at 23:36
  • 1
    \$\begingroup\$ If your data come from some database, why not show us how you read it? It would probably be better to work with the records in the database than to first pull them into such a complicated dictionary. \$\endgroup\$ Commented Aug 9, 2017 at 23:45
  • \$\begingroup\$ I added the code so now you can see where I am getting the database from. I am also in the process of fixing a bug where the code I originally was asking about doesn't completely sort through things, so hopefully I'll add that here once I figure it out. The main reason I am using dictionaries is because the user enters a name and it is much easier in my eyes to use a dictionary for situations where there are many variables you need to juggle per name given. It just feels more organized to me, maybe I'm just weird lol. \$\endgroup\$ Commented Aug 10, 2017 at 16:02
  • \$\begingroup\$ Regarding PEP 8 - Style Guide for Python Code, it states: CapitalizedWords (or CapWords, or CamelCase -- so named because of the bumpy look of its letters ). This is also sometimes known as StudlyCaps. Note: When using abbreviations in CapWords, capitalize all the letters of the abbreviation. Thus HTTPServerError is better than HttpServerError. mixedCase (differs from CapitalizedWords by initial lowercase character!) That is the style I have always found more visually appealing than lower_case_with_underscores :P \$\endgroup\$ Commented Aug 10, 2017 at 18:13
  • \$\begingroup\$ @Jake - You wrote: "... it is much easier in my eyes to use a dictionary for situations where there are many variables you need to juggle per name given ...". - See my new answer. It is fundamental to change the structure of your data, otherwise the majority of people will see no point in reading your long code. \$\endgroup\$ Commented Aug 10, 2017 at 18:32
2
\$\begingroup\$

Compare

+---------+-----+---------+
| Name    | Age | Country |
+---------+-----+---------+          +---------+-----+---------+
| John    | 22  | Canada  |          | Name    | Age | Country |
+---------+-----+---------+          +---------+-----+---------+
| Name    | Age | Country |          | John    | 22  | Canada  |
+---------+-----+---------+   with   +---------+-----+---------+
| Ingrid  | 25  | Austria |          | Ingrid  | 25  | Austria |
+---------+-----+---------+          +---------+-----+---------+
| Name    | Age | Country |          | Natasha | 19  | Russia  |
+---------+-----+---------+          +---------+-----+---------+
| Natasha | 19  | Russia  |
+---------+-----+---------+

and

[
 {"Name": "John", "Age": 22, "Country": "Canada" },
 {"Name": "Ingrid", "Age": 25, "Country": "Austria"},
 {"Name": "Natasha", "Age": 19, "Country": "Russia" },
]

with

[
 ["Name", "Age", "Country"],
 ["John", 22, "Canada" ],
 ["Ingrid", 25, "Austria"],
 ["Natasha", 19, "Russia" ],
]

or, even simpler, with

[
 ["John", 22, "Canada" ],
 ["Ingrid", 25, "Austria"],
 ["Natasha", 19, "Russia" ],
]

(you may assign numbers to column names, as

Name, Age, Country = 0, 1, 2

and then use those names as indices for lists).
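A minimal sketch of that last idea, using the same sample rows; the variable names `people`, `names` and `oldest` are only illustrative:

```python
# Column positions, named once so the rows themselves stay plain lists.
Name, Age, Country = 0, 1, 2

people = [
    ["John", 22, "Canada"],
    ["Ingrid", 25, "Austria"],
    ["Natasha", 19, "Russia"],
]

# Index each row with the named constants instead of bare numbers.
names = [row[Name] for row in people]
oldest = max(people, key=lambda row: row[Age])

print(names)
print(oldest[Name], oldest[Country])
```

The rows carry no repeated keys, yet `row[Age]` still reads almost as clearly as `row["Age"]`.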

But why reinvent the wheel?

I see many import statements in your code, but I don't see anything like

import numpy as np

You work with multi-dimensional matrices in your code, and numpy is exactly what you need!

answered Aug 10, 2017 at 18:24
\$\endgroup\$
12
  • \$\begingroup\$ I agree that this is much simpler, and I originally did consider it, but the information given by the database varies from class to class, because whoever created the database annoyingly didn't feel like being uniform in how they present information. The position of the information varies, and the amount of information varies per class as well, so I figured this is the type of situation a dictionary is best at tackling. \$\endgroup\$ Commented Aug 10, 2017 at 18:45
  • \$\begingroup\$ For example, there is sometimes more than one name or time, which would be hard to handle with indexing: telling names apart from titles in ["John","Math",15:00] and ["John","Lucy","Math",15:00] is harder with lists than with dicts, where, no matter the index, you can determine whether there are multiple teachers or other variations. \$\endgroup\$ Commented Aug 10, 2017 at 18:45
  • \$\begingroup\$ This was my thought process. Maybe I wasn't thinking straight, but do you see why I went with dicts? What would storing the data in a list do better than a dict? Also, I do convert the dict to a sorted list for a few processes in my code, so I do see the benefit of lists over dicts in the appropriate place. \$\endgroup\$ Commented Aug 10, 2017 at 18:45
  • 2
    \$\begingroup\$ Oh, your situation is unenviable, I didn't realize it. It would probably be better to refuse such work or to require that the provided data be unified, but I am not able to judge whether that is possible for you. My forecast is that neither you nor your customer (supervisor? employer?) will be satisfied. \$\endgroup\$ Commented Aug 10, 2017 at 19:55
  • 1
    \$\begingroup\$ See the economic term "sunk costs" and the IT term GIGO (garbage in, garbage out). See how many people were eager to review your code, including myself. Be happy that your code works and forget about it. It's my best advice. Good luck to you. \$\endgroup\$ Commented Aug 11, 2017 at 15:19
2
\$\begingroup\$

I came to this answer after focusing on the other question I posted relating to this code. My thought process was that since I first make a huge list of possible schedules that then need to be iterated over and checked for overlaps, wouldn't it save a ton of time if I checked for overlaps as I build the list of schedules? And my thought was very very much correct!
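The idea in the paragraph above, checking for overlaps while the schedules are being built instead of afterwards, can be sketched in a few lines. This is a simplified, single-process illustration and not the code from this answer: `meetings_clash`, `sections_clash`, `pruned_product` and the sample data are hypothetical, and the only thing taken from the original is the meeting layout `[start, end, day, ...]` and its overlap test. It is written to run on Python 3.

```python
def meetings_clash(a, b):
    """True if two [start, end, day, ...] meetings share a day and overlap."""
    return a[2] == b[2] and int(a[0]) <= int(b[1]) and int(a[1]) >= int(b[0])


def sections_clash(sec_a, sec_b):
    """True if any meeting of one section overlaps any meeting of the other."""
    return any(meetings_clash(m, n) for m in sec_a for n in sec_b)


def pruned_product(*classes):
    """Like itertools.product, but drops a partial schedule as soon as it clashes."""
    partial = [[]]
    for sections in classes:
        partial = [
            chosen + [candidate]
            for chosen in partial
            for candidate in sections
            # Only the new section needs checking; `chosen` is already clash-free.
            if not any(sections_clash(candidate, old) for old in chosen)
        ]
    return partial


# Made-up data: each class is a list of sections, each section a list of
# [start, end, day] meetings with times as "HHMM" strings.
math = [[["0900", "0950", "M"]], [["1100", "1150", "M"]]]
physics = [[["0930", "1020", "M"]], [["1300", "1350", "T"]]]

schedules = pruned_product(math, physics)
```

Because a clashing pair is discarded the moment it appears, none of its extensions with later classes is ever generated, which is where the saving over "build everything, then filter" comes from.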

I replaced this code:

PossibleSchedules = list((list(tup) for tup in itertools.product(*sortedAllClassList))) # List of all possible schedules generates a lot of schedules.
cores = mp.cpu_count()
splitSchedules = chunkify(PossibleSchedules, cores)
splitSchedules[0].append("First")
result = []
try:
    pool = mp.Pool(processes=cores)
    result = pool.map(removeOverlaps, splitSchedules)
except:
    pass
print ""
TruePossibleSchedules = []
for x in range(len(result)):
    TruePossibleSchedules = TruePossibleSchedules + result[x]
#TruePossibleSchedules = PossibleSchedules
TruePossibleSchedules.sort()
sortedTruePossibleSchedules = list(TruePossibleSchedules for TruePossibleSchedules,_ in itertools.groupby(TruePossibleSchedules))
def chunkify(lst,n):
    return [ lst[i::n] for i in xrange(n) ]
def removeOverlaps(PossibleSchedules):
    try:
        first = False
        if PossibleSchedules[-1] == "First":
            cores = mp.cpu_count()
            print "Commandeering your %s cores..."%(cores)
            del PossibleSchedules[-1]
            first = True
        listSize = len(PossibleSchedules)
        TruePossibleSchedules = []
        if first:
            for schedule in range(0,listSize):
                overlapping = [[s,e] for s in PossibleSchedules[schedule] for x in s for e in PossibleSchedules[schedule] for y in e if s is not e and x[2]==y[2] and (int(x[0])<=int(y[1]) and int(x[1])>=int(y[0]))]
                if not overlapping:
                    TruePossibleSchedules.append(PossibleSchedules[schedule])
                sys.stdout.write("\rCalculating real schedules: " + str( float("{0:.2f}".format(( float(schedule+1)/float(listSize)) *100) )) + "% ")
                sys.stdout.flush()
            sys.stdout.write("\rThanks for letting me borrow those ")
            sys.stdout.flush()
        else:
            for schedule in range(0,listSize):
                overlapping = [[s,e] for s in PossibleSchedules[schedule] for x in s for e in PossibleSchedules[schedule] for y in e if s is not e and x[2]==y[2] and (int(x[0])<=int(y[1]) and int(x[1])>=int(y[0]))]
                if not overlapping:
                    TruePossibleSchedules.append(PossibleSchedules[schedule])
        return TruePossibleSchedules
    except KeyboardInterrupt:
        pass

With my new code:

PossibleSchedules = list((list(tup) for tup in productSchedules(*sortedAllClassList))) # List of all possible schedules generates a lot of schedules.
TruePossibleSchedules = PossibleSchedules
TruePossibleSchedules.sort()
sortedTruePossibleSchedules = list(TruePossibleSchedules for TruePossibleSchedules,_ in itertools.groupby(TruePossibleSchedules))
def chunkify(lst,n):
    return [ lst[i::n] for i in xrange(n) ]
def faster(result):
    results_to_delete = []
    for schedule in result:
        for classOne in schedule:
            for classTwo in schedule:
                if classOne is not classTwo:
                    for meetingOne in classOne:
                        for meetingTwo in classTwo:
                            if meetingOne[2]==meetingTwo[2] and (int(meetingOne[0])<=int(meetingTwo[1]) and int(meetingOne[1])>=int(meetingTwo[0])):
                                results_to_delete.append(result.index(schedule))
    results_to_delete_sorted = []
    for elem in results_to_delete:
        if elem not in results_to_delete_sorted:
            results_to_delete_sorted.append(elem)
    if results_to_delete_sorted:
        for nextDelete in reversed(results_to_delete_sorted):
            del result[nextDelete]
    return result
def productSchedules(*args):
    pools = map(tuple, args)
    result = [[]]
    cores = 4
    try:
        cores = mp.cpu_count()
    except:
        cores = 4
    for pooly in pools:
        result = [x+[y] for x in result for y in pooly]
        splitSchedules = chunkify(result, cores)
        results = []
        pool = mp.Pool(processes=cores)
        results = pool.map(faster, splitSchedules)
        pool.close()
        pool.join()
        trueResults = []
        for x in range(len(results)):
            trueResults = trueResults + results[x]
        result = trueResults
        sys.stdout.write("\rCalculating real schedules: {:.2f}% ".format(float(pools.index(pooly))/(len(pools)-1) *100))
    for prod in result:
        yield tuple(prod)

OUTPUT:

ORIGINAL:
-> python Schedule.py
Loading classes: Done 
Time to create database of every section of every class offered: 4.80000782013
Want the best schedules? n
Are you feeling lucky??? (Do you want to only create one schedule) y
Commandeering your 4 cores...
Thanks for letting me borrow those 
Time to calculate and store all possible true schedules: 340.109536171 ***
True Schedules: 1429
Possibilities: 2350080
Time taken to process Schedules: 340.096308947
Random number: 515
Estimated time to load 1 images: 0 minutes and 1.4 seconds
Loading schedules: 100.0% 
Estimated time to load 1 images: 0 minutes and 1.4 seconds
Actual time to load 1 images: 0 minutes and 1.23541712761 seconds
Diff = 0.164559030533 seconds
Error in guess = 13.32%
Total run time: 359.528627157
NEW:
-> python Schedule.py
Loading classes: Done 
Time to create database of every section of every class offered: 5.04908514023
Want the best schedules? n
Are you feeling lucky??? (Do you want to only create one schedule) n
Would you like to limit the number of schedules made? 6
Time to calculate and store all possible true schedules: 8.99596405029 ***
True Schedules: 1429
Possibilities: 2350080
Time taken to process Schedules: 8.92583394051
Random number: [650, 238, 352, 956, 57, 503]
Preparing...
Preparing...
Estimated time to load 6 images: 0 minutes and 7.97566509246 seconds
Loading schedules: 100.0% 
Estimated time to load 6 images: 0 minutes and 7.97566509246 seconds
Actual time to load 6 images: 0 minutes and 7.38800001144 seconds
Diff = 0.587633132935 seconds
Error in guess = 7.95%
Total run time: 35.5436708927

That makes the schedule calculation 3680.69% faster than with the original code.
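The percentage can be reproduced from the two timings marked *** in the output above. The sketch below assumes the same formula as the timing printout in the question, `(old - new) / new * 100`; the variable names are only illustrative.

```python
# Timings copied from the two runs above (the lines marked ***), in seconds.
old_seconds = 340.109536171
new_seconds = 8.99596405029

# Relative improvement, measured against the new time.
percent_faster = (old_seconds - new_seconds) / new_seconds * 100

# The same comparison expressed as a plain ratio.
speedup_factor = old_seconds / new_seconds

print("{:.2f}% faster, about {:.1f}x".format(percent_faster, speedup_factor))
```

Read as a ratio, the new version finishes the schedule calculation roughly 38 times sooner on this input.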


answered Aug 16, 2017 at 16:23
\$\endgroup\$
