Iterate over part of tuple key in Python dictionary

Question 1

I am working on an optimization project where I have a series of dictionaries with tuples as keys and another dictionary (a decision variable with Gurobi) where the key is the first element of the tuples in the other dictionaries. I need to be able to do the following:

data1 = {(place, person): q}
data2 = {person: s}
x = {place: var}
qx = {k: x[k]*data1[k] for k in x}
total1 = {}
for key, value in qx.items():
    person = key[1]
    if person in total1:
        total1[person] = total1[person] + value
    else:
        total1[person] = value
total2 = {k: total1[k]/data2[k] for k in total1}

(Please note that the data1, data2, and x dictionaries are very large, 10,000+ distinct place/person pairs).

This same process works when I use the raw data in place of the decision variable, which uses the same (place, person) key. Unfortunately, my variable within the Gurobi model itself must be a dictionary and it cannot contain the person key value.

Is there any way to iterate over just the first value in the tuple key?

EDIT: Here are some sample values (sensitive data, so placeholder values):

data1 = {(1, a): 28, (1, c): 57, (2, b): 125}
data2 = {a: 7.8, b: 8.5, c: 8.4}
x = {1: 0.002, 2: 0.013}

Values in data1 are all integers, data2 are hours, and x are small decimals.

Outputs in total2 should look similar to the following (assuming there are many other rows for each person):

total2 = {a: 0.85, b: 1.2, c: 1.01}

This code is essentially calculating a "productivity score" for each person. The decision variable, x, is looking only at each individual place for business purposes, so it cannot include the person identifiers. Also, the Gurobi package is very limiting about how things can be formatted, so I have not found a way to even use the tuple key for x.

Question 2

Please provide some short input and output data samples.

Question 3

And how do you arrive at those numbers in total2?

Question 4

Sorry, I didn't do the actual math for the sample data. Those are results that I would get using the entire set of data, which includes upwards of 100 rows for each person at each place.

Question 5

Generally, the most efficient way to aggregate values into bins is to use a for loop and store the values in a dictionary, as you did with total1 in your example. In the code below, I have fixed your qx line so it runs, but I don't know if this matches your intention. I also used total1.setdefault to streamline the code a little:

a, b, c = 'a', 'b', 'c'
data1 = {(1, a): 28, (1, c): 57, (2, b): 125}
data2 = {a: 7.8, b: 8.5, c: 8.4}
x = {1: 0.002, 2: 0.013}
# the tuple key needs parentheses inside a dict comprehension
qx = {(place, person): x[place] * value for (place, person), value in data1.items()}
total1 = {}
for (place, person), value in qx.items():
    total1.setdefault(person, 0.0)
    total1[person] += value
total2 = {k: total1[k] / data2[k] for k in total1}
print(total2)
# {'a': 0.0071794871794871795, 'c': 0.013571428571428571, 'b': 0.19117647058823528}

But this doesn't produce the result you asked for. I can't tell at a glance how you get the result you showed, but this may help you move in the right direction.
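As a rough check of the printed output, the score for a single person can be worked out by hand. This is only a sketch using the sample values from the question; the name score_a is made up for illustration.

```python
# Sample values from the question.
data1 = {(1, 'a'): 28, (1, 'c'): 57, (2, 'b'): 125}
data2 = {'a': 7.8, 'b': 8.5, 'c': 8.4}
x = {1: 0.002, 2: 0.013}

# Person 'a' appears only at place 1, so the score has a single term:
# x[1] * data1[(1, 'a')] / data2['a'] = 0.002 * 28 / 7.8
score_a = x[1] * data1[(1, 'a')] / data2['a']
print(round(score_a, 5))  # 0.00718
```

With more rows per person, each additional (place, person) pair would add another x[place] * data1[place, person] term before the division.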

It might also be easier to read if you moved the qx logic into the loop, like this:

total1 = {}
for (place, person), value in data1.items():
    total1.setdefault(person, 0.0)
    total1[person] += x[place] * value
total2 = {k: total1[k] / data2[k] for k in total1}

Or, if you want to do this often, it might be worth creating a cross-reference between persons and their matching places, as @martijn-pieters suggested (note, you still need a for loop to do the initial cross-referencing):

# create a list of valid places for each person
places_for_person = {}
for place, person in data1:
    places_for_person.setdefault(person, [])
    places_for_person[person].append(place)
# now do the calculation
total2 = {
    person:
        sum(
            data1[place, person] * x[place]
            for place in places_for_person[person]
        ) / data2[person]
    for person in data2
}

Question 6

Thanks! The streamlined bits are super helpful. At this point, I still can't get total1 to work with x, so I have a feeling it is more to do with the way Gurobi processes things rather than the straight up Python syntax.
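One possible way to narrow this down, assuming the model only needs each person's score as a linear expression in x, is to precompute the constant data1[place, person] / data2[person] for every pair, so that each decision variable is multiplied by a plain number exactly once. The names coeffs and score_for are hypothetical, and plain floats stand in for the Gurobi variables here; this sketch does not verify how the real variables behave.

```python
from collections import defaultdict

data1 = {(1, 'a'): 28, (1, 'c'): 57, (2, 'b'): 125}
data2 = {'a': 7.8, 'b': 8.5, 'c': 8.4}

# coeffs[person][place] is the constant that multiplies x[place]
# in that person's score.
coeffs = defaultdict(dict)
for (place, person), q in data1.items():
    coeffs[person][place] = q / data2[person]


def score_for(person, x):
    """Sum coefficient * x[place] over the places where this person appears."""
    return sum(c * x[place] for place, c in coeffs[person].items())


# Plain floats stand in for the decision variables (an assumption of this sketch).
x = {1: 0.002, 2: 0.013}
total2 = {person: score_for(person, x) for person in coeffs}
```

Keeping the coefficients separate from x also makes it easy to inspect them on their own when the model rejects an expression.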

Question 7

Glad to help! If you do this a lot, you might also try total1 = collections.defaultdict(float) for the first two examples or places_for_person = collections.defaultdict(list) for the cross-reference example (instead of using a standard dict). Then you don't even need the setdefault part.
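A minimal sketch of that suggestion, using the sample values from the question: defaultdict(float) suits the running totals, and defaultdict(list) suits the person-to-places cross-reference.

```python
from collections import defaultdict

data1 = {(1, 'a'): 28, (1, 'c'): 57, (2, 'b'): 125}
x = {1: 0.002, 2: 0.013}

# Missing persons start at 0.0, so no setdefault call is needed.
total1 = defaultdict(float)
for (place, person), value in data1.items():
    total1[person] += x[place] * value

# Missing persons start with an empty list of places.
places_for_person = defaultdict(list)
for place, person in data1:
    places_for_person[person].append(place)
```

One thing to keep in mind is that merely reading a missing key from a defaultdict creates it, so a membership test (person in total1) is the safer check for whether a person has been seen.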

Question 8

By the way, if you're doing a lot of optimization in Python, you might consider using the Pyomo package. It's a robust, mature package that lets you use pretty much any solver without changing your code (gurobi, cplex, glpk, cbc, etc.). It also has some nice extensions for stochastic programming.

Question 9

For creating a new dictionary removing the tuple:

a, b, c = "a", "b", "c"
data1 = {(1, a): 28, (1, c): 57, (2, b): 125}
total = []
for key in data1:
    # key is a (place, person) tuple; pair the person with its value
    # (indexing key[1] directly also works for multi-character names)
    total.append([key[1], data1[key]])
# convert the list of pairs to a dictionary; a person who appears
# under several places keeps only the last value
total = dict(total)
print(total)

Output:

{'a': 28, 'c': 57, 'b': 125}

Question 10

It doesn't seem very optimised. The author said the dataset is huge.

score 4 · Accepted Answer · 2017-12-13 20:07:43Z

