I'm not sure if this is the correct place for an optimization question like this, but I'm not even sure where to start looking. So here it goes.
There's this game I play that has very simple rules. You start with some of a resource and a single worker. This worker collects said resource and you can buy one of two upgrades: another worker or increasing collection speed of all your workers by 20%.
To simulate this simple game, I've written a quick program in Python as follows (relevant portions):
import itertools
WORKER_COST = 1700
UPGRADES_COST = 350
CD_TIME = 10
BASE_EARNING_SPEED = 2.8
MAX_WORKERS = 30
MAX_UPGRADES = 60
TARGET_SPEED = 30
SEQ_LEN = 15
GAME_LENGTH = 3000
WORKER_ID = 1
UPGRADES_ID = 0
global_cut_currency = []
global_cut_speeds = []
min_seq = []
min_time = GAME_LENGTH
max_speed = 0.0
skip = 0
def init():
global min_seq
global skip
x = [0, 1]
for p in itertools.product(x, repeat=SEQ_LEN):
# Only applies if SEQ_LEN > 30
if p.count(WORKER_ID) > MAX_WORKERS:
continue
# To skip permutations on large SEQ_LEN. Does nothing right now
if skip % 1 != 0:
skip = skip + 1
continue
collect_currency(p)
skip = skip + 1
def total_earning(workers_len, upgrades):
return (BASE_EARNING_SPEED*(1+0.2*upgrades)) * float(workers_len)
def earning(upgrades):
return (BASE_EARNING_SPEED*(1+0.2*upgrades))
def collect_currency(seq):
global min_seq
global max_speed
global min_time
global global_cut_currency
global global_cut_speeds
cut_currency = []
cut_speeds = []
upgrades = 0
currency = 850
time = 0
workers = [0.0]
worker_cd = False
upgrades_cd = False
worker_cd_time = 0
upgrades_cd_time = 0
index = 0
#print(seq)
while time < GAME_LENGTH:
cut_currency.append(currency)
cut_speeds.append(total_earning(len(workers), upgrades))
# If both earning and currency below top path, drop current path
if global_cut_currency:
if time < len(global_cut_currency):
if currency < global_cut_currency[time] and total_earning(len(workers), upgrades) < global_cut_speeds[time]:
#print("cut: ", seq)
break
#print(workers)
if time >= worker_cd_time and worker_cd == True:
worker_cd = False
if time >= upgrades_cd_time and upgrades_cd == True:
upgrades_cd = False
for i, worker in enumerate(workers):
if worker >= 50:
currency = currency + worker
workers[i] = 0
#print(earning(upgrades))
workers[i] = workers[i] + earning(upgrades)
if seq[index] == WORKER_ID and currency >= WORKER_COST and worker_cd == False:
#print(currency)
currency = currency - WORKER_COST
workers.append(0.0)
worker_cd = True
worker_cd_time = time + CD_TIME
index = index + 1
elif seq[index] == UPGRADES_ID and currency >= UPGRADES_COST and upgrades_cd == False:
#print(currency)
currency = currency - UPGRADES_COST
upgrades = upgrades + 1
upgrades_cd = True
upgrades_cd_time = time + CD_TIME
index = index + 1
if total_earning(len(workers), upgrades) >= TARGET_SPEED:
if time < min_time or max_speed < TARGET_SPEED:
min_time = time
min_seq = seq
max_speed = total_earning(len(workers), upgrades)
global_cut_currency = cut_currency
global_cut_speeds = cut_speeds
break
if index == SEQ_LEN:
if total_earning(len(workers), upgrades) >= max_speed-0.1 and max_speed != TARGET_SPEED:
if max_speed + 0.1 > total_earning(len(workers), upgrades) and total_earning(len(workers), upgrades) > max_speed - 0.1:
if time < min_time:
min_time = time
max_speed = total_earning(len(workers), upgrades)
min_seq = seq
global_cut_currency = cut_currency
global_cut_speeds = cut_speeds
else:
max_speed = total_earning(len(workers), upgrades)
min_seq = seq
min_time = time
global_cut_currency = cut_currency
global_cut_speeds = cut_speeds
#print(max_speed)
break
#print(currency)
time = time + 1
try:
init()
print("Speed: ", max_speed)
print("Sequence: ", min_seq)
print("Skip:", skip)
print("Time:", min_time)
except KeyboardInterrupt:
print("Speed: ", max_speed)
print("Sequence: ", min_seq)
print("Skip:", skip)
print("Time:", min_time)
The problem here is fairly obvious: 2^30 (In a perfect world, it would be even more) sequences are a lot to go through, especially with a while loop that iterates up to 3000 (GAME_LENGTH) times , which isn't even mentioning the nested for loop.
Are there ways to simplify this program in some way? Simulating it second by second was the most simple solution in my mind, but it's grossly inefficient. There's maybe an improvement in jumping forward in time to the next worker "deposit" since they deposit in batches of 50, but that really complicates things with the variable earning rate.
I'm trying to max out workers (30) and upgrades (60). This can be simplified by the fact that upgrades are more valuable at all times than workers when worker count reaches 5, but due to the cooldown, one can buy both at the same time with an income of 205 currency/second. This back and forth can be done earlier however, I'm just not exactly sure when.
I've posted the full code and I'm happy to add additional information if requested.
1 Answer 1
Mostly for interest's sake and not exactly fitting your current question: a simplified reading of these rules
You start with some of a resource and a single worker. This worker collects said resource and you can buy one of two upgrades: another worker or increasing collection speed of all your workers by 20%.
that assumes zero cost for upgrades and all upgrade cooldowns equal to one time unit can be analysed as follows.
Let s
= speed index, equivalent to your upgrades
variable; w
= worker count; then the rate is:
$$ r = w (1 + 0.2 s) $$
For some given number of actions a
(kind of analogous to your GAME_LENGTH
), the plane producing a Manhattan-norm of a
is defined by
$$ s + w = a $$
The Manhattan-norm is necessary because s
, w
and a
are discrete, but from here on for simplicity of analysis we treat them as continuous.
The intersection of that plane with the rate-surface is the curve
$$ r = (a - s) (1 + 0.2 s) $$
So for a fixed number of actions, the rate is parabolic. The analytic maximum for the rate is when $$\frac {dr}{ds} = 0$$ $0ドル = 0.2 a - 0.4 s - 1$$ $$s = a/2 - 2.5$$
For a
=30, s
= 12.5 ; in other words: upgrade the speed index about 12 times, with the remaining 18 upgrades going to worker count. In this formulation the sequence of when to upgrade speed or workers doesn't matter so long as the count matches the above.
Of course, your rules are more complex:
- you have fixed upgrade costs for workers or speed;
- these costs cannot be paid if the current balance would go below 0;
- on every tick, earnings might be consolidated, but only if those earnings are at least 50 per worker;
- upgrades can only happen on intervals of at least 10 ticks.
I think you can do better than your current brute-force code. Fundamentally, your use of itertools.product
and your SEQ_LEN
of 15 are both inaccurate and too pessimistic for reality. product()
varies the number of 0s and 1s through all of their possibilities. What you really want, I think, is a fixed number of 0s and 1s, equal to your MAX_WORKERS
and MAX_UPGRADES
respectively, producing an effective SEQ_LEN = MAX_WORKERS + MAX_UPGRADES
. In this interpretation, you're optimising for the entire sequence of possible upgrades to their maximal counts, while minimising the total time to do so. You suggest:
There's maybe an improvement in jumping forward in time to the next worker "deposit" since they deposit in batches of 50
Yes, do that.
that really complicates things with the variable earning rate.
It's not so bad. Operate on the following assumptions:
- All workers deposit at the same time
- The working currency balance increases in steps at the deposit times, not continuously
- If an upgrade becomes available due to a cooldown expiring between two deposits, it doesn't need to wait for the next deposit so long as there is enough balance
- All upgrades are bought as soon as possible
Operating on this model, a sampling of 100 random permutations of the 90-element action sequence produces a best time of about 27 minutes. This can be decreased by writing a permutation traversal algorithm with the following characteristics:
- Iterate through permutations in lexicographic order
- Consider two different permutations that swap equal actions to be one permutation and not separate (i.e.
001
and001
swapping the first and second element) - When you get to a certain depth, calculate the best-case time to leaf depth as the sum of all necessary cooldowns to get there. In other words, if you're at depth 70 out of 90, the best-case additional time is 10*(90 - 70) = 200 seconds. If the time at the current depth plus the additional best-case time exceeds the overall best-seen time, prune this branch and go up in depth.
Example code
This code includes:
- demonstrations of permutation traversal functions
next_perm
andprev_perm
- a
fill_events()
that produces a sequence of descriptive events given an action sequence - a sampling demonstration that shows the best-performing permutation among 500 options
It does not include actual permutation traversal or the described branch-pruning mechanism.
import random
from datetime import timedelta
from pprint import pprint
from typing import Literal, NamedTuple, Iterator
WORKER_COST = 1700
UPGRADES_COST = 350
COOLDOWN_TIME = 10
BASE_EARNING_SPEED = 2.8 # currency per second
COLLECTION_THRESHOLD = 50
MAX_WORKERS = 30
MAX_UPGRADES = 60
DO_UPGRADE = 0
ADD_WORKER = 1
Action = Literal[DO_UPGRADE, ADD_WORKER]
Actions = list[Action]
ACTION_COSTS = [0,0]
ACTION_COSTS[DO_UPGRADE] = UPGRADES_COST
ACTION_COSTS[ADD_WORKER] = WORKER_COST
class Event(NamedTuple):
time: float
balance: float
speed: float
workers: int
to_collect: float
upgrade_times: tuple[float, float]
def next_perm(current: Actions) -> int:
"""
Mutate current to the next-highest permutation.
Return the index of the earliest changed element.
Throw StopIteration if there is no next permutation.
"""
n = len(current)
i_increase = next( # eventually throws StopIteration
i for i in range(n-1, 0, -1)
if current[i-1] == 0 and current[i] == 1
)
current[i_increase - 1] = 1
current[i_increase] = 0
current[i_increase+1:] = current[n-1: i_increase: -1]
return i_increase - 1
def prev_perm(current: Actions) -> int:
"""
Mutate current to the next-lowest permutation.
Return the index of the earliest changed element.
Throw StopIteration if there is no next permutation.
"""
n = len(current)
i_decrease = next( # eventually throws StopIteration
i for i in range(n-1, 0, -1)
if current[i-1] == 1 and current[i] == 0
)
current[i_decrease - 1] = 0
current[i_decrease] = 1
current[i_decrease+1:] = current[n-1: i_decrease: -1]
return i_decrease - 1
def count_perms() -> int:
# 90! / 30! 60!
# = prod(61..90) / prod(1..30)
n = 1
for i in range(1, MAX_WORKERS+1):
n *= 1 + MAX_UPGRADES/i
return n
def fill_events(actions: Actions) -> Iterator[Event]:
events = [
Event(
time=0, balance=0, speed=BASE_EARNING_SPEED, workers=1, to_collect=COLLECTION_THRESHOLD,
upgrade_times=(-float('inf'), -float('inf')),
)
]
for action in actions:
next_cost = ACTION_COSTS[action]
available_at = events[-1].upgrade_times[action] + COOLDOWN_TIME
# Wait for the collections necessary to afford the next upgrade action, and
# potentially wait for any number of whole collection periods during the timeout.
while True:
next_collection_time = events[-1].to_collect/events[-1].speed + events[-1].time
if (
events[-1].balance >= next_cost and # If we have enough currency
next_collection_time > available_at # If waiting would exceed the timeout-availability deadline
):
break
events.append(Event(
time=next_collection_time,
balance=events[-1].balance + events[-1].workers * COLLECTION_THRESHOLD,
speed=events[-1].speed,
workers=events[-1].workers,
upgrade_times=events[-1].upgrade_times,
to_collect=COLLECTION_THRESHOLD,
))
# We might need to wait for a partial collection period so that the timeout expires.
partial_time = available_at - events[-1].time
if partial_time > 0:
to_collect = COLLECTION_THRESHOLD - partial_time * events[-1].speed
events.append(Event(
time=available_at,
balance=events[-1].balance,
speed=events[-1].speed,
workers=events[-1].workers,
upgrade_times=events[-1].upgrade_times,
to_collect=to_collect,
))
else:
to_collect = COLLECTION_THRESHOLD
if action == ADD_WORKER:
new_workers = events[-1].workers + 1
new_speed = events[-1].speed
elif action == DO_UPGRADE:
new_workers = events[-1].workers
new_speed = events[-1].speed + 0.2*BASE_EARNING_SPEED
upgrade_times = list(events[-1].upgrade_times)
upgrade_times[action] = events[-1].time
events.append(Event(
time=events[-1].time,
balance=events[-1].balance - next_cost,
speed=new_speed,
workers=new_workers,
upgrade_times=upgrade_times,
to_collect=to_collect,
))
return events
def main() -> None:
print(f'{count_perms():.2e} possibilities before branch pruning')
all_upgrades = [DO_UPGRADE]*MAX_UPGRADES
all_workers = [ADD_WORKER]*MAX_WORKERS
left = all_upgrades + all_workers
right = all_workers + all_upgrades
all_actions = [left, right]
for _ in range(500):
shuffled = list(left)
random.shuffle(shuffled)
all_actions.append(shuffled)
all_events = (
((events := fill_events(actions))[-1].time, events, actions)
for actions in all_actions
)
best_time, best_events, best_actions = min(all_events)
print(f'Best time: {timedelta(seconds=best_time)}')
print('Events:')
pprint(best_events)
if __name__ == '__main__':
main()
Output
6.73e+23 possibilities before branch pruning
Best time: 0:27:04.546768
Events:
[Event(time=0, balance=0, speed=2.8, workers=1, to_collect=50, upgrade_times=(-inf, -inf)),
Event(time=17.857142857142858, balance=50, speed=2.8, workers=1, to_collect=50, upgrade_times=(-inf, -inf)),
Event(time=35.714285714285715, balance=100, speed=2.8, workers=1, to_collect=50, upgrade_times=(-inf, -inf)),
Event(time=53.57142857142857, balance=150, speed=2.8, workers=1, to_collect=50, upgrade_times=(-inf, -inf)),
Event(time=71.42857142857143, balance=200, speed=2.8, workers=1, to_collect=50, upgrade_times=(-inf, -inf)),
Event(time=89.28571428571429, balance=250, speed=2.8, workers=1, to_collect=50, upgrade_times=(-inf, -inf)),
Event(time=107.14285714285715, balance=300, speed=2.8, workers=1, to_collect=50, upgrade_times=(-inf, -inf)),
Event(time=125.00000000000001, balance=350, speed=2.8, workers=1, to_collect=50, upgrade_times=(-inf, -inf)),
Event(time=125.00000000000001, balance=0, speed=3.36, workers=1, to_collect=50, upgrade_times=[125.00000000000001, -inf]),
Event(time=139.8809523809524, balance=50, speed=3.36, workers=1, to_collect=50, upgrade_times=[125.00000000000001, -inf]),
Event(time=154.7619047619048, balance=100, speed=3.36, workers=1, to_collect=50, upgrade_times=[125.00000000000001, -inf]),
...
Event(time=1624.162152623287, balance=130650, speed=36.399999999999984, workers=30, to_collect=50, upgrade_times=[1614.5467680079023, 1614.5467680079023]),
Event(time=1624.5467680079023, balance=130650, speed=36.399999999999984, workers=30, to_collect=36.0000000000051, upgrade_times=[1614.5467680079023, 1614.5467680079023]),
Event(time=1624.5467680079023, balance=128950, speed=36.399999999999984, workers=31, to_collect=36.0000000000051, upgrade_times=[1614.5467680079023, 1624.5467680079023])]
-
1\$\begingroup\$ This is very helpful! I'm going to try my hand at a full implementation and report back \$\endgroup\$Olivier Poulin– Olivier Poulin2022年07月02日 16:45:26 +00:00Commented Jul 2, 2022 at 16:45
[...]
) and (2) Don't ask specific questions. It doesn't match the format of the site. You're welcome to mention concerns like "this is too slow" as long as the code works \$\endgroup\$