A few weeks ago, I wrote a Python implementation of 2048. Since then, I've been working on a simple AI to play the game for me. The code uses expectimax search to evaluate each move, and chooses the move that maximizes the search as the next move to execute.
Similar to what others have suggested, the evaluation function examines monotonicity along a "snake path"; I also penalize boards that don't keep the largest tile at the snake path's terminal.
The AI can get to 2048 (and even 4096 occasionally), but not further. One of the things I've had difficulty with is tuning the evaluation function, and looking at what to try next. I have taken a look at awarding bonuses for zero tiles (for example, z*log(s)
where z
is the number of zeros and s
is the total sum of the tiles on the board; zeros become more valuable deeper into the game) but haven't had much luck improving things. Rather, I feel like I'm guessing at what to do next to improve the heuristic (or I'll look at a board that defied my human strategy, and try and write something to mirror this), and am curious what someone else might suggest.
The code for the game and the AI can be found here. You can run aiplay
with a Game
instance.
Below is the relevant AI code:
from game import *
import math
def aimove(b):
"""
Returns a list of possible moves ("left", "right", "up", "down")
and each corresponding fitness
"""
def fitness(b):
"""
Returns the heuristic value of b
Snake refers to the "snake line pattern" (http://tinyurl.com/l9bstk6)
Here we only evaluate one direction; we award more points if high valued tiles
occur along this path. We penalize the board for not having
the highest valued tile in the lower left corner
"""
if Game.over(b):
return -float("inf")
snake = []
for i, col in enumerate(zip(*b)):
snake.extend(reversed(col) if i % 2 == 0 else col)
m = max(snake)
return sum(x/10**n for n, x in enumerate(snake)) - \
math.pow((b[3][0] != m)*abs(b[3][0] - m), 2)
def search(b, d, move=False):
"""
Performs expectimax search on a given configuration to
specified depth (d).
Algorithm details:
- if the AI needs to move, make each child move,
recurse, return the maximum fitness value
- if it is not the AI's turn, form all
possible child spawns, and return their weighted average
as that node's evaluation
"""
if d == 0 or (move and Game.over(b)):
return fitness(b)
alpha = fitness(b)
if move:
for _, child in Game.actions(b):
return max(alpha, search(child, d-1))
else:
alpha = 0
zeros = [(i,j) for i,j in itertools.product(range(4), range(4)) if b[i][j] == 0]
for i, j in zeros:
c1 = [[x for x in row] for row in b]
c2 = [[x for x in row] for row in b]
c1[i][j] = 2
c2[i][j] = 4
alpha += .9*search(c1, d-1, True)/len(zeros) + \
.1*search(c2, d-1, True)/len(zeros)
return alpha
return [(action, search(child, 5)) for action ,child in Game.actions(b)]
def aiplay(game):
"""
Runs a game instance playing the move determined
by aimove.
"""
b = game.b
while True:
print(Game.string(b) + "\n")
action = max(aimove(b), key = lambda x: x[1])[0]
if action == "left" : b = Game.left(b)
if action == "right": b = Game.right(b)
if action == "up" : b = Game.up(b)
if action == "down" : b = Game.down(b)
b = Game.spawn(b, 1)
if Game.over(b):
m = max(x for row in b for x in row)
print("game over...best was %s" %m)
print(Game.string(b))
break
2 Answers 2
Don't use wildcard imports
PEP8 advises against this:
from game import *
Rewrite like this:
import math
import itertools
from game import Game
Use elif
for mutually exclusive if
s
These if
statements cannot all be true:
if action == "left": b = Game.left(b) if action == "right": b = Game.right(b) if action == "up": b = Game.up(b) if action == "down": b = Game.down(b)
But they will all be evaluated, even if action is "left", it will be checked against "right", "up", "down", pointlessly.
if action == "left":
b = Game.left(b)
elif action == "right":
b = Game.right(b)
elif action == "up":
b = Game.up(b)
elif action == "down":
b = Game.down(b)
Avoid shadowing variables
In this code, the parameter b
of aimove
is going to be shadowed by the parameters in fitness
, and the many other methods where you reuse this name:
def aimove(b): # ... def fitness(b): # ... def search(b, d, move=False): # ...
This can be confusing and can lead to errors.
Dealing with long lines
This is just a small tip.
I don't like using \
for breaking new lines like this:
alpha += .9 * search(c1, d - 1, True) / len(zeros) + \ .1 * search(c2, d - 1, True) / len(zeros)
I prefer to put the expression within parentheses, which is another way of breaking a long line, like this:
alpha += (.9 * search(c1, d - 1, True) / len(zeros) +
.1 * search(c2, d - 1, True) / len(zeros))
Easier way to make deep copies
c1 = [[x for x in row] for row in b] c2 = [[x for x in row] for row in b]
An easier way:
from copy import deepcopy
c1 = deepcopy(b)
c2 = deepcopy(b)
Other simplifications
This expressions can be written simpler:
(b2[3][0] != m) * abs(b2[3][0] - m)
When b2[3][0]
is not equal to m
,
the expression becomes:
1 * abs(b2[3][0] - m) abs(b2[3][0] - m)
When b2[3][0]
is equal to m
,
the expression becomes:
0 * abs(b2[3][0] - m) 0 * abs(m - m) 0 * abs(0) 0 * 0
Since the multiplication with (b2[3][0] != m)
doesn't change the outcome,
you can simply write:
abs(b2[3][0] - m)
Game
shouldn't be a class, in its present form
The posted code doesn't really use a Game
instance.
It just takes the initial board out of the Game
instance it receives,
but that's all the data a Game
instance contains,
and it's never used again.
All the calls involving Game
are static Game.some_function
calls.
This is not a normal practice, and confusing.
As such, it would be better to move all the functions of Game
to package scope,
do import game
,
and call the methods like game.actions
, game.over
, and so on.
You point out in a comment that you like to use a class for logical grouping purposes. But classes are not the only thing that can do that, packages can do that too, in fact it's a very good way of using packages.
However, the option is probably to keep the Game
class,
but make its methods non-static,
and work with instances, instead of calling static methods.
Instead of manipulating a board directly,
let the Game
class contain the board,
and hide within the details of all the operations on the board.
It will be good encapsulation.
For example,
the static call Game.over(b)
should be replaced with the instance method call game.over()
, with no more board parameter.
-
\$\begingroup\$ Thanks for the tip about the
if
/elif
and the penalty assessment in the evaluation function. Is putting my functions in a class really that egregious, though? I like the logical grouping classes provide, and the fact thatGame
constructs an initial configuration. \$\endgroup\$rookie– rookie2015年06月08日 13:06:48 +00:00Commented Jun 8, 2015 at 13:06 -
\$\begingroup\$ I added more details to my last point, with an alternative that's better, and you might like it more too. \$\endgroup\$janos– janos2015年06月08日 13:53:38 +00:00Commented Jun 8, 2015 at 13:53
-
\$\begingroup\$ Thanks for the update. For state space search, I like to avoid mutability, and this is the reason my functions for really manipulating the board/advancing the game are static. What are your thoughts on this? Would you suggest that I make the move functions return a new instance? Thank you for your time -- your design advice is very good! Do you have any thoughts on improving/evaluating the evaluation function? In general, I'm kind of at a loss for what to try next, or even how to confirm that I'm going in the right direction. Some advice on this thought process would be much appreciated. \$\endgroup\$rookie– rookie2015年06月08日 14:02:07 +00:00Commented Jun 8, 2015 at 14:02
-
\$\begingroup\$ In the current program you are carrying around a board that gets repatedly reassigned to the next state. This isn't really different from carrying around a
game
instance that mututes the board inside of it. But this latter setup will be more natural, and good step in the direction of proper encapsulation and information hiding. I didn't want to focus too much onGame
though, as it was not part of the code posted in the question. \$\endgroup\$janos– janos2015年06月08日 14:18:06 +00:00Commented Jun 8, 2015 at 14:18 -
\$\begingroup\$ As for your algorithm... I didn't have time to look too close. I thought you implemented expectimax as you mentioned in the description, and didn't really think of verifying (you are supposed to post fully working code). Is that what you're looking for? Verifying if your expectimax implementation is correct? \$\endgroup\$janos– janos2015年06月08日 14:19:34 +00:00Commented Jun 8, 2015 at 14:19
Your code is actually pretty good! I do a few suggestions for general improvement, specifically regarding PEP8 and such.
- Running your code through a PEP8 checker, you have about
~20
PEP8 errors, most of them being whitespace errors, such as missing two blank lines between functions, or the following line.
if action == "up" : b = Game.up(b)
- Another point I want to mention would be variable naming. While, in this scenario, not having looked at the other background code, I may be wrong. Some of your variables have short, unclear names, like
c1
,b
, ord
.
For now, that's all I can really suggest. If there's anything else that you want me to cover, just mention it in the comments, I'll see if I can cover it. I hope this helps!
-
1\$\begingroup\$ Thanks for the review, Ethan. Didn't know about the PEP8 checker (what a great resource!) I was more hoping people could suggest improvements to the AI evaluation function (i.e. improve how well the AI plays the game). I feel like I'm going in circles, and any algorithm improvements along with stylistic/structural code improvements would be much appreciated. Maybe CR stackexchange isn't well suited for this first point, but I did want to post the code for review. Thanks again! \$\endgroup\$rookie– rookie2015年06月06日 03:40:43 +00:00Commented Jun 6, 2015 at 3:40
-
1\$\begingroup\$ One more thing. You're right -- using
b
is bad -- I should just useboard
.c1
refers to "child one" as there are two spawn possibilities (we either get a two or a four).d
refers to the depth of the search. Maybedepth
is a better name; I think it's relatively clear if you're familiar with recursive depth limited search. \$\endgroup\$rookie– rookie2015年06月06日 03:46:14 +00:00Commented Jun 6, 2015 at 3:46 -
\$\begingroup\$ @rookie I'm not too familiar with recursive depth limited search, but I'll see what I can review. In the meantime, it's time for me to find a pillow and a bed. :) \$\endgroup\$Ethan Bierlein– Ethan Bierlein2015年06月06日 04:14:46 +00:00Commented Jun 6, 2015 at 4:14
Explore related questions
See similar questions with these tags.
b = Game().b
; callingaiplay(b)
will run the ai. \$\endgroup\$aiplay
should really take a game and not that game'sb
. Thanks! \$\endgroup\$