Simple 2048 AI in Python

Question 1

A few weeks ago, I wrote a Python implementation of 2048. Since then, I've been working on a simple AI to play the game for me. The code uses expectimax search to evaluate each move, and chooses the move that maximizes the search as the next move to execute.

Similar to what others have suggested, the evaluation function examines monotonicity along a "snake path"; I also penalize boards that don't keep the largest tile at the snake path's terminal.

The AI can get to 2048 (and even 4096 occasionally), but not further. One of the things I've had difficulty with is tuning the evaluation function, and looking at what to try next. I have taken a look at awarding bonuses for zero tiles (for example, z*log(s) where z is the number of zeros and s is the total sum of the tiles on the board; zeros become more valuable deeper into the game) but haven't had much luck improving things. Rather, I feel like I'm guessing at what to do next to improve the heuristic (or I'll look at a board that defied my human strategy, and try and write something to mirror this), and am curious what someone else might suggest.

The code for the game and the AI can be found here. You can run aiplay with a Game instance.

Below is the relevant AI code:

from game import *
import math
def aimove(b):
 """
 Returns a list of possible moves ("left", "right", "up", "down")
 and each corresponding fitness
 """
 def fitness(b):
 """
 Returns the heuristic value of b
 Snake refers to the "snake line pattern" (http://tinyurl.com/l9bstk6)
 Here we only evaluate one direction; we award more points if high valued tiles
 occur along this path. We penalize the board for not having
 the highest valued tile in the lower left corner
 """
 if Game.over(b):
 return -float("inf")
 snake = []
 for i, col in enumerate(zip(*b)):
 snake.extend(reversed(col) if i % 2 == 0 else col)
 m = max(snake)
 return sum(x/10**n for n, x in enumerate(snake)) - \
 math.pow((b[3][0] != m)*abs(b[3][0] - m), 2)
 def search(b, d, move=False):
 """
 Performs expectimax search on a given configuration to
 specified depth (d).
 Algorithm details:
 - if the AI needs to move, make each child move,
 recurse, return the maximum fitness value
 - if it is not the AI's turn, form all
 possible child spawns, and return their weighted average 
 as that node's evaluation
 """
 if d == 0 or (move and Game.over(b)):
 return fitness(b)
 alpha = fitness(b)
 if move:
 for _, child in Game.actions(b):
 return max(alpha, search(child, d-1))
 else:
 alpha = 0
 zeros = [(i,j) for i,j in itertools.product(range(4), range(4)) if b[i][j] == 0]
 for i, j in zeros:
 c1 = [[x for x in row] for row in b]
 c2 = [[x for x in row] for row in b]
 c1[i][j] = 2
 c2[i][j] = 4
 alpha += .9*search(c1, d-1, True)/len(zeros) + \
 .1*search(c2, d-1, True)/len(zeros)
 return alpha
 return [(action, search(child, 5)) for action ,child in Game.actions(b)]
def aiplay(game):
 """
 Runs a game instance playing the move determined
 by aimove.
 """
 b = game.b
 while True:
 print(Game.string(b) + "\n")
 action = max(aimove(b), key = lambda x: x[1])[0]
 if action == "left" : b = Game.left(b)
 if action == "right": b = Game.right(b)
 if action == "up" : b = Game.up(b)
 if action == "down" : b = Game.down(b)
 b = Game.spawn(b, 1)
 if Game.over(b):
 m = max(x for row in b for x in row)
 print("game over...best was %s" %m)
 print(Game.string(b))
 break

Question 2

How does one run the game with the AI together?

Question 3

@ArtOfCode: please see my update. Basically, just get a board b = Game().b; calling aiplay(b) will run the ai.

Question 4

@ArtOfCode: one improvement I should make: aiplay should really take a game and not that game's b. Thanks!

Question 5

Worry not, I got it to work.

Question 6

Getting evaluation functions right is always the hardest part of projects like this. Mine always seem to be totally ineffective, wildly inefficient, or both. I would troll some journals and see if anyone's published anything on game evaluation functions for 2048. A Google search turned up a few resources. If you haven't seen stackoverflow.com/q/22342854/3376926, that might be worth a look too.

Question 7

Don't use wildcard imports

PEP8 advises against this:

from game import *

Rewrite like this:

import math
import itertools
from game import Game

Use `elif` for mutually exclusive `if`s

These if statements cannot all be true:

if action == "left":
 b = Game.left(b)
if action == "right":
 b = Game.right(b)
if action == "up":
 b = Game.up(b)
if action == "down":
 b = Game.down(b)

But they will all be evaluated, even if action is "left", it will be checked against "right", "up", "down", pointlessly.

if action == "left":
 b = Game.left(b)
elif action == "right":
 b = Game.right(b)
elif action == "up":
 b = Game.up(b)
elif action == "down":
 b = Game.down(b)

Avoid shadowing variables

In this code, the parameter b of aimove is going to be shadowed by the parameters in fitness, and the many other methods where you reuse this name:

def aimove(b):
 # ...
 def fitness(b):
 # ...
 def search(b, d, move=False):
 # ...

This can be confusing and can lead to errors.

Dealing with long lines

This is just a small tip. I don't like using \ for breaking new lines like this:

alpha += .9 * search(c1, d - 1, True) / len(zeros) + \
 .1 * search(c2, d - 1, True) / len(zeros)

I prefer to put the expression within parentheses, which is another way of breaking a long line, like this:

alpha += (.9 * search(c1, d - 1, True) / len(zeros) +
 .1 * search(c2, d - 1, True) / len(zeros))

Easier way to make deep copies

c1 = [[x for x in row] for row in b]
c2 = [[x for x in row] for row in b]

An easier way:

from copy import deepcopy
c1 = deepcopy(b)
c2 = deepcopy(b)

Other simplifications

This expressions can be written simpler:

(b2[3][0] != m) * abs(b2[3][0] - m)

When b2[3][0] is not equal to m, the expression becomes:

1 * abs(b2[3][0] - m)
abs(b2[3][0] - m)

When b2[3][0] is equal to m, the expression becomes:

0 * abs(b2[3][0] - m)
0 * abs(m - m)
0 * abs(0)
0 * 0

Since the multiplication with (b2[3][0] != m) doesn't change the outcome, you can simply write:

abs(b2[3][0] - m)

`Game` shouldn't be a class, in its present form

The posted code doesn't really use a Game instance. It just takes the initial board out of the Game instance it receives, but that's all the data a Game instance contains, and it's never used again. All the calls involving Game are static Game.some_function calls. This is not a normal practice, and confusing.

As such, it would be better to move all the functions of Game to package scope, do import game, and call the methods like game.actions, game.over, and so on.

You point out in a comment that you like to use a class for logical grouping purposes. But classes are not the only thing that can do that, packages can do that too, in fact it's a very good way of using packages.

However, the option is probably to keep the Game class, but make its methods non-static, and work with instances, instead of calling static methods. Instead of manipulating a board directly, let the Game class contain the board, and hide within the details of all the operations on the board. It will be good encapsulation. For example, the static call Game.over(b) should be replaced with the instance method call game.over(), with no more board parameter.

Question 8

Thanks for the tip about the if/elif and the penalty assessment in the evaluation function. Is putting my functions in a class really that egregious, though? I like the logical grouping classes provide, and the fact that Game constructs an initial configuration.

Question 9

I added more details to my last point, with an alternative that's better, and you might like it more too.

Question 10

Thanks for the update. For state space search, I like to avoid mutability, and this is the reason my functions for really manipulating the board/advancing the game are static. What are your thoughts on this? Would you suggest that I make the move functions return a new instance? Thank you for your time -- your design advice is very good! Do you have any thoughts on improving/evaluating the evaluation function? In general, I'm kind of at a loss for what to try next, or even how to confirm that I'm going in the right direction. Some advice on this thought process would be much appreciated.

Question 11

In the current program you are carrying around a board that gets repatedly reassigned to the next state. This isn't really different from carrying around a game instance that mututes the board inside of it. But this latter setup will be more natural, and good step in the direction of proper encapsulation and information hiding. I didn't want to focus too much on Game though, as it was not part of the code posted in the question.

Question 12

As for your algorithm... I didn't have time to look too close. I thought you implemented expectimax as you mentioned in the description, and didn't really think of verifying (you are supposed to post fully working code). Is that what you're looking for? Verifying if your expectimax implementation is correct?

Question 13

Your code is actually pretty good! I do a few suggestions for general improvement, specifically regarding PEP8 and such.

Running your code through a PEP8 checker, you have about ~20 PEP8 errors, most of them being whitespace errors, such as missing two blank lines between functions, or the following line.

if action == "up" : b = Game.up(b)

Another point I want to mention would be variable naming. While, in this scenario, not having looked at the other background code, I may be wrong. Some of your variables have short, unclear names, like c1, b, or d.

For now, that's all I can really suggest. If there's anything else that you want me to cover, just mention it in the comments, I'll see if I can cover it. I hope this helps!

Question 14

Thanks for the review, Ethan. Didn't know about the PEP8 checker (what a great resource!) I was more hoping people could suggest improvements to the AI evaluation function (i.e. improve how well the AI plays the game). I feel like I'm going in circles, and any algorithm improvements along with stylistic/structural code improvements would be much appreciated. Maybe CR stackexchange isn't well suited for this first point, but I did want to post the code for review. Thanks again!

Question 15

One more thing. You're right -- using b is bad -- I should just use board. c1 refers to "child one" as there are two spawn possibilities (we either get a two or a four). d refers to the depth of the search. Maybe depth is a better name; I think it's relatively clear if you're familiar with recursive depth limited search.

Question 16

@rookie I'm not too familiar with recursive depth limited search, but I'll see what I can review. In the meantime, it's time for me to find a pillow and a bed. :)

janos janos 113k15 gold badges154 silver badges396 bronze badges · Accepted Answer · 2015-06-07 06:53:48Z

Don't use wildcard imports

PEP8 advises against this:

from game import *

Rewrite like this:

import math
import itertools
from game import Game

Use `elif` for mutually exclusive `if`s

These if statements cannot all be true:

if action == "left":
 b = Game.left(b)
if action == "right":
 b = Game.right(b)
if action == "up":
 b = Game.up(b)
if action == "down":
 b = Game.down(b)

But they will all be evaluated, even if action is "left", it will be checked against "right", "up", "down", pointlessly.

if action == "left":
 b = Game.left(b)
elif action == "right":
 b = Game.right(b)
elif action == "up":
 b = Game.up(b)
elif action == "down":
 b = Game.down(b)

Avoid shadowing variables

In this code, the parameter b of aimove is going to be shadowed by the parameters in fitness, and the many other methods where you reuse this name:

def aimove(b):
 # ...
 def fitness(b):
 # ...
 def search(b, d, move=False):
 # ...

This can be confusing and can lead to errors.

Dealing with long lines

This is just a small tip. I don't like using \ for breaking new lines like this:

alpha += .9 * search(c1, d - 1, True) / len(zeros) + \
 .1 * search(c2, d - 1, True) / len(zeros)

I prefer to put the expression within parentheses, which is another way of breaking a long line, like this:

alpha += (.9 * search(c1, d - 1, True) / len(zeros) +
 .1 * search(c2, d - 1, True) / len(zeros))

Easier way to make deep copies

c1 = [[x for x in row] for row in b]
c2 = [[x for x in row] for row in b]

An easier way:

from copy import deepcopy
c1 = deepcopy(b)
c2 = deepcopy(b)

Other simplifications

This expressions can be written simpler:

(b2[3][0] != m) * abs(b2[3][0] - m)

When b2[3][0] is not equal to m, the expression becomes:

1 * abs(b2[3][0] - m)
abs(b2[3][0] - m)

When b2[3][0] is equal to m, the expression becomes:

0 * abs(b2[3][0] - m)
0 * abs(m - m)
0 * abs(0)
0 * 0

Since the multiplication with (b2[3][0] != m) doesn't change the outcome, you can simply write:

abs(b2[3][0] - m)

`Game` shouldn't be a class, in its present form

The posted code doesn't really use a Game instance. It just takes the initial board out of the Game instance it receives, but that's all the data a Game instance contains, and it's never used again. All the calls involving Game are static Game.some_function calls. This is not a normal practice, and confusing.

As such, it would be better to move all the functions of Game to package scope, do import game, and call the methods like game.actions, game.over, and so on.

You point out in a comment that you like to use a class for logical grouping purposes. But classes are not the only thing that can do that, packages can do that too, in fact it's a very good way of using packages.

However, the option is probably to keep the Game class, but make its methods non-static, and work with instances, instead of calling static methods. Instead of manipulating a board directly, let the Game class contain the board, and hide within the details of all the operations on the board. It will be good encapsulation. For example, the static call Game.over(b) should be replaced with the instance method call game.over(), with no more board parameter.

Thanks for the tip about the if/elif and the penalty assessment in the evaluation function. Is putting my functions in a class really that egregious, though? I like the logical grouping classes provide, and the fact that Game constructs an initial configuration.
I added more details to my last point, with an alternative that's better, and you might like it more too.
Thanks for the update. For state space search, I like to avoid mutability, and this is the reason my functions for really manipulating the board/advancing the game are static. What are your thoughts on this? Would you suggest that I make the move functions return a new instance? Thank you for your time -- your design advice is very good! Do you have any thoughts on improving/evaluating the evaluation function? In general, I'm kind of at a loss for what to try next, or even how to confirm that I'm going in the right direction. Some advice on this thought process would be much appreciated.
In the current program you are carrying around a board that gets repatedly reassigned to the next state. This isn't really different from carrying around a game instance that mututes the board inside of it. But this latter setup will be more natural, and good step in the direction of proper encapsulation and information hiding. I didn't want to focus too much on Game though, as it was not part of the code posted in the question.
As for your algorithm... I didn't have time to look too close. I thought you implemented expectimax as you mentioned in the description, and didn't really think of verifying (you are supposed to post fully working code). Is that what you're looking for? Verifying if your expectimax implementation is correct?

Stack Exchange Network

Simple 2048 AI in Python

2 Answers 2

Don't use wildcard imports

Use `elif` for mutually exclusive `if`s

Avoid shadowing variables

Dealing with long lines

Easier way to make deep copies

Other simplifications

`Game` shouldn't be a class, in its present form

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Simple 2048 AI in Python

2 Answers 2

Don't use wildcard imports

Use elif for mutually exclusive ifs

Avoid shadowing variables

Dealing with long lines

Easier way to make deep copies

Other simplifications

Game shouldn't be a class, in its present form

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions

Use `elif` for mutually exclusive `if`s

`Game` shouldn't be a class, in its present form