I have written the code below, which creates a local LLM-based AI agent to solve the N-puzzle. I should stress that I wrote the code myself and did not use AI to write it.
The LLM is given a prompt telling it that it can call the following functions: move_up, move_down, move_left, move_right. At each turn, the LLM chooses from those functions, and the chosen moves are then made.
I used the qwen3:latest model from the Ollama library as the agent, and chose a simple N-puzzle as the problem for it to solve. Experiments were run on an ASUS Vivobook Pro 15 laptop with an NVIDIA GeForce RTX 4060 with 8 GB of VRAM.
Here is the code for the demo_1_agent.py file:
from pdb import set_trace
import os
import json
from copy import deepcopy
import requests
import math
from inspect import signature
import numpy as np
from pprint import pprint
from typing import Annotated, Sequence, TypedDict
from pydantic import BaseModel, Field
from ollama import chat
from ollama import ChatResponse
import pyautogui
pyautogui.PAUSE = 1.0
MOVE_UP_BUTTON_POS = (285, 559)
MOVE_DOWN_BUTTON_POS = (279, 718)
MOVE_LEFT_BUTTON_POS = (195, 646)
MOVE_RIGHT_BUTTON_POS = (367, 647)
class MoveList(BaseModel):
    moves: list[str]

def get_n_digit(num):
    if num > 0:
        digits = int(math.log10(num))+1
    elif num == 0:
        digits = 1
    else:
        digits = int(math.log10(-num))+2 # +1 if you don't count the '-'
    return digits

def parse_json_garbage(s):
    s = s[next(idx for idx, c in enumerate(s) if c in "{["):]
    try:
        return json.loads(s)
    except json.JSONDecodeError as e:
        return json.loads(s[:e.pos])
class GameState:
    def __init__(self, start, goal):
        self.start = start
        self.goal = goal
        self.size = start.shape[0]
        self.state = deepcopy(start)

    def get_state(self):
        return self.state

    def finished(self):
        return (self.state==self.goal).all()

    def print_state(self, no_print=False):
        max_elem = np.max(self.state)
        n_digit = get_n_digit(max_elem)
        state_text = ""
        for row_idx in range(self.size):
            for col_idx in range(self.size):
                if int(self.state[row_idx, col_idx]) != 0:
                    text = '{num:0{width}} '.format(num=self.state[row_idx, col_idx], width=n_digit)
                else:
                    text = "_" * (n_digit) + " "
                state_text += text
            state_text += "\n"
        if no_print is False:
            print(state_text)
        return state_text

    def create_diff_view(self):
        """Show which tiles are out of place"""
        diff_state = ""
        for i in range(self.size):
            for j in range(self.size):
                current = self.state[i, j]
                target = self.goal[i, j]
                if current == target:
                    diff_state += f"✓{current} "
                else:
                    diff_state += f"✗{current} "
            diff_state += "\n"
        return diff_state
    def move_up(self):
        itemindex = np.where(self.state == 0)
        pos_row = int(itemindex[0][0])
        pos_col = int(itemindex[1][0])
        if (pos_row == 0):
            return
        temp = self.state[pos_row, pos_col]
        self.state[pos_row, pos_col] = self.state[pos_row-1, pos_col]
        self.state[pos_row-1, pos_col] = temp

    def move_down(self):
        itemindex = np.where(self.state == 0)
        pos_row = int(itemindex[0][0])
        pos_col = int(itemindex[1][0])
        if (pos_row == (self.size-1)):
            return
        temp = self.state[pos_row, pos_col]
        self.state[pos_row, pos_col] = self.state[pos_row+1, pos_col]
        self.state[pos_row+1, pos_col] = temp

    def move_left(self):
        itemindex = np.where(self.state == 0)
        pos_row = int(itemindex[0][0])
        pos_col = int(itemindex[1][0])
        if (pos_col == 0):
            return
        temp = self.state[pos_row, pos_col]
        self.state[pos_row, pos_col] = self.state[pos_row, pos_col-1]
        self.state[pos_row, pos_col-1] = temp

    def move_right(self):
        itemindex = np.where(self.state == 0)
        pos_row = int(itemindex[0][0])
        pos_col = int(itemindex[1][0])
        if (pos_col == (self.size-1)):
            return
        temp = self.state[pos_row, pos_col]
        self.state[pos_row, pos_col] = self.state[pos_row, pos_col+1]
        self.state[pos_row, pos_col+1] = temp
# 8-puzzle
start = np.array([
[0, 1, 3],
[4, 2, 5],
[7, 8, 6],
])
goal = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 0],
])
# 15-puzzle
# start = np.array([
# [ 6, 13, 7, 10],
# [ 8, 9, 11, 0],
# [15, 2, 12, 5],
# [14, 3, 1, 4],
# ])
# goal = np.array([
# [ 1, 2, 3, 4],
# [ 5, 6, 7, 8],
# [ 9, 10, 11, 12],
# [13, 14, 15, 0],
# ])
game_state = GameState(start, goal)
def move_up():
    """Move the '_' tile up by one block, swapping the tile with the number above. Returns the text describing the new game state after moving up."""
    game_state.move_up()
    pyautogui.moveTo(MOVE_UP_BUTTON_POS[0], MOVE_UP_BUTTON_POS[1])
    pyautogui.click()
    return game_state.print_state(no_print=True)

def move_down():
    """Move the '_' tile down by one block, swapping the tile with the number below. Returns the text describing the new game state after moving down."""
    game_state.move_down()
    pyautogui.moveTo(MOVE_DOWN_BUTTON_POS[0], MOVE_DOWN_BUTTON_POS[1])
    pyautogui.click()
    return game_state.print_state(no_print=True)

def move_left():
    """Move the '_' tile left by one block, swapping the tile with the number to the left. Returns the text describing the new game state after moving left."""
    game_state.move_left()
    pyautogui.moveTo(MOVE_LEFT_BUTTON_POS[0], MOVE_LEFT_BUTTON_POS[1])
    pyautogui.click()
    return game_state.print_state(no_print=True)

def move_right():
    """Move the '_' tile right by one block, swapping the tile with the number to the right. Returns the text describing the new game state after moving right."""
    game_state.move_right()
    pyautogui.moveTo(MOVE_RIGHT_BUTTON_POS[0], MOVE_RIGHT_BUTTON_POS[1])
    pyautogui.click()
    return game_state.print_state(no_print=True)
def main():
    # game_state.print_state()
    max_elem = np.max(goal)
    n_digit = get_n_digit(max_elem)
    size = goal.shape[0]
    goal_text = ""
    tool_list = [move_up, move_down, move_left, move_right]
    for row_idx in range(size):
        for col_idx in range(size):
            if int(goal[row_idx, col_idx]) != 0:
                text = '{num:0{width}} '.format(num=goal[row_idx, col_idx], width=n_digit)
            else:
                text = "_" * (n_digit) + " "
            goal_text += text
        goal_text += "\n"
# state_text = game_state.print_state()
# SYSTEM_PROMPT = f"""You are playing the N-puzzle game.
# The goal is as follows:
# {goal_text}
# This is the current state:
# {state_text}
# The difference between the goal and the current state is as follows (✓ is correct, ✗ is incorrect):
# {game_state.create_diff_view()}
# You need to find moves to go from the current state to the goal, such that all positions in current state are the same as the goal. At each turn, you can either move up, move down, move left, or move right.
# When you move the tile, the position of the tile will be swapped with the number at the place where you move to.
# You have access to the following tools:
# """
# for tool in tool_list:
# description = f"def {tool.__name__}({str(signature(tool))}).\n{tool.__doc__}\n"
# SYSTEM_PROMPT += description
# SYSTEM_PROMPT += """The way you use the tools is by specifying a json blob.
# Specifically, this json should have an `action` key (with the name of the tool to use) and an `action_input` key (with the input to the tool going here).
# example use :
# {{
# "action": "get_weather",
# "action_input": {"location": "New York"}
# }}
# ALWAYS use the following format:
# Question: the input question you must answer (list the goal first, then the current state)
# Thought: you should always think about one action to take. Only ONE action at a time in this format:
# Action:
# $JSON_BLOB (inside markdown cell)
# Observation: the result of the action. This Observation is unique, complete, and the source of truth.
# (this Thought/Action/Observation can repeat at most 3 times, you should take several steps when needed. The $JSON_BLOB must be formatted as markdown and only use a SINGLE action at a time.)
# You must always end your output with the following format:
# Thought: I now know the final answer
# Final Answer: The json blob detailing the FIRST move of the chosen list of moves based on your observations.
# Now begin! Reminder to ALWAYS use the exact characters `Final Answer:` when you provide a definitive answer.
# """
# set_trace()
    messages = []
    # messages.append({
    # 'role': 'system',
    # 'content': SYSTEM_PROMPT,
    # },)
    move_idx = 1
    n_json_begin = 0
    n_json_end = 0
    new_json_begin = False
    print(f"{move_idx}) CURRENT STATE:")
    state_text = game_state.print_state()
    is_finished = False
    while True:
        # move_idx += 1
        the_message = f"""You are an N-puzzle solver. Your job is to output ONLY move commands.
You need to find moves to go from the current state to the goal, such that all positions in current state are the same as the goal. At each turn, you can either move up, move down, move left, or move right.
When you move the tile, the position of the tile will be swapped with the number at the place where you move to.
CURRENT STATE:
{state_text}
GOAL STATE:
{goal_text}
The difference between the goal and the current state is as follows (✓ is correct, ✗ is incorrect):
{game_state.create_diff_view()}
RULES:
- You can only use: move_up, move_down, move_left, move_right
- Provide reasoning for your answer. BUT at the end of your answer, write "FINAL ANSWER", then output ONLY the move commands, separated by commas
- Maximum 5 moves per response
EXAMPLE_OUTPUT (the "FINAL ANSWER" section):
move_left, move_right, move_up, move_down
YOUR RESPONSE (moves only):
"""
        # print(the_message)
        # messages = []
        messages.append({
            'role': 'user',
            'content': the_message,
        },)
        response: ChatResponse = chat(model='qwen3:latest', messages=messages,
            stream=True,
            # format=MoveList.model_json_schema(),
        )
        response_text_full = ""
        for chunk in response:
            # Print model content
            print(chunk.message.content, end='', flush=True)
            response_text_full += chunk.message.content
        # response_text_full = response['message']['content']
        # print(response['message']['content'])
        # response_text_full = response_text_full[response_text_full.find("Final Answer"):]
        if "FINAL ANSWER" in response_text_full:
            response_text_full = response_text_full[response_text_full.find("FINAL ANSWER"):]
            response_text_full = response_text_full.replace("FINAL ANSWER", "")
            response_text_full = response_text_full.replace("\n", "")
            response_text_full = response_text_full.replace(":", "")
        elif "Final Answer" in response_text_full:
            response_text_full = response_text_full[response_text_full.find("Final Answer"):]
            response_text_full = response_text_full.replace("Final Answer", "")
            response_text_full = response_text_full.replace("\n", "")
            response_text_full = response_text_full.replace(":", "")
        else:
            response_text_full = response_text_full[response_text_full.find("</think>"):]
            response_text_full = response_text_full.replace("</think>", "")
            response_text_full = response_text_full.replace("\n", "")
            response_text_full = response_text_full.replace("```", "")
        move_list = response_text_full.split(",")
        move_list = [elem.rstrip().lstrip() for elem in move_list]
        pprint(response_text_full)
        for move_name_idx, move_name in enumerate(move_list):
            move_name = move_name.replace("****", "")
            print(f"Chosen move: {move_name}")
            move_name = move_name.replace("(", "")
            move_name = move_name.replace(")", "")
            try:
                func_result = globals()[move_name]()
            except:
                try:
                    if "move_right" in move_name:
                        func_result = globals()[move_right]()
                    elif "move_left" in move_name:
                        func_result = globals()[move_left]()
                    elif "move_down" in move_name:
                        func_result = globals()[move_down]()
                    elif "move_up" in move_name:
                        func_result = globals()[move_up]()
                except:
                    try:
                        if ("Right" in move_name) or ("right" in move_name):
                            func_result = globals()[move_right]()
                        elif ("Left" in move_name) or ("left" in move_name):
                            func_result = globals()[move_left]()
                        elif ("Down" in move_name) or ("down" in move_name):
                            func_result = globals()[move_down]()
                        elif ("Up" in move_name) or ("up" in move_name):
                            func_result = globals()[move_up]()
                    except:
                        continue
            move_idx += 1
            print(f"{move_idx}) CURRENT STATE:")
            state_text = game_state.print_state()
            if game_state.finished():
                is_finished = True
                break
        if game_state.finished():
            print("FINISHED!")
            is_finished = True
            break
if __name__ == "__main__":
    main()
And here is the code for the demo_1_n_puzzle_gui.py file:
import tkinter as tk
from tkinter import messagebox
import numpy as np
import random
import sys
from pdb import set_trace
class NPuzzle:
    def __init__(self, size=3):
        self.size = size
        self.root = tk.Tk()
        self.root.title("N-Puzzle Game")
        self.root.geometry("450x600")
        # Initialize the puzzle state as 2D numpy array
        puzzle_1d = list(range(1, size * size)) + [0] # 0 represents empty space
        # self.puzzle = np.array(puzzle_1d).reshape(size, size)
        self.puzzle = np.array([
            [0, 1, 3],
            [4, 2, 5],
            [7, 8, 6],
        ])
        # set_trace()
        # Create GUI elements
        self.create_widgets()
        # self.shuffle_puzzle()
        self.update_display()
    def create_widgets(self):
        # Title
        title_label = tk.Label(self.root, text="N-Puzzle Game", font=("Arial", 20, "bold"))
        title_label.pack(pady=10)
        # Game board frame
        self.board_frame = tk.Frame(self.root, bg="lightgray", padx=5, pady=5)
        self.board_frame.pack(pady=10)
        # Create buttons for puzzle pieces
        self.buttons = []
        for i in range(self.size):
            row = []
            for j in range(self.size):
                btn = tk.Button(
                    self.board_frame,
                    text="",
                    width=6,
                    height=3,
                    font=("Arial", 16, "bold"),
                    command=lambda r=i, c=j: self.move_piece(r, c)
                )
                btn.grid(row=i, column=j, padx=2, pady=2)
                row.append(btn)
            self.buttons.append(row)
        # Control buttons frame
        control_frame = tk.Frame(self.root)
        control_frame.pack(pady=20)
        # Movement buttons
        tk.Button(control_frame, text="↑", width=5, height=2, font=("Arial", 12, "bold"),
                  command=self.move_down).grid(row=0, column=1, padx=5, pady=5)
        tk.Button(control_frame, text="←", width=5, height=2, font=("Arial", 12, "bold"),
                  command=self.move_right).grid(row=1, column=0, padx=5, pady=5)
        tk.Button(control_frame, text="→", width=5, height=2, font=("Arial", 12, "bold"),
                  command=self.move_left).grid(row=1, column=2, padx=5, pady=5)
        tk.Button(control_frame, text="↓", width=5, height=2, font=("Arial", 12, "bold"),
                  command=self.move_up).grid(row=2, column=1, padx=5, pady=5)
        # Shuffle button
        shuffle_btn = tk.Button(self.root, text="Shuffle", width=10, height=2,
                                font=("Arial", 12, "bold"), bg="lightblue",
                                command=self.shuffle_puzzle)
        shuffle_btn.pack(pady=10)
        # Bind keyboard events
        self.root.bind('<Key>', self.on_key_press)
        self.root.focus_set()
    def get_empty_position(self):
        """Find the position of the empty space (0)"""
        empty_pos = np.where(self.puzzle == 0)
        return empty_pos[0][0], empty_pos[1][0]

    def move_piece(self, row, col):
        """Move a piece if it's adjacent to the empty space"""
        empty_row, empty_col = self.get_empty_position()
        # Check if the clicked piece is adjacent to empty space
        if (abs(row - empty_row) == 1 and col == empty_col) or \
           (abs(col - empty_col) == 1 and row == empty_row):
            # Swap the piece with empty space
            self.puzzle[row, col], self.puzzle[empty_row, empty_col] = \
                self.puzzle[empty_row, empty_col], self.puzzle[row, col]
            self.update_display()
            self.check_win()

    def move_up(self):
        """Move empty space up (piece below moves up)"""
        empty_row, empty_col = self.get_empty_position()
        if empty_row < self.size - 1:
            self.move_piece(empty_row + 1, empty_col)

    def move_down(self):
        """Move empty space down (piece above moves down)"""
        empty_row, empty_col = self.get_empty_position()
        if empty_row > 0:
            self.move_piece(empty_row - 1, empty_col)

    def move_left(self):
        """Move empty space left (piece to the right moves left)"""
        empty_row, empty_col = self.get_empty_position()
        if empty_col < self.size - 1:
            self.move_piece(empty_row, empty_col + 1)

    def move_right(self):
        """Move empty space right (piece to the left moves right)"""
        empty_row, empty_col = self.get_empty_position()
        if empty_col > 0:
            self.move_piece(empty_row, empty_col - 1)

    def on_key_press(self, event):
        """Handle keyboard input"""
        key = event.keysym.lower()
        if key == 'up':
            self.move_up()
        elif key == 'down':
            self.move_down()
        elif key == 'left':
            self.move_left()
        elif key == 'right':
            self.move_right()
    def shuffle_puzzle(self):
        """Shuffle the puzzle by making random valid moves"""
        for _ in range(1000): # Make 1000 random moves
            moves = []
            empty_row, empty_col = self.get_empty_position()
            # Add possible moves
            if empty_row > 0:
                moves.append(self.move_down)
            if empty_row < self.size - 1:
                moves.append(self.move_up)
            if empty_col > 0:
                moves.append(self.move_right)
            if empty_col < self.size - 1:
                moves.append(self.move_left)
            # Make a random move
            if moves:
                random.choice(moves)()
        self.update_display()

    def update_display(self):
        """Update the button display"""
        for i in range(self.size):
            for j in range(self.size):
                value = self.puzzle[i, j]
                if value == 0:
                    self.buttons[i][j].config(text="", bg="lightgray", state="disabled")
                else:
                    self.buttons[i][j].config(text=str(value), bg="white", state="normal")

    def check_win(self):
        """Check if the puzzle is solved"""
        target_1d = list(range(1, self.size * self.size)) + [0]
        target = np.array(target_1d).reshape(self.size, self.size)
        # print(target)
        if np.array_equal(self.puzzle, target):
            messagebox.showinfo("Congratulations!", "You solved the puzzle!")
            sys.exit()

    def run(self):
        """Start the game"""
        self.root.mainloop()
# Create and run the game
if __name__ == "__main__":
    game = NPuzzle()
    game.run()
Please provide some comments on my code. Any feedback and suggestions for improvement are welcome.
Github link: https://github.com/dangmanhtruong1995/N-puzzle-Agent/tree/main
1 Answer
demo_1_n_puzzle_gui.py
UX
When I run the code, I have to resize the GUI window to see the "Shuffle" button. Perhaps I am using a different version of Python, libraries, OS, etc.
The fancy Unicode arrows in the create_widgets function do not show up in my source code editor or in the GUI when I run the code. This is a portability issue.
Imports
You could run code development tools to automatically find some style issues with your code.
ruff identifies the following unused line:
from pdb import set_trace
It can be deleted. You can use either ruff or isort to sort the other import lines to follow convention:
import random
import sys
import tkinter as tk
from tkinter import messagebox
import numpy as np
Comments
Delete commented-out code to reduce clutter. For example, these lines can be deleted:
# self.puzzle = np.array(puzzle_1d).reshape(size, size)
# set_trace()
# self.shuffle_puzzle()
# print(target)
Unused code
This line is unused and can be deleted:
puzzle_1d = list(range(1, size * size)) + [0] # 0 represents empty space
Size
size is shown as a variable input to NPuzzle:
def __init__(self, size=3):
    self.size = size
But it only works for 3, because the puzzle array is hard-coded as a 3x3 board.
I suggest removing it as an input until you decide to support other sizes:
def __init__(self):
    self.size = 3
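If you do decide to support other sizes later, one possible approach is to derive the board from size instead of hard-coding it. The sketch below is only an illustration: `solved_board` is a name I made up, and it uses plain lists so that it runs without NumPy. In your code the same values could be reshaped with np.array(...).reshape(size, size), as your commented-out line already does.

```python
def solved_board(size):
    """Return the solved size x size board as nested lists; 0 marks the empty cell."""
    values = list(range(1, size * size)) + [0]
    # Slice the flat list into rows of length `size`.
    return [values[row * size:(row + 1) * size] for row in range(size)]


for row in solved_board(4):
    print(row)
```

The same helper could also replace the target construction in check_win, so the start board and the goal stay consistent for any size.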
Documentation
The PEP 8 style guide recommends adding docstrings for classes as you have done for the functions.
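For illustration, a class docstring for GameState might look something like the sketch below. The wording of the docstring is my own guess at the intent, so adjust it to what the class really does; only the constructor is shown.

```python
class GameState:
    """Track the board of an N-puzzle and apply moves of the empty tile.

    The board is a square grid in which 0 marks the empty cell.
    (Docstring text is an assumption, written for illustration.)
    """

    def __init__(self, start, goal):
        self.start = start
        self.goal = goal


# The docstring is available to help() and to documentation tools.
print(GameState.__doc__)
```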
demo_1_agent.py
ruff
ruff identifies more unused imports and variables.
It also advises against using bare except.
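The bare except matters in practice here. In the fallback branches, globals()[move_right] uses the function object as the key rather than the string "move_right", so it raises a KeyError. The surrounding bare except swallows that error, and the move is silently skipped. One possible alternative, sketched below, is to look the move up in an explicit table and handle the "no match" case directly. The `MOVES` table, `MOVE_PATTERN`, `resolve_move`, and the stub move functions are names I introduced for illustration; they are not part of your code.

```python
import re


# Stand-in move functions so the sketch runs on its own;
# in the agent these would be the real move_up, move_down, etc.
def move_up():
    return "up"


def move_down():
    return "down"


def move_left():
    return "left"


def move_right():
    return "right"


# Explicit table of allowed moves, used instead of globals().
MOVES = {
    "up": move_up,
    "down": move_down,
    "left": move_left,
    "right": move_right,
}

# Matches "move_up", "Move Right", "left()" and similar, but not "output".
MOVE_PATTERN = re.compile(r"\b(?:move[_ ]?)?(up|down|left|right)\b")


def resolve_move(raw_name):
    """Return the move function named in raw_name, or None if there is none."""
    match = MOVE_PATTERN.search(raw_name.lower())
    if match is None:
        return None
    return MOVES[match.group(1)]


for text in ["move_up()", "**Move Right**", "output"]:
    func = resolve_move(text)
    print(text, "->", func() if func else "no move found")
```

With a helper like this, the loop over move_list needs no try/except at all: call the returned function if there is one, and skip or log the entry otherwise.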
Comments
Again, delete commented-out code to reduce clutter.
Simpler
In lines like this:
if (pos_row == 0):
Parentheses are often omitted:
if pos_row == 0:
The black program can be used to automatically remove the parens.