\$\begingroup\$

I have just written the code below, which creates a local-LLM-based AI agent to solve the N puzzle. I should stress that I wrote the code myself and did not use AI to write it.

The LLM is given a prompt telling it that it can call the following functions: move_up, move_down, move_left, move_right. At each turn, the LLM chooses from those functions, and the chosen moves are then made.

The qwen3:latest model from the Ollama library was used as the agent, and I chose a simple N puzzle as the problem for it to solve. Experiments were done on an ASUS Vivobook Pro 15 laptop with an NVIDIA GeForce RTX 4060 with 8 GB of VRAM.

Here is the code for the demo_1_agent.py file:

from pdb import set_trace
import os
import json
from copy import deepcopy
import requests
import math
from inspect import signature
import numpy as np
from pprint import pprint
from typing import Annotated, Sequence, TypedDict
from pydantic import BaseModel, Field
from ollama import chat
from ollama import ChatResponse
import pyautogui
pyautogui.PAUSE = 1.0
MOVE_UP_BUTTON_POS = (285, 559)
MOVE_DOWN_BUTTON_POS = (279, 718)
MOVE_LEFT_BUTTON_POS = (195, 646)
MOVE_RIGHT_BUTTON_POS = (367, 647)
class MoveList(BaseModel):
    moves: list[str]
def get_n_digit(num):
    if num > 0:
        digits = int(math.log10(num))+1
    elif num == 0:
        digits = 1
    else:
        digits = int(math.log10(-num))+2 # +1 if you don't count the '-'
    return digits
def parse_json_garbage(s):
    s = s[next(idx for idx, c in enumerate(s) if c in "{["):]
    try:
        return json.loads(s)
    except json.JSONDecodeError as e:
        return json.loads(s[:e.pos])
 
class GameState:
    def __init__(self, start, goal):
        self.start = start
        self.goal = goal

        self.size = start.shape[0]
        self.state = deepcopy(start)


    def get_state(self):
        return self.state
    def finished(self):
        return (self.state==self.goal).all()
    def print_state(self, no_print=False):
        max_elem = np.max(self.state)
        n_digit = get_n_digit(max_elem)

        state_text = ""

        for row_idx in range(self.size):
            for col_idx in range(self.size):
                if int(self.state[row_idx, col_idx]) != 0:
                    text = '{num:0{width}} '.format(num=self.state[row_idx, col_idx], width=n_digit)
                else:
                    text = "_" * (n_digit) + " "
                state_text += text
            state_text += "\n"
        if no_print is False:
            print(state_text)

        return state_text
    def create_diff_view(self):
        """Show which tiles are out of place"""
        diff_state = ""
        for i in range(self.size):
            for j in range(self.size):
                current = self.state[i, j]
                target = self.goal[i, j]
                if current == target:
                    diff_state += f"✓{current} "
                else:
                    diff_state += f"✗{current} "
            diff_state += "\n"
        return diff_state


    def move_up(self):
        itemindex = np.where(self.state == 0)
        pos_row = int(itemindex[0][0])
        pos_col = int(itemindex[1][0])

        if (pos_row == 0):
            return

        temp = self.state[pos_row, pos_col]
        self.state[pos_row, pos_col] = self.state[pos_row-1, pos_col]
        self.state[pos_row-1, pos_col] = temp

    def move_down(self):
        itemindex = np.where(self.state == 0)
        pos_row = int(itemindex[0][0])
        pos_col = int(itemindex[1][0])

        if (pos_row == (self.size-1)):
            return

        temp = self.state[pos_row, pos_col]
        self.state[pos_row, pos_col] = self.state[pos_row+1, pos_col]
        self.state[pos_row+1, pos_col] = temp


    def move_left(self):
        itemindex = np.where(self.state == 0)
        pos_row = int(itemindex[0][0])
        pos_col = int(itemindex[1][0])

        if (pos_col == 0):
            return

        temp = self.state[pos_row, pos_col]
        self.state[pos_row, pos_col] = self.state[pos_row, pos_col-1]
        self.state[pos_row, pos_col-1] = temp

    def move_right(self):
        itemindex = np.where(self.state == 0)
        pos_row = int(itemindex[0][0])
        pos_col = int(itemindex[1][0])

        if (pos_col == (self.size-1)):
            return

        temp = self.state[pos_row, pos_col]
        self.state[pos_row, pos_col] = self.state[pos_row, pos_col+1]
        self.state[pos_row, pos_col+1] = temp
# 8-puzzle
start = np.array([
    [0, 1, 3],
    [4, 2, 5],
    [7, 8, 6],
])
goal = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 0],
])
# 15-puzzle
# start = np.array([
    # [ 6, 13, 7, 10],
    # [ 8, 9, 11, 0],
    # [15, 2, 12, 5],
    # [14, 3, 1, 4],
# ])
# goal = np.array([
    # [ 1, 2, 3, 4],
    # [ 5, 6, 7, 8],
    # [ 9, 10, 11, 12],
    # [13, 14, 15, 0],
# ])
game_state = GameState(start, goal)
def move_up():
    """Move the '_' tile up by one block, swapping the tile with the number above. Returns the text describing the new game state after moving up."""
    game_state.move_up()
    pyautogui.moveTo(MOVE_UP_BUTTON_POS[0], MOVE_UP_BUTTON_POS[1])
    pyautogui.click()
    return game_state.print_state(no_print=True)


def move_down():
    """Move the '_' tile down by one block, swapping the tile with the number below. Returns the text describing the new game state after moving down."""
    game_state.move_down()
    pyautogui.moveTo(MOVE_DOWN_BUTTON_POS[0], MOVE_DOWN_BUTTON_POS[1])
    pyautogui.click()
    return game_state.print_state(no_print=True)


def move_left():
    """Move the '_' tile left by one block, swapping the tile with the number to the left. Returns the text describing the new game state after moving left."""
    game_state.move_left()
    pyautogui.moveTo(MOVE_LEFT_BUTTON_POS[0], MOVE_LEFT_BUTTON_POS[1])
    pyautogui.click()
    return game_state.print_state(no_print=True)


def move_right():
    """Move the '_' tile right by one block, swapping the tile with the number to the right. Returns the text describing the new game state after moving right."""
    game_state.move_right()
    pyautogui.moveTo(MOVE_RIGHT_BUTTON_POS[0], MOVE_RIGHT_BUTTON_POS[1])
    pyautogui.click()
    return game_state.print_state(no_print=True)
 
def main():
    # game_state.print_state()

    max_elem = np.max(goal)
    n_digit = get_n_digit(max_elem)
    size = goal.shape[0]
    goal_text = ""

    tool_list = [move_up, move_down, move_left, move_right]

    for row_idx in range(size):
        for col_idx in range(size):
            if int(goal[row_idx, col_idx]) != 0:
                text = '{num:0{width}} '.format(num=goal[row_idx, col_idx], width=n_digit)
            else:
                text = "_" * (n_digit) + " "
            goal_text += text
        goal_text += "\n"

    # state_text = game_state.print_state()
    # SYSTEM_PROMPT = f"""You are playing the N-puzzle game.
# The goal is as follows:
# {goal_text}
# This is the current state:
# {state_text}
# The difference between the goal and the current state is as follows (✓ is correct, ✗ is incorrect):
# {game_state.create_diff_view()}
# You need to find moves to go from the current state to the goal, such that all positions in current state are the same as the goal. At each turn, you can either move up, move down, move left, or move right.
# When you move the tile, the position of the tile will be swapped with the number at the place where you move to.
# You have access to the following tools:
# """

    # for tool in tool_list:
        # description = f"def {tool.__name__}({str(signature(tool))}).\n{tool.__doc__}\n"
        # SYSTEM_PROMPT += description

    # SYSTEM_PROMPT += """The way you use the tools is by specifying a json blob.
# Specifically, this json should have an `action` key (with the name of the tool to use) and an `action_input` key (with the input to the tool going here).
# example use :
# {{
    # "action": "get_weather",
    # "action_input": {"location": "New York"}
# }}
# ALWAYS use the following format:
# Question: the input question you must answer (list the goal first, then the current state)
# Thought: you should always think about one action to take. Only ONE action at a time in this format:
# Action:
# $JSON_BLOB (inside markdown cell)
# Observation: the result of the action. This Observation is unique, complete, and the source of truth.
# (this Thought/Action/Observation can repeat at most 3 times, you should take several steps when needed. The $JSON_BLOB must be formatted as markdown and only use a SINGLE action at a time.)
# You must always end your output with the following format:
# Thought: I now know the final answer
# Final Answer: The json blob detailing the FIRST move of the chosen list of moves based on your observations.
# Now begin! Reminder to ALWAYS use the exact characters `Final Answer:` when you provide a definitive answer.
# """
    # set_trace()

    messages = []


    # messages.append({
        # 'role': 'system',
        # 'content': SYSTEM_PROMPT,
    # },)

    move_idx = 1

    n_json_begin = 0
    n_json_end = 0
    new_json_begin = False

    print(f"{move_idx}) CURRENT STATE:")
    state_text = game_state.print_state()

    is_finished = False

    while True:
        # move_idx += 1

        the_message = f"""You are an N-puzzle solver. Your job is to output ONLY move commands.
You need to find moves to go from the current state to the goal, such that all positions in current state are the same as the goal. At each turn, you can either move up, move down, move left, or move right.
When you move the tile, the position of the tile will be swapped with the number at the place where you move to.
CURRENT STATE:
{state_text}
GOAL STATE:
{goal_text}
The difference between the goal and the current state is as follows (✓ is correct, ✗ is incorrect):
{game_state.create_diff_view()}
RULES:
- You can only use: move_up, move_down, move_left, move_right
- Provide reasoning for your answer. BUT at the end of your answer, write "FINAL ANSWER", then output ONLY the move commands, separated by commas
- Maximum 5 moves per response
EXAMPLE_OUTPUT (the "FINAL ANSWER" section):
move_left, move_right, move_up, move_down
YOUR RESPONSE (moves only):
"""
        # print(the_message)
        # messages = []
        messages.append({
            'role': 'user',
            'content': the_message,
        },)


        response: ChatResponse = chat(model='qwen3:latest', messages=messages,
            stream=True,
            # format=MoveList.model_json_schema(),
        )

        response_text_full = ""
        for chunk in response:
            # Print model content
            print(chunk.message.content, end='', flush=True)
            response_text_full += chunk.message.content

        # response_text_full = response['message']['content']
        # print(response['message']['content'])
        # response_text_full = response_text_full[response_text_full.find("Final Answer"):]

        if "FINAL ANSWER" in response_text_full:
            response_text_full = response_text_full[response_text_full.find("FINAL ANSWER"):]
            response_text_full = response_text_full.replace("FINAL ANSWER", "")
            response_text_full = response_text_full.replace("\n", "")
            response_text_full = response_text_full.replace(":", "")
        elif "Final Answer" in response_text_full:
            response_text_full = response_text_full[response_text_full.find("Final Answer"):]
            response_text_full = response_text_full.replace("Final Answer", "")
            response_text_full = response_text_full.replace("\n", "")
            response_text_full = response_text_full.replace(":", "")
        else:
            response_text_full = response_text_full[response_text_full.find("</think>"):]
            response_text_full = response_text_full.replace("</think>", "")
            response_text_full = response_text_full.replace("\n", "")
            response_text_full = response_text_full.replace("```", "")

        move_list = response_text_full.split(",")
        move_list = [elem.rstrip().lstrip() for elem in move_list]
        pprint(response_text_full)

        for move_name_idx, move_name in enumerate(move_list):
            move_name = move_name.replace("****", "")
            print(f"Chosen move: {move_name}")

            move_name = move_name.replace("(", "")
            move_name = move_name.replace(")", "")
            try:
                func_result = globals()[move_name]()
            except:
                try:
                    if "move_right" in move_name:
                        func_result = globals()[move_right]()
                    elif "move_left" in move_name:
                        func_result = globals()[move_left]()
                    elif "move_down" in move_name:
                        func_result = globals()[move_down]()
                    elif "move_up" in move_name:
                        func_result = globals()[move_up]()
                except:
                    try:
                        if ("Right" in move_name) or ("right" in move_name):
                            func_result = globals()[move_right]()
                        elif ("Left" in move_name) or ("left" in move_name):
                            func_result = globals()[move_left]()
                        elif ("Down" in move_name) or ("down" in move_name):
                            func_result = globals()[move_down]()
                        elif ("Up" in move_name) or ("up" in move_name):
                            func_result = globals()[move_up]()
                    except:
                        continue

            move_idx += 1
            print(f"{move_idx}) CURRENT STATE:")
            state_text = game_state.print_state()

            if game_state.finished():
                is_finished = True
                break

        if game_state.finished():
            print("FINISHED!")
            is_finished = True
            break


if __name__ == "__main__":
    main()

And here is the code for the demo_1_n_puzzle_gui.py file:

import tkinter as tk
from tkinter import messagebox
import numpy as np
import random
import sys
from pdb import set_trace
class NPuzzle:
    def __init__(self, size=3):
        self.size = size
        self.root = tk.Tk()
        self.root.title("N-Puzzle Game")
        self.root.geometry("450x600")

        # Initialize the puzzle state as 2D numpy array
        puzzle_1d = list(range(1, size * size)) + [0] # 0 represents empty space
        # self.puzzle = np.array(puzzle_1d).reshape(size, size)
        self.puzzle = np.array([
            [0, 1, 3],
            [4, 2, 5],
            [7, 8, 6],
        ])
        # set_trace()

        # Create GUI elements
        self.create_widgets()
        # self.shuffle_puzzle()
        self.update_display()

    def create_widgets(self):
        # Title
        title_label = tk.Label(self.root, text="N-Puzzle Game", font=("Arial", 20, "bold"))
        title_label.pack(pady=10)

        # Game board frame
        self.board_frame = tk.Frame(self.root, bg="lightgray", padx=5, pady=5)
        self.board_frame.pack(pady=10)

        # Create buttons for puzzle pieces
        self.buttons = []
        for i in range(self.size):
            row = []
            for j in range(self.size):
                btn = tk.Button(
                    self.board_frame,
                    text="",
                    width=6,
                    height=3,
                    font=("Arial", 16, "bold"),
                    command=lambda r=i, c=j: self.move_piece(r, c)
                )
                btn.grid(row=i, column=j, padx=2, pady=2)
                row.append(btn)
            self.buttons.append(row)

        # Control buttons frame
        control_frame = tk.Frame(self.root)
        control_frame.pack(pady=20)

        # Movement buttons
        tk.Button(control_frame, text="↑", width=5, height=2, font=("Arial", 12, "bold"),
                  command=self.move_down).grid(row=0, column=1, padx=5, pady=5)

        tk.Button(control_frame, text="←", width=5, height=2, font=("Arial", 12, "bold"),
                  command=self.move_right).grid(row=1, column=0, padx=5, pady=5)

        tk.Button(control_frame, text="→", width=5, height=2, font=("Arial", 12, "bold"),
                  command=self.move_left).grid(row=1, column=2, padx=5, pady=5)

        tk.Button(control_frame, text="↓", width=5, height=2, font=("Arial", 12, "bold"),
                  command=self.move_up).grid(row=2, column=1, padx=5, pady=5)

        # Shuffle button
        shuffle_btn = tk.Button(self.root, text="Shuffle", width=10, height=2,
                                font=("Arial", 12, "bold"), bg="lightblue",
                                command=self.shuffle_puzzle)
        shuffle_btn.pack(pady=10)

        # Bind keyboard events
        self.root.bind('<Key>', self.on_key_press)
        self.root.focus_set()

    def get_empty_position(self):
        """Find the position of the empty space (0)"""
        empty_pos = np.where(self.puzzle == 0)
        return empty_pos[0][0], empty_pos[1][0]

    def move_piece(self, row, col):
        """Move a piece if it's adjacent to the empty space"""
        empty_row, empty_col = self.get_empty_position()

        # Check if the clicked piece is adjacent to empty space
        if (abs(row - empty_row) == 1 and col == empty_col) or \
           (abs(col - empty_col) == 1 and row == empty_row):
            # Swap the piece with empty space
            self.puzzle[row, col], self.puzzle[empty_row, empty_col] = \
                self.puzzle[empty_row, empty_col], self.puzzle[row, col]

            self.update_display()
            self.check_win()

    def move_up(self):
        """Move empty space up (piece below moves up)"""
        empty_row, empty_col = self.get_empty_position()
        if empty_row < self.size - 1:
            self.move_piece(empty_row + 1, empty_col)

    def move_down(self):
        """Move empty space down (piece above moves down)"""
        empty_row, empty_col = self.get_empty_position()
        if empty_row > 0:
            self.move_piece(empty_row - 1, empty_col)

    def move_left(self):
        """Move empty space left (piece to the right moves left)"""
        empty_row, empty_col = self.get_empty_position()
        if empty_col < self.size - 1:
            self.move_piece(empty_row, empty_col + 1)

    def move_right(self):
        """Move empty space right (piece to the left moves right)"""
        empty_row, empty_col = self.get_empty_position()
        if empty_col > 0:
            self.move_piece(empty_row, empty_col - 1)

    def on_key_press(self, event):
        """Handle keyboard input"""
        key = event.keysym.lower()
        if key == 'up':
            self.move_up()
        elif key == 'down':
            self.move_down()
        elif key == 'left':
            self.move_left()
        elif key == 'right':
            self.move_right()

    def shuffle_puzzle(self):
        """Shuffle the puzzle by making random valid moves"""
        for _ in range(1000): # Make 1000 random moves
            moves = []
            empty_row, empty_col = self.get_empty_position()

            # Add possible moves
            if empty_row > 0:
                moves.append(self.move_down)
            if empty_row < self.size - 1:
                moves.append(self.move_up)
            if empty_col > 0:
                moves.append(self.move_right)
            if empty_col < self.size - 1:
                moves.append(self.move_left)

            # Make a random move
            if moves:
                random.choice(moves)()

        self.update_display()

    def update_display(self):
        """Update the button display"""
        for i in range(self.size):
            for j in range(self.size):
                value = self.puzzle[i, j]
                if value == 0:
                    self.buttons[i][j].config(text="", bg="lightgray", state="disabled")
                else:
                    self.buttons[i][j].config(text=str(value), bg="white", state="normal")

    def check_win(self):
        """Check if the puzzle is solved"""
        target_1d = list(range(1, self.size * self.size)) + [0]
        target = np.array(target_1d).reshape(self.size, self.size)
        # print(target)
        if np.array_equal(self.puzzle, target):
            messagebox.showinfo("Congratulations!", "You solved the puzzle!")
            sys.exit()

    def run(self):
        """Start the game"""
        self.root.mainloop()
# Create and run the game
if __name__ == "__main__":
    game = NPuzzle()
    game.run()

Please provide some comments on my code. Any feedback and suggestions for improvement are welcome.

GitHub link: https://github.com/dangmanhtruong1995/N-puzzle-Agent/tree/main

toolic
asked Aug 18 at 15:49
\$\endgroup\$

1 Answer

\$\begingroup\$

demo_1_n_puzzle_gui.py

UX

When I run the code, I have to resize the GUI window to see the "Shuffle" button. Perhaps I am using a different version of Python, libraries, OS, etc.

The fancy Unicode arrows in the create_widgets function do not show up in my source code editor or in the GUI when I run the code. This is a portability issue.
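
One possible way to reduce this, sketched under the assumption that the problem comes from the editor or font rather than from Tk itself, is to write the arrows as named Unicode escapes so the source file stays pure ASCII, and to keep plain-text captions as a fallback. The names `ARROW_LABELS`, `ASCII_LABELS`, and `label_for` are illustrative and not part of the original code.

```python
# Hypothetical helper: arrow captions written as escapes so the source stays ASCII.
ARROW_LABELS = {
    "up": "\N{UPWARDS ARROW}",
    "down": "\N{DOWNWARDS ARROW}",
    "left": "\N{LEFTWARDS ARROW}",
    "right": "\N{RIGHTWARDS ARROW}",
}
# Plain-text captions for systems whose fonts cannot render the arrows.
ASCII_LABELS = {"up": "Up", "down": "Down", "left": "Left", "right": "Right"}


def label_for(direction, use_unicode=True):
    """Return the button caption for a direction, with a plain-text fallback."""
    table = ARROW_LABELS if use_unicode else ASCII_LABELS
    return table[direction]
```

A button could then be created with `text=label_for("up")`, or with `text=label_for("up", use_unicode=False)` where the arrows do not display.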

Imports

You could run code development tools to automatically find some style issues with your code.

ruff identifies the following unused import:

from pdb import set_trace

It can be deleted. You can use either ruff or isort to sort the other import lines to follow convention:

import random
import sys
import tkinter as tk
from tkinter import messagebox

import numpy as np

Comments

Delete commented-out code to reduce clutter. For example, these lines can be deleted:

 # self.puzzle = np.array(puzzle_1d).reshape(size, size)
 # set_trace()
 # self.shuffle_puzzle()
 # print(target)

Unused code

This line is unused and can be deleted:

puzzle_1d = list(range(1, size * size)) + [0] # 0 represents empty space

Size

size is shown as a variable input to NPuzzle:

def __init__(self, size=3):
    self.size = size

But it only works for 3 because the puzzle array is hard-coded as 3x3. I suggest removing it as an input until you decide to support other sizes:

def __init__(self):
    self.size = 3
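
If other sizes are supported later, the board could be generated from the size instead of being hard-coded. A minimal sketch, assuming the solved layout that check_win already uses (tiles in row-major order with 0 last); `solved_rows` is a hypothetical helper name, and it returns plain lists to stay dependency-free:

```python
def solved_rows(size):
    """Build the solved board as a list of rows: 1..size*size-1 in order, 0 last."""
    tiles = list(range(1, size * size)) + [0]
    # Cut the flat tile list into consecutive rows of length `size`.
    return [tiles[start:start + size] for start in range(0, size * size, size)]
```

Wrapping the result in `np.array(solved_rows(size))` would give an array of the same shape the GUI uses for self.puzzle.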

Documentation

The PEP 8 style guide recommends adding docstrings for classes as you have done for the functions.


demo_1_agent.py

ruff

ruff identifies more unused imports and variables.

It also advises against using bare except.
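
As an illustration of how the nested bare except blocks and the globals() lookup might be avoided, the move name could be normalized and then looked up in an explicit dictionary. This is only a sketch: `normalize_move`, `dispatch_move`, `MOVES`, and the stand-in move functions below are hypothetical, and the real tool functions would take their place.

```python
import re


# Stand-ins for the real tool functions (assumption: each takes no arguments).
def move_up():
    return "up"


def move_down():
    return "down"


def move_left():
    return "left"


def move_right():
    return "right"


MOVES = {
    "move_up": move_up,
    "move_down": move_down,
    "move_left": move_left,
    "move_right": move_right,
}


def normalize_move(raw):
    """Map text such as '**Move_Right()**' to 'move_right', or None if unrecognized."""
    text = raw.lower()
    for direction in ("up", "down", "left", "right"):
        # Accept 'move_right', 'move right', or a bare 'right' as a whole word.
        if re.search(rf"\b(move[_ ])?{direction}\b", text):
            return f"move_{direction}"
    return None


def dispatch_move(raw):
    """Call the matching move function; return None when the name is not recognized."""
    name = normalize_move(raw)
    if name is None:
        return None
    return MOVES[name]()
```

With this shape, an unrecognized move is an ordinary `None` result that the caller can skip, so no exception handling is needed for the lookup.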

Comments

Again, delete commented-out code to reduce clutter.

Simpler

In lines like this:

if (pos_row == 0):

Parentheses are often omitted:

if pos_row == 0:

The black program can be used to automatically remove the parens.
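
A related simplification, sketched here as an assumption about how the code could be restructured: the four move_* methods of GameState differ only in the row and column offset, so one helper could cover all of them. `OFFSETS` and `slide_blank` are hypothetical names, and a plain list of lists stands in for the NumPy array.

```python
# Row/column offset of the blank for each direction.
OFFSETS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def slide_blank(board, direction):
    """Swap the blank (0) with its neighbor in the given direction, in place.

    Returns True if the move was made, False if the blank is already at that edge.
    """
    size = len(board)
    # Locate the blank tile.
    row, col = next(
        (r, c) for r in range(size) for c in range(size) if board[r][c] == 0
    )
    d_row, d_col = OFFSETS[direction]
    new_row, new_col = row + d_row, col + d_col
    if not (0 <= new_row < size and 0 <= new_col < size):
        return False
    board[row][col], board[new_row][new_col] = board[new_row][new_col], board[row][col]
    return True
```

Each existing method would then reduce to a single call such as `slide_blank(self.state, "up")`, adapted to NumPy indexing.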

answered Aug 18 at 17:43
\$\endgroup\$
