A PettingZoo environment for the multiplayer board game Azul, built for training AI agents.
    from azul_marl_env import azul_v1_2players, azul_v1_3players, azul_v1_4players

    env_2players = azul_v1_2players()
    env_3players = azul_v1_3players()
    env_4players = azul_v1_4players()
    env_2players_custom_max_moves = azul_v1_2players(max_moves=100)
    from azul_marl_env import AzulEnv

    env = AzulEnv(player_count=2)
    env = AzulEnv(player_count=3)
    env = AzulEnv(player_count=4)
    env = AzulEnv(player_count=2, max_moves=100)
    from azul_marl_env import azul_v1_2players
    import random

    # Create and reset the environment
    env = azul_v1_2players()
    observation, info = env.reset()

    # Iterate through agents
    for agent in env.agent_iter():
        # Get current agent's observation and info
        observation, reward, termination, truncation, info = env.last()

        if termination or truncation:
            break

        # Get valid moves for current agent
        valid_moves = info["valid_moves"]

        # Select a random valid move
        action = random.choice(valid_moves)

        # Execute the move
        env.step(action)

        # Render the environment (optional)
        env.render()

    # Close the environment
    env.close()
    from azul_marl_env import azul_v1_2players
    import random

    def play_random_game():
        env = azul_v1_2players()
        observation, info = env.reset()

        for agent in env.agent_iter():
            observation, reward, termination, truncation, info = env.last()

            if termination or truncation:
                print(f"Game finished! Final scores: {[player['score'] for player in observation['players']]}")
                break

            # Get valid moves and make a random move
            valid_moves = info["valid_moves"]
            if valid_moves:
                action = random.choice(valid_moves)
                env.step(action)

        env.close()

    play_random_game()
Factory count (num_factories):
- 2 player game -> 5
- 3 player game -> 7
- 4 player game -> 9
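The factory counts listed above follow the pattern 2 * player_count + 1. A minimal sketch of that mapping; `num_factories_for` is a hypothetical helper written for illustration and is not claimed to be part of `azul_marl_env`:

```python
def num_factories_for(player_count: int) -> int:
    """Return the factory count for a 2-, 3-, or 4-player game."""
    if player_count not in (2, 3, 4):
        # Only these player counts are documented for this environment.
        raise ValueError("player_count must be 2, 3, or 4")
    # 2 players -> 5, 3 players -> 7, 4 players -> 9
    return 2 * player_count + 1
```

The result also determines the size of the first action dimension (num_factories + 1) and the first axis of the factories observation.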
Action Space: MultiDiscrete([num_factories + 1, 5, 20, 5])
- First value: Tile source. 0 selects the center; values 1 to num_factories select a factory, so a value of k refers to the factory with 0-based index k - 1.
- Second value: Tile color (0-4, one value per color)
- Third value: Number of tiles to place on the floor (0-19)
- Fourth value: Pattern line index (0-4)
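To make the four action values concrete, here is a small sketch that unpacks an action into named fields and undoes the +1 offset on the factory index. `DecodedAction` and `decode_action` are hypothetical names used only for illustration; the environment itself is only documented to accept the raw four-value action.

```python
from typing import NamedTuple, Optional, Sequence

CENTER = 0  # a first action value of 0 selects the center


class DecodedAction(NamedTuple):
    factory: Optional[int]  # 0-based factory index, or None for the center
    color: int              # tile color, 0-4
    floor_tiles: int        # number of tiles placed on the floor, 0-19
    pattern_line: int       # pattern line index, 0-4


def decode_action(action: Sequence[int], num_factories: int) -> DecodedAction:
    """Split a 4-value action into named fields, checking the documented ranges."""
    source, color, floor_tiles, pattern_line = (int(v) for v in action)
    if not 0 <= source <= num_factories:
        raise ValueError("tile source out of range")
    if not (0 <= color <= 4 and 0 <= floor_tiles <= 19 and 0 <= pattern_line <= 4):
        raise ValueError("color, floor count, or pattern line out of range")
    # Factories are shifted by one because index 0 is reserved for the center.
    factory = None if source == CENTER else source - 1
    return DecodedAction(factory, color, floor_tiles, pattern_line)
```

For example, `decode_action([3, 1, 0, 2], num_factories=5)` describes taking color 1 from the factory with 0-based index 2 and placing it on pattern line 2 with nothing sent to the floor.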
Observation Space: Dictionary containing:
- factories: Box(0, 4, (num_factories, 5), int32) - Tile counts in each factory
- center: Box(0, 3 * num_factories, (5,), int32) - Tile counts in center
- players: Tuple of player states, each containing:
  - pattern_lines: Box(0, 5, (5, 5), int32) - Current pattern lines
  - wall: Box(0, 5, (5, 5), int32) - Wall state
  - floor: Box(0, 5, (7,), int32) - Floor tiles
  - is_starting: Discrete(2) - First player marker
  - score: Discrete(241) - Player's score
- bag: Box(0, 100, (5,), int32) - Remaining tiles in bag
- lid: Box(0, 100, (5,), int32) - Discarded tiles
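As an illustration of how the observation dictionary can be read, the sketch below sums the per-color tile counts across all factories and the center, and collects each player's score. Both helpers are hypothetical, and the example observation is hand-built and trimmed to two factories and the keys the helpers use; a real observation has num_factories rows and every field of the observation space.

```python
def count_tiles_on_offer(observation):
    """Per-color tile totals across all factories plus the center."""
    totals = [int(count) for count in observation["center"]]
    for factory in observation["factories"]:
        for color, count in enumerate(factory):
            totals[color] += int(count)
    return totals


def player_scores(observation):
    """Scores in player order, as plain ints."""
    return [int(player["score"]) for player in observation["players"]]


# Hand-built, trimmed example (not produced by the environment).
example_observation = {
    "factories": [[2, 1, 0, 1, 0], [0, 0, 4, 0, 0]],
    "center": [1, 0, 0, 0, 2],
    "players": [{"score": 3}, {"score": 0}],
}
print(count_tiles_on_offer(example_observation))  # [3, 1, 4, 1, 2]
print(player_scores(example_observation))         # [3, 0]
```

The helpers only iterate and index, so they work the same way on nested lists and on array-valued observations.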
Reward:
- -1 for each step until game end
- -2 for invalid moves
- Final Azul score is added to cumulative reward at game end
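The reward scheme can be sketched as a per-step penalty plus a terminal bonus. This sketch assumes an invalid move earns -2 in place of the usual -1 rather than in addition to it; the description above does not settle that point, so treat it as an assumption. The function names are hypothetical.

```python
STEP_PENALTY = -1          # every step until the game ends
INVALID_MOVE_PENALTY = -2  # assumption: replaces the step penalty for invalid moves


def step_reward(move_was_valid: bool) -> int:
    """Reward for a single step under the assumption stated above."""
    return STEP_PENALTY if move_was_valid else INVALID_MOVE_PENALTY


def episode_return(move_validity, final_score: int) -> int:
    """Cumulative reward: per-step penalties plus the final Azul score."""
    return sum(step_reward(valid) for valid in move_validity) + final_score
```

Under this reading, an agent that makes two valid moves and one invalid move and finishes with a score of 30 accumulates -1 - 1 - 2 + 30 = 26.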
Done:
- True when the game is completed (at least one player has filled at least one horizontal wall row)
- False otherwise
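The completion condition can be sketched as a check over each player's 5x5 wall. This assumes that a wall cell value of 0 means "empty" and any non-zero value means "tile placed"; the wall encoding is not spelled out here, so that is an assumption, and both function names are hypothetical.

```python
def has_completed_row(wall) -> bool:
    """True if any horizontal row of the wall has no empty (0) cells."""
    return any(all(int(cell) != 0 for cell in row) for row in wall)


def game_is_complete(observation) -> bool:
    """True if at least one player has filled at least one wall row."""
    return any(has_completed_row(player["wall"]) for player in observation["players"])
```

In practice the environment reports this through the termination flag returned by `env.last()`, so a check like this is mainly useful for analysis or debugging.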
Truncated:
- True when the maximum number of moves is reached (player_count * 150 by default)
- False otherwise
Info: Contains the valid_moves list for the current player.
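Beyond picking uniformly at random, the valid_moves list can drive a simple heuristic policy. The sketch below prefers moves that send the fewest tiles to the floor and breaks ties at random. It assumes each entry of valid_moves is a four-value sequence laid out like the action space (source, color, floor count, pattern line); `pick_low_floor_move` is a hypothetical helper.

```python
import random


def pick_low_floor_move(valid_moves, rng=random):
    """Choose a valid move with the smallest floor-tile count (third value)."""
    if not valid_moves:
        raise ValueError("no valid moves available")
    fewest = min(int(move[2]) for move in valid_moves)
    candidates = [move for move in valid_moves if int(move[2]) == fewest]
    # Ties are broken at random so the policy does not always favor one source.
    return rng.choice(candidates)
```

Inside the agent loop this would replace `random.choice(valid_moves)`, for example `env.step(pick_low_floor_move(info["valid_moves"]))`.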