I made a GPL project to read Shogi Western Notations that I made public in GitHub. The project works perfectly on Ubuntu. Please check README.md to check requirements.
All this code works quite well. The basic idea was fragmenting each notated move with a long regexp, which I think is the best way to optimize the process. First, I read notated games in plain text and store every move as an element into a list:
shogi_reader.py
-> line 50
movs = game_data.movs.splitlines()
There are also two objects that control position of pieces to determine moves. They are defined in my project in the file managers.py
. The classes are named coords_manager()
and matrix_manager()
. coords_manager class controls a list of coords for every kind of piece and controls the revertion of coords if R key is pressed. matrix_manager controls a matrix of boolean values to determine if a square is owned by a piece or not. lamesa
is an instance of coords_manager(), which controls the arrays of coords of pieces and reverts the pieces' coords, which are a property of the class.
The program works in two steps:
1- It reads the notation and calculates how to store the data into a format revertible backwards for the user using matchobj.group(n)
. (shogi_reader.py
, line 69:)
reg = re.compile('^\s*(\d+)\s*-\s*(\+?[P|L|N|S|G|K|R|B](\d[a-i])?)([-|x|*])(\d[a-i][=|\+]?)$')
error_cnt = 0
for move in movs:
move_to_add = None
kind = ''
action = None
promoting = False
piece_respawn = ''
piece_x = 0
piece_y = 0
frag = reg.match(move)
destiny = [lamesa.coords_x[frag.group(5)[0]],lamesa.coords_y[frag.group(5)[1]]]
destiny_h = frag.group(5)
if len(frag.group(5)) > 2:
if frag.group(5)[2] == '+':
promoting = True
if (len(frag.group(2)) > 1 and frag.group(2)[0] != '+') or (len(frag.group(2)) > 2 and frag.group(2)[0] == '+'):
if frag.group(2)[0] == '+':
kind = frag.group(2)[0:2]
piece_x = lamesa.coords_x[frag.group(2)[2]]
piece_y = lamesa.coords_y[frag.group(2)[3]]
else:
kind = frag.group(2)[0]
piece_x = lamesa.coords_x[frag.group(2)[1]]
piece_y = lamesa.coords_y[frag.group(2)[2]]
else:
kind = frag.group(2)
if frag.group(4) == 'x':
action = 1
elif frag.group(4) == '*':
action = 2
elif frag.group(4) == '-':
action = 0
# [...]
2- The canvas is loaded, and the pieces are drawn into the shogi board. The notated game then can be played with arrows OR (A and D keys), and revert the board with R.
Once the second step begins, a list with the formatted moves is stored in history_pos
list and then replayed with following functions assigned to arrow keys. This is the part of the algorythm where the regexp abuse takes place(Lines 84, 94 and 141):
funciones.py
-> Line 65
def move_forward():
global pos, history
output = ''
if pos < len(history):
if history[pos][2] == 2:
#In this case,
# history[pos] structure -> [[kind_of_piece,counter_to_assign_id], destiny_coords, action, promoting, piece_respawn]
statem = 'lamesa.lista_'+history[pos][0][0]+'['+(str)(history[pos][0][1])+']=['+(str)(lamesa.coords_x[history[pos][1][0]])+','+(str)(lamesa.coords_y[history[pos][1][1]])+']'
exec statem
output = 'lamesa.r'+history[pos][0][0]+' -= 1'
else:
# Standard case,
# history[pos] structure -> [code_statement_to_select_piece, coords_sum, action, promoting, piece_respawn]
statem = 'lamesa.'+history[pos][0]+'[0]+='+(str)(history[pos][1][0]*lamesa.reverted)
exec statem
statem = 'lamesa.'+history[pos][0]+'[1]+='+(str)(history[pos][1][1]*lamesa.reverted)
exec statem
if history[pos][3] == True: #Handle piece_promotion==True
#TODO Simplify this task avoiding the regexp abuse
prueba = re.match('^.*\[(.*)\]$', history[pos][0])
statem = 'prueba2=lamesa.'+history[pos][0]
exec statem
statem = 'del lamesa.'+history[pos][0]
exec statem
prueba4 = re.match('^.*_(.*)\[.*\]$', history[pos][0])
statem = 'lamesa.lista_s'+prueba4.group(1)+'['+prueba.group(1)+']=['+(str)(prueba2[0])+','+(str)(prueba2[1])+']'
exec statem
if history[pos][4] != '': #Piece captured that has to be respawn backwards
# TODO Avoid regexp abuse
frag = re.match('^(lista_(.*)\[.*\])=\[.*,.*\]$', history[pos][4])
theobj = frag.group(1)
piece = frag.group(2)
if len(piece) == 3:
piece = piece[1:]
if piece[-1] == 'b':
piece = piece[:-1]+'n'
else:
piece = piece[:-1]+'b'
statem = 'del lamesa.'+theobj
exec statem
output = 'lamesa.r'+piece+' += 1'
pos += 1
previous_highlight(pos)
return output
funciones.py
-> Line: 129
def move_back():
global pos, history
output = ''
if pos > 0:
pos -= 1
if history[pos][2] == 2:
statem = 'del lamesa.lista_'+history[pos][0][0]+'['+(str)(history[pos][0][1])+']'
exec statem
output = 'lamesa.r'+history[pos][0][0]+' += 1'
else:
if history[pos][3] == True:
#TODO Avoid regexp abuse
prueba = re.match('^.*\[(.*)\]$', history[pos][0])
prueba4 = re.match('^.*_(.*)\[.*\]$', history[pos][0])
statem = 'prueba2=lamesa.lista_s'+prueba4.group(1)+'['+prueba.group(1)+']'
exec statem
statem = 'lamesa.'+history[pos][0]+'=['+(str)(prueba2[0])+','+(str)(prueba2[1])+']'
exec statem
statem = 'del lamesa.lista_s'+prueba4.group(1)+'['+prueba.group(1)+']'
exec statem
statem = 'lamesa.'+history[pos][0]+'[0]-='+(str)(history[pos][1][0]*lamesa.reverted)
exec statem
statem = 'lamesa.'+history[pos][0]+'[1]-='+(str)(history[pos][1][1]*lamesa.reverted)
exec statem
if history[pos][4] != '':
frag = re.match('^lista_(.*)\[.*\]=\[.*,.*\]$', history[pos][4])
piece = frag.group(1)
if len(piece) == 3:
piece = piece[1:]
if piece[-1] == 'b':
piece = piece[:-1]+'n'
else:
piece = piece[:-1]+'b'
exec 'lamesa.'+history[pos][4]
output = 'lamesa.r'+piece+' -= 1'
previous_highlight(pos)
return output
history
list's elements have two different formats depending on the kind of move:
Normal move:
[code_statement_to_select_piece, coords_sum, action, promoting, piece_respawn]
Drop:
[[kind_of_piece,counter_to_assign_id], destiny_coords, action, promoting, piece_respawn]
When action==2, the structure of history[pos] elements is different in order to send data that is needed in order to revert the drop move. This transformation is made to obtain a state that can be calculated forward and backwards for the position and state of pieces:
The way I structured the algorythm forced me to make some abuse of regexps in these fragments of code.
There are the move_forward()
and move_back()
functions, which performs a change on the properties of coords_manager() and matrix_manager() objects which objective is to keep the pieces' state. I had to make some abuse of regexps in these lines to make the operations I needed on the state.
I think I should change the structure of objects inside history_pos
list in order to prevent abuse of regexps, but I don't have good ideas for it.
Am I really abusing of regexps here? If so, which structure changes on objects inside history_pos
list would be needed to prevent the abuse of regexps?
2 Answers 2
In addition to the "regexp abuse" concern, you also have a problem with the use of exec
. Actually, I would consider them both to be symptoms of a deeper problem: your modelling is underdeveloped.
Specifically, the problem is that your data are "stringly typed".
A better design would be for the parser to return a list of Move
objects. Here's a rough idea:
class Move
def __init__(self, from_coord, to_coord, promoted_to=None):
self.from = from_coord
self.to = to_coord
self.piece = None
self.captured_piece = None
self.promoted_to = None
def do_move(self, board):
self.piece = board.get_piece(self.from_coord)
self.captured_piece = board.get_piece(self.to_coord)
if self.captured_piece is not None:
board.remove_piece(self.to_coord)
board.remove_piece(self.from_coord)
board.put_piece(self.to_coord, self.promoted_to or self.piece)
def undo_move(self, board):
board.remove_piece(self.to_coord)
board.put_piece(self.from_coord, self.piece)
if self.captured_piece is not None:
board.put_piece(self.to_coord, self.captured_piece)
The shogi_reader
would return a list of these Move
objects, and you can play and unplay the moves as necessary by calling do_move()
or undo_move()
.
-
\$\begingroup\$ Looks like a good way to prevent abuse of regexps and exec instruction. I'll check it out. \$\endgroup\$SebasSBM– SebasSBM2015年05月29日 00:04:12 +00:00Commented May 29, 2015 at 0:04
-
\$\begingroup\$ There are four managers in managers.py: coords_manager(), sprites_manager(), matrix_manager() and input_manager(). The 1st and 3rd control the logic on squares and piece's coords, so making a Move class that handles it in a cleaner way may be factible. Late or soon, I will try it out. \$\endgroup\$SebasSBM– SebasSBM2015年05月29日 00:38:53 +00:00Commented May 29, 2015 at 0:38
-
\$\begingroup\$ I encourage you to ask about
coords_manager
in a separate question. \$\endgroup\$200_success– 200_success2015年05月29日 00:47:37 +00:00Commented May 29, 2015 at 0:47 -
\$\begingroup\$ Just adding information about what I'm planning to do to perform a
Move
class. I'm not sure if you mean there's a problem withcoords_manager()
\$\endgroup\$SebasSBM– SebasSBM2015年05月29日 00:53:42 +00:00Commented May 29, 2015 at 0:53 -
1\$\begingroup\$ Follow-up question posted \$\endgroup\$200_success– 200_success2015年07月12日 22:25:03 +00:00Commented Jul 12, 2015 at 22:25
Considering just your regex...
reg = re.compile('^\s*(\d+)\s*-\s*(\+?[P|L|N|S|G|K|R|B](\d[a-i])?)([-|x|*])(\d[a-i][=|\+]?)$')
I see a few problems.
- It's good practice to use
r'raw strings'
for any non-trivial regex, to prevent backslashes from having additional special meaning. - The use of character classes is incorrect. Inside a character class, the "or" is implicit. That is,
[PLNSGKRB]
means the same as(?:P|L|N|S|G|K|R|B)
. - Inside a character class, some characters, such as
+
, are taken literally. You don't need to put a backslash in front.
reg = re.compile(r'^\s*(\d+)\s*-\s*(\+?[PLNSGKRB](\d[a-i])?)([-x*])(\d[a-i][=+]?)$')
For readability, I suggest taking advantage of re.VERBOSE
and (?P<name>...)
groups.
reg = re.compile(r"""^\s*(?P<move_num>\d+)
\s*-\s*(?P<piece>\+?[PLNSGKRB](?P<from>\d[a-i])?)
(?P<action>[-x*])
(?P<to>\d[a-i][=+]?)$""", re.VERBOSE)
-
\$\begingroup\$ Thank you for the analysis. When I have somemore time I'll use this information to make the code more readable and test your VERBOSE version to see if it keeps accuracy. \$\endgroup\$SebasSBM– SebasSBM2015年05月28日 22:49:14 +00:00Commented May 28, 2015 at 22:49
-
\$\begingroup\$ Anyway, the main issues with regexps abuse are in
move_forward()
andmove_back()
functions (funciones.py
file) \$\endgroup\$SebasSBM– SebasSBM2015年05月28日 22:50:40 +00:00Commented May 28, 2015 at 22:50
Explore related questions
See similar questions with these tags.