As a part of a console utilities module, I created a function that takes a table in the form of an array of arrays, and generates an ASCII table with the given contents.
I've also added options for a header, for the in-cell alignment, and for a border around the table.
Each column's width is computed automatically from its longest cell, and rows with fewer elements are padded with blank cells to match.
I'm quite new to Python, so I'd like to know whether my writing style is OK. Can I improve or shorten some parts of the code? Any other advice?
def ascii_table (table, **k):
    header = k.get('header', [])
    align = k.get('align', 'left')
    border = k.get('border', False)
    widths = []
    for i in range(max(map(len, table))): widths.append(max(max(map(len, [row[i] for row in table if len(row) > i])), len(header[i]) if len(header) > i else 0))
    printable = []
    if border:
        printrow = []
        for i in range(max(map(len, table))):
            if i > 0 and i < max(map(len, table)) - 1: printrow.append('─' * (widths[i] + 2))
            else: printrow.append('─' * (widths[i] + 1))
        printable.append('┌─' + '┬'.join(printrow) + '─┐')
    # header formatting
    if len(header) > 0:
        printrow = []
        for i in range(len(header)):
            assert header[i]
            if align == 'center': printrow.append(header[i].center(widths[i]))
            elif align == 'left': printrow.append(header[i].ljust(widths[i]))
            elif align == 'right': printrow.append(header[i].rjust(widths[i]))
        if border: printable.append('│ ' + ' │ '.join(printrow) + ' │')
        else: printable.append(' │ '.join(printrow))
        printrow = []
        for i in range(len(header)):
            if i > 0 and i < len(header) - 1: printrow.append('─' * (widths[i] + 2))
            else: printrow.append('─' * (widths[i] + 1))
        if border: printable.append('├─' + '┼'.join(printrow) + '─┤')
        else: printable.append('┼'.join(printrow))
    # table formatting
    for row in table:
        printrow = []
        for i in range(len(widths) - len(row)):
            row.append('')
        for i in range(len(row)):
            if align == 'center': printrow.append(row[i].center(widths[i]))
            elif align == 'left': printrow.append(row[i].ljust(widths[i]))
            elif align == 'right': printrow.append(row[i].rjust(widths[i]))
        if border: printable.append('│ ' + ' │ '.join(printrow) + ' │')
        else: printable.append(' │ '.join(printrow))
    if border:
        printrow = []
        for i in range(max(map(len, table))):
            if i > 0 and i < max(map(len, table)) - 1: printrow.append('─' * (widths[i] + 2))
            else: printrow.append('─' * (widths[i] + 1))
        printable.append('└─' + '┴'.join(printrow) + '─┘')
    return '\n'.join(printable)
Demo:
>>> from asciiart import ascii_table
>>> table = list(map(lambda x: x.split(), ['Jon Doe 20', 'Mark Waine 35', 'Donald Rory 43']))
>>> header = ['First Name', 'Last Name', 'Age']
>>> ascii = ascii_table(table, header=header, align='center', border=True)
>>> print(ascii)
┌────────────┬───────────┬─────┐
│ First Name │ Last Name │ Age │
├────────────┼───────────┼─────┤
│    Jon     │    Doe    │  20 │
│    Mark    │   Waine   │  35 │
│   Donald   │    Rory   │  43 │
└────────────┴───────────┴─────┘
2 Answers
There is a lot of duplicated code, along with inefficiencies and redundancies (wait...), but there is one thing that you are doing right: you use join instead of trying to manually insert separators between each piece of text.
Lists, iterables and generators
One of the most obvious inefficiencies in your code is that you build lists only to feed them to join. The only list that you actually keep all along is widths.
Since you build these lists using append in a for loop, you could use a list-comprehension instead. But for it to be readable, we will need to work on how to write it. Then, once you have a nice list-comprehension, you can feed it to join instead of using an intermediate variable. And since join accepts any iterable (meaning: anything a for loop can work with) and not only lists, you can just remove the brackets to turn the list-comprehension into a generator expression that will happily be consumed by join but not take nearly as much memory as the list.
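The progression described above can be sketched as follows. The sample row and widths are hypothetical and not taken from the question; the three variants are meant to produce the same string.

```python
# Hypothetical sample data, not taken from the question.
row = ['Jon', 'Doe', '20']
widths = [10, 9, 3]

# 1. Explicit loop with append, as in the original code.
cells = []
for word, width in zip(row, widths):
    cells.append(word.ljust(width))
with_loop = ' │ '.join(cells)

# 2. List comprehension: same result, but a throwaway list is still written out.
with_list = ' │ '.join([word.ljust(width) for word, width in zip(row, widths)])

# 3. Generator expression: drop the brackets and hand the items straight to join.
with_generator = ' │ '.join(word.ljust(width) for word, width in zip(row, widths))
```

As far as I know, CPython's join still collects the items internally before concatenating, so in this particular spot the gain is mostly less clutter; the memory argument carries more weight once the whole function becomes a generator.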
The same kind of optimization can be applied to your function as a whole. Instead of building a list of each line to print and then joining them, you could turn your function into a generator and yield each line as you compute it. The caller will be responsible for joining them, or can print them as they are generated. You can also provide a wrapper around the generator that joins the yielded lines with '\n'.join.
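A minimal sketch of that generator-plus-wrapper split, using hypothetical function names (numbered_lines, numbered_text) and a deliberately trivial formatting job rather than the table itself:

```python
def numbered_lines(words):
    """Yield one formatted line per word instead of collecting them in a list."""
    for index, word in enumerate(words, start=1):
        yield '{}. {}'.format(index, word)


def numbered_text(words):
    """Convenience wrapper: join the yielded lines into a single string."""
    return '\n'.join(numbered_lines(words))


# The caller can consume lines one at a time...
for line in numbered_lines(['alpha', 'beta']):
    print(line)

# ...or ask for the whole text at once.
text = numbered_text(['alpha', 'beta'])
```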
List-comprehensions and generator-expressions
So, on to simplifying how your lists are built as a whole. But before we do that, let's analyze how you do things a bit. When computing the width of a column, you account for a varying number of columns (with your if len(row) > i and if len(header) > i), and later on you add the missing items to each row as you iterate over them, thus modifying table in place. That's... not a side effect I would expect from such a function. Let's fix that as well.
widths
First off, never write anything other than a comment on the same line after a colon: it impairs readability.
Second, when dealing with structures like table, holding a list of lines, you can easily get a list of columns using zip(*table). This, however, assumes that table has all rows of the same length. But we need to account for rows of varying lengths, so let's use itertools.zip_longest instead.
Now we can iterate over the columns easily, but we also need to iterate over the headers at the same time. Yet another job for zip... hum, well, zip_longest yet again:
from itertools import zip_longest

columns = zip_longest(*table, fillvalue='')
widths = [
    max(len(title), max(len(word) for word in column))
    for title, column in zip_longest(headers, columns, fillvalue='')
]
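To see why zip_longest is needed here, the sketch below runs both zip and zip_longest on a small ragged table. The sample table and headers are hypothetical.

```python
from itertools import zip_longest

# Hypothetical ragged table: the second row is missing its last cell.
table = [['Jon', 'Doe', '20'], ['Mark', 'Waine']]
headers = ['First Name', 'Last Name']

# zip stops at the shortest row, so the third column is silently dropped.
truncated = list(zip(*table))
# [('Jon', 'Mark'), ('Doe', 'Waine')]

# zip_longest pads short rows with fillvalue and keeps every column.
columns = list(zip_longest(*table, fillvalue=''))
# [('Jon', 'Mark'), ('Doe', 'Waine'), ('20', '')]

widths = [
    max(len(title), max(len(word) for word in column))
    for title, column in zip_longest(headers, columns, fillvalue='')
]
# widths == [10, 9, 2], and table itself has not been modified
```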
Lines of text
You can simplify the building of your "separator lines" (upper border, lower border and line under the headers) by noticing that they follow the same pattern as the lines of text: the inner part is a join of a 3-character string over pieces of text of length widths[i]. The 3 characters are filler, actual separator, filler: '─┬─' for the upper border, '─┼─' below the header, '─┴─' for the lower border and ' │ ' for the lines of text.
Once you see that, you don't need those weird widths[i] + 1 or widths[i] + 2 for the "separator lines". So the inner parts for each line can all be written along the lines of:
'─┼─'.join('─' * width for width in widths)
or
' │ '.join(word.rjust(width) for word, width in zip_longest(row, widths, fillvalue=''))
each of them enclosed in borders if any. Due to the high similarity of such constructs, let's write a function that builds such strings:
def content(iterable, left='│', inner='│', right='│', filler=' ', border=True):
    sep = '{0}{1}{0}'.format(filler, inner)
    line = sep.join(iterable)
    if border:
        line = '{0}{1}{2}{1}{3}'.format(left, filler, line, right)
    return line
and you call it like:
content((word.rjust(width) for word, width in zip_longest(row, widths, fillvalue='')), border=border)
content(('─' * width for width in widths), '└', '┴', '┘', '─', border=border)
You can also wrap the building of the "separator lines" using:
def separator(widths, left, inner, right, border=True):
    return content(('─' * width for width in widths), left, inner, right, '─', border)
so you "just" need to call it with
separator(widths, '└', '┴', '┘', border=border)
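Putting the two helpers together on a tiny example shows how borders and text lines share the same shape. The definitions are repeated so the snippet runs on its own; the widths and cell values are hypothetical.

```python
def content(iterable, left='│', inner='│', right='│', filler=' ', border=True):
    sep = '{0}{1}{0}'.format(filler, inner)
    line = sep.join(iterable)
    if border:
        line = '{0}{1}{2}{1}{3}'.format(left, filler, line, right)
    return line


def separator(widths, left, inner, right, border=True):
    return content(('─' * width for width in widths), left, inner, right, '─', border)


widths = [3, 2]  # hypothetical column widths
top = separator(widths, '┌', '┬', '┐')
row = content(['Jon', '20'])
bottom = separator(widths, '└', '┴', '┘')
bare = content(['Jon', '20'], border=False)  # no outer border: 'Jon │ 20'

print('\n'.join([top, row, bottom]))
# ┌─────┬────┐
# │ Jon │ 20 │
# └─────┴────┘
```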
format
You may have noted that I used str.format instead of string concatenation. First off, it is more efficient. Second, it can also be used to format strings and "align" them. You can use a translation mapping like:
ALIGN = {'left': '<', 'center': '^', 'right': '>'}
and the '{:{}{}}' template to get format to build the aligned string you want:
content(('{:{}{}}'.format(word, ALIGN[align], width) for word, width in zip_longest(row, widths, fillvalue='')), border=border)
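A small sketch of what the nested template expands to, with a hypothetical word and width:

```python
ALIGN = {'left': '<', 'center': '^', 'right': '>'}

# The outer field receives the value; the two nested {} are filled with the
# alignment character and the width, in that order, giving e.g. '{:<7}'.
left = '{:{}{}}'.format('Doe', ALIGN['left'], 7)      # 'Doe    '
center = '{:{}{}}'.format('Doe', ALIGN['center'], 7)  # '  Doe  '
right = '{:{}{}}'.format('Doe', ALIGN['right'], 7)    # '    Doe'
```

One detail to be aware of: when the padding is odd, the format mini-language and str.center may put the extra space on opposite sides (for example 'ab' in a width of 3), so centered output can differ by one position from what the original code produces.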
Parameters
The **kwargs syntax, meant for an unknown number of named arguments, doesn't fit well in this case, where you explicitly expect only 3 arguments in addition to table. In such cases, parameters with default values are a better fit. If you want to force the caller to use named arguments, you can use (only in Python 3) a single * as a "parameter name". Parameters following this * must be passed by name, or the caller will get an error.
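The effect of the bare * can be illustrated with a hypothetical function (describe is made up for this sketch and is not part of the reviewed code):

```python
def describe(table, *, align='left', border=False):
    """Hypothetical function: everything after the bare * is keyword-only."""
    return 'rows={} align={} border={}'.format(len(table), align, border)


# Passing the options by name works as usual.
ok = describe([['a']], align='center', border=True)

# Passing them positionally raises TypeError.
try:
    describe([['a']], 'center', True)
except TypeError as error:
    rejected = str(error)
else:
    rejected = None
```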
Proposed improvements
from itertools import zip_longest


ALIGN = {
    'left': '<',
    'center': '^',
    'right': '>',
}
TEMPLATE = '{:{}{}}'


def ascii_table(table, *, headers=None, align='left', border=False):
    if headers is None:
        headers = []
    align = ALIGN[align]
    return '\n'.join(ascii_table_generator(table, headers, align, border))


def separator(widths, left, inner, right, border=True):
    return content(
        ('─' * width for width in widths),
        left, inner, right, '─', border)


def content(iterable, left='│', inner='│', right='│', filler=' ', border=True):
    sep = '{0}{1}{0}'.format(filler, inner)
    line = sep.join(iterable)
    if border:
        line = '{0}{1}{2}{1}{3}'.format(left, filler, line, right)
    return line


def ascii_table_generator(table, headers, align, border):
    columns = zip_longest(*table, fillvalue='')
    widths = [
        max(len(title), max(len(word) for word in column))
        for title, column in zip_longest(headers, columns, fillvalue='')
    ]

    if border:
        yield separator(widths, '┌', '┬', '┐')

    # header formatting
    if headers:
        yield content(
            (TEMPLATE.format(title, align, width)
             for title, width in zip_longest(headers, widths, fillvalue='')),
            border=border)
        yield separator(widths, '├', '┼', '┤', border=border)

    # table formatting
    for row in table:
        yield content(
            (TEMPLATE.format(value, align, width)
             for value, width in zip_longest(row, widths, fillvalue='')),
            border=border)

    if border:
        yield separator(widths, '└', '┴', '┘')
Comment by Uriel (Oct 20, 2016 at 20:42): Wow, thanks! I'll just mention that in zip_longest it's fillvalue, without the _. Also, you used the variable line many times but it doesn't exist ((line * width for width in widths)). What is that?
Comment by 301_Moved_Permanently (Oct 20, 2016 at 20:48): @UrielEli Fixed. line was at some point a parameter with a default value (line='─') but I removed it. Not everywhere, it seems...
Comment by Uriel (Oct 20, 2016 at 20:59): Awesome. Thanks for all of the comments about generators and formatting. I actually dislike format, but the rest are great. I never thought of generators that way.
Comment by 301_Moved_Permanently (Oct 20, 2016 at 21:28): @UrielEli You can still use ALIGN = {'left': str.ljust, 'center': str.center, 'right': str.rjust} and align(title, width) instead of TEMPLATE.format. It might also feel more natural. But I wouldn't use string concatenation in content... it's... not my taste.
In no particular order:
Your approach to default keyword arguments with **k is unusual. I’d expect to see something more like:
def ascii_table(table, header=None, align='left', border=False):
    if header is None:
        header = []
Note that I’ve dropped the space between the function name and the opening parenthesis (per PEP 8 style), and also used None as the default for the header parameter; this is to avoid quirks around mutable default arguments.
This style is more in keeping with other Python code, and makes it easier for somebody to see what parameters this function takes. It also makes it easier to spot when somebody is calling this with entirely wrong arguments.
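The quirk with mutable defaults can be reproduced in a few lines. The functions add_bad and add_good are hypothetical and only serve to show the behaviour:

```python
def add_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # and shared by every call that omits bucket.
    bucket.append(item)
    return bucket


def add_good(item, bucket=None):
    # None acts as a sentinel: a fresh list is created on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first_bad = list(add_bad('a'))    # ['a']
second_bad = list(add_bad('b'))   # ['a', 'b']: 'a' leaked from the first call
first_good = add_good('a')        # ['a']
second_good = add_good('b')       # ['b']
```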
Newlines are cheap! Don’t put stuff on a single line, especially if ... else blocks. For example, this block:
if border: printable.append('│ ' + ' │ '.join(printrow) + ' │')
else: printable.append(' │ '.join(printrow))
would usually be written more like:
if border:
    printable.append('│ ' + ' │ '.join(printrow) + ' │')
else:
    printable.append(' │ '.join(printrow))
That style is more conventional, and makes it easier to see the structure of the code.
Loop over the elements of a list directly, not its indices. For example, this loop:
for i in range(len(header)):
    assert header[i]
    if align == 'center': printrow.append(header[i].center(widths[i]))
    ....  # do stuff with header[i]
could be better written as something like:
for h, w in zip(header, widths):
    if align == 'center':
        printrow.append(h.center(w))
    ...
Iterating directly over lists is more Pythonic, and generally makes for cleaner code.
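A complete, runnable version of that comparison, with hypothetical header and widths values, might look like this:

```python
header = ['First Name', 'Age']   # hypothetical sample data
widths = [12, 5]

# Index-based version: every access goes through header[i] and widths[i].
by_index = []
for i in range(len(header)):
    by_index.append(header[i].center(widths[i]))

# zip pairs each title with its width, so no index is needed.
by_zip = []
for title, width in zip(header, widths):
    by_zip.append(title.center(width))
```

Note that zip stops at the shorter of its inputs, so this form assumes header is not longer than widths.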
Don’t cram too much into a single expression; it becomes difficult to parse and hurts readability. This is on the edge of what I’d consider acceptable:
for i in range(max(map(len, table))):
This is unreadable:
for i in range(max(map(len, table))): widths.append(max(max(map(len, [row[i] for row in table if len(row) > i])), len(header[i]) if len(header) > i else 0))
You should break it across multiple lines.
You can tidy up some of your logical conditions. For example, this:
if i > 0 and i < max(map(len, table)) - 1:
can be reduced to:
if 0 < i < max(len(t) for t in table) - 1:
which is a bit easier to read.
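A quick check that the chained form selects the same indices as the explicit one, on a hypothetical ragged table. Hoisting the max(...) into a variable is an extra step of mine, not something the original condition does; it avoids recomputing the value on every iteration.

```python
table = [['a', 'b', 'c', 'd'], ['e', 'f']]  # hypothetical ragged table
last = max(len(row) for row in table) - 1   # index of the last column: 3

# 0 < i < last is evaluated as (0 < i) and (i < last), with i read only once.
inner = [i for i in range(last + 1) if 0 < i < last]
explicit = [i for i in range(last + 1) if i > 0 and i < last]
# both are [1, 2]: the columns that are neither first nor last
```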
This could potentially choke on a large table, because you’re building all the rows in a list printable. I would consider rewriting this as a generator: essentially, replace every printable.append(foo) with yield foo.
Then your function just lazily computes the rows, and passes them to the caller. It’s up to them how they want to use it: whether they want to save it, do some more processing, or join it into a string (and then they can choose the newline).
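The laziness can be sketched with a simplified stand-in. The function render_rows is hypothetical and skips width computation entirely; the real table function still needs one pass over the whole table to compute column widths, so only the output side would be lazy there.

```python
from itertools import islice


def render_rows(rows):
    """Hypothetical stand-in for the table function: yields one line per row."""
    for row in rows:
        yield ' │ '.join(row)


# A large table produced on demand; no row is formatted until it is requested.
big_table = ([str(n), str(n * n)] for n in range(10**6))

# The caller takes only what it needs and picks its own line separator.
first_three = list(islice(render_rows(big_table), 3))
text = '\r\n'.join(first_three)
```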
Generators and iteration are a very powerful feature of Python – if you’re not familiar with them already, this PyCon talk is a good introduction.
Comment by Uriel (Oct 22, 2016 at 22:33): Thanks for the syntax tips! Also didn't know that you can make stuff like 1 < i < ... for more than two sides.
[...] **k as a formal parameter instead of the more explicit def ascii_table(table, header=None, align='left', border=False)? Nice question in any case!