I wrote a program that manages projects and creates a set of SQL files for each project. In the logic, I have a few functions that, each time I look at them, make me wonder if they could be written in a better way. They are pretty straightforward and compact; however, I feel like there are better, more efficient ways of computing their results.
I would appreciate any comments on coding style or improvements (either in readability or performance) that you may have.
Oxford Comma
This function takes a list as input then passes an Oxford comma-separated string off to another function:
# >>> obj._format_missing_message(['a','b','c'])
# 'a, b, and c'
def _format_missing_message(self, missing):
    length = len(missing)
    # Show the message if needed
    if length > 0:
        and_clause = '{} and '.format('' if length <= 2 else ',')
        message = and_clause.join([', '.join(missing[:-1]), missing[-1]])
        self._tab_pane.currentWidget().update_status(
            "Configs not found for: {}.".format(message))
SQL 'LIKE' String Creator
This next function takes a list of string inputs, then builds the SQL LIKE pattern that matches every string in the list:
import difflib
# >>> create_sql_regex(['123','124','125'])
# '12%'
def create_sql_regex(ids):
    ''' Creates a SQL regex from a list of strings for use in LIKE statement '''
    longest_match = ''
    # Raise an error if there is only one item in the list as there needs to be at least two
    # items to create the regex.
    length = len(ids)
    if length <= 1:
        raise NotEnoughComparisonData(
            'Not enough data to compare. Passed list length: {}'.format(length))
    # Create the SequenceMatcher and loop through each element in the list, comparing it and
    # the previous item.
    matcher = difflib.SequenceMatcher()
    for item, next_item in zip(ids[:-1], ids[1:]):
        matcher.set_seqs(item, next_item)
        long_match = matcher.find_longest_match(0, len(item), 0, len(next_item))
        # If the match is shorter than the previous longest match or if we have not found
        # a match yet, store the match
        if long_match.size < len(longest_match) or longest_match == '':
            longest_match = item[long_match.a:long_match.a + long_match.size]
    # If no match was found, raise an error
    if longest_match == '':
        raise NoMatchFound('A match could not be found in this list: {0}'.format(ids))
    return '{}%'.format(longest_match)
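To illustrate the `difflib` call the function relies on, here is a minimal standalone sketch that compares just two of the example IDs. The variable names `first`, `second` and `common` are only for illustration and are not part of the program.

```python
import difflib

# Two of the example IDs from the doctest-style comment above.
first, second = '123', '124'

matcher = difflib.SequenceMatcher()
matcher.set_seqs(first, second)

# find_longest_match returns a Match(a, b, size) named tuple:
# a and b are the start offsets in each string, size is the run length.
match = matcher.find_longest_match(0, len(first), 0, len(second))

# Slice the shared run out of the first string.
common = first[match.a:match.a + match.size]

print(match.size, common)  # prints: 2 12
```

The block that is found does not have to start at index 0; `match.a` and `match.b` report where it begins in each string.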
Count Duplicates
This function is the simplest of the three. It takes a list as input and returns a hash that holds the count of each unique item found in the list:
# >>> count_duplicates(['a','a','b'])
# {'a':2, 'b':1}
def count_duplicates(values):
    counts = {}
    for val in values:
        if val in counts:
            counts[val] += 1
        else:
            counts[val] = 1
    return counts
1 Answer
Oxford comma
Are these really your expected outcomes?
# for ['a']
Configs not found for: and a.
# for ['a', 'b']
Configs not found for: a and b.
# for ['a', 'b', 'c']
Configs not found for: a, b, and c.
In particular, the case of a single item looks like a bug, and in the case of 3 or more items the 2 spaces between "a," and "b" look a bit strange.
In any case, it would make sense to split the method, for example:
def _format_missing_message(self, missing):
    if missing:
        # _oxford_comma is a static method on the same class, so call it via self
        message = self._oxford_comma(missing)
        self._tab_pane.currentWidget().update_status(
            "Configs not found for: {}.".format(message))

@staticmethod
def _oxford_comma(items):
    length = len(items)
    and_clause = '{} and '.format('' if length <= 2 else ',')
    return and_clause.join([', '.join(items[:-1]), items[-1]])
This way it's easier to unit test the Oxford comma logic.
Finally, I think the method would be slightly easier to read like this:
def _oxford_comma(items):
    length = len(items)
    if length == 1:
        return items[0]
    if length == 2:
        return '{} and {}'.format(*items)
    return '{}, and {}'.format(', '.join(items[:-1]), items[-1])
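As a rough sketch of how this version could be checked, here is the same logic copied to a module-level function (named `oxford_comma` here purely for illustration) with one assertion per branch. The empty list is deliberately left out: the caller already guards against it, and this function would raise an `IndexError` on it.

```python
def oxford_comma(items):
    """Join items into an English list, using an Oxford comma for 3+ items."""
    length = len(items)
    if length == 1:
        return items[0]
    if length == 2:
        return '{} and {}'.format(*items)
    return '{}, and {}'.format(', '.join(items[:-1]), items[-1])


# One check per branch: single item, pair, three or more.
assert oxford_comma(['a']) == 'a'
assert oxford_comma(['a', 'b']) == 'a and b'
assert oxford_comma(['a', 'b', 'c']) == 'a, b, and c'
```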
Count Duplicates
My only objection is the method name, count_duplicates. You are counting elements, not necessarily duplicates, and returning a map of item => count pairs. So for example get_counts would be more natural.
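If a shorter body is also of interest, one standard-library option is `collections.Counter`, which builds the same item => count mapping. This is only a sketch under the suggested name `get_counts`, not a required change:

```python
from collections import Counter


def get_counts(values):
    """Return a dict-like mapping of item => number of occurrences."""
    # Counter is a dict subclass, so it compares equal to a plain dict
    # holding the same counts.
    return Counter(values)


counts = get_counts(['a', 'a', 'b'])
assert counts == {'a': 2, 'b': 1}
```

One difference from the hand-written loop: looking up a key that was never seen returns 0 on a `Counter` instead of raising a `KeyError`.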
That's all I can pick on ;-)
- Yeah the 1 item oxford comma is a bug. As for the double-space, it was a mistype :P Thanks for the input! – BeetDemGuise, May 20, 2014 at 19:56
- See here for a more concise way to count the letter in a string. – BeetDemGuise, May 23, 2014 at 15:13
- @DarinDouglass nice to know, thanks for the reminder! (I keep forgetting that handy tool.) – janos, May 25, 2014 at 20:49