Creating "completely different" anagrams in python

Question 1

For a text adventure I'm writing, I need to establish an anagram X of a word Y such that letter Z of X is not letter Z of Y.

In other words, ABCDEE could go to EEDCBA or DCEEBA but not ABCEED, because ABCEED matches ABCDEE at slot 5.

#
# amak.py: this makes an anagram of a word with no identical letter slots.
#
# in other words, HEAT and HATE have the first letter identical, but EATH has no letter slots in common with HEAT.
#
import re
import sys
from collections import defaultdict
#option(s). There may be more later.
shift_1_on_no_repeat = False
try_rotating_first = False
# determine if we can still switch a pair. With 3 letters left, it is not possible. With 2, it should be.
#
def can_take_even(x):
 if x % 2 == 0: return x > 0
 else: return x > 3
# here is the explanation of the algorithm:
#
# 1. unless we have exactly 3 letters to place, we look for the 2 most frequent letters that have not been switched yet nd switch the earliest incidences of each
# 2. if there are 3 unique letters remaining, then we go a->b->c.
# 2a. Note that we can never have 2-1 left, because the previous would have to have 3-?-?. If we started with, say, 2-2-1, we would have 1-1-1 after. Similarly we can never have x-(summing less to x) unless we start with something unviable, because we'd have to have had x+1 and (something less than x+1) on the previous try. If we had x on the previous try, we would have deducted from it.
# note having y>x/2 in x letters means we cannot have a unique anagram. That is because we would have x-y slots to move the y to, but x<2y so that doesn't work.
def find_nomatch_anagram(x):
 x = re.sub("[- '\.]", "", x.lower()) # allow for spaces, apostrophes, etc.
 old_string = list(x)
 new_string = ['-'] * len(x)
 f = defaultdict(list)
 letters_to_place = len(old_string)
 if not len(x):
 print("Blank string...")
 return ""
 for y in range(0, len(x)):
 if old_string[y] not in 'abcdefghijklmnopqrstuvwxyz':
 print("Nonalphabetical character in", x, 'slot', y, "--", old_string[y])
 return ""
 f[x[y]].append(y)
 if shift_1_on_no_repeat and len(f) == len(old_string): return x[1:] + x[0] #abcde quickly sent to bcdea
 if try_rotating_first:
 for y in range(1, len(x)):
 retval = x[-y:] + x[:-y]
 print("Trying", retval)
 bad_matches = False
 for z in range(0, len(x)):
 bad_matches |= (retval[z] == old_string[z])
 if not bad_matches: return retval
 for q in f:
 if len(f[q]) > len(old_string) / 2:
 print(q, "appears too many times in", x, "to create an anagram with no letter slots in common.")
 return ""
 while can_take_even(letters_to_place):
 u = sorted(f, key=lambda x:len(f[x]), reverse=True)
 x1 = f[u[0]].pop(0)
 x2 = f[u[1]].pop(0)
 new_string[x1] = u[1]
 new_string[x2] = u[0]
 letters_to_place -= 2
 if letters_to_place == 3:
 u = sorted(f, key=lambda x:len(f[x]), reverse=True)
 new_string[f[u[0]][0]] = u[1]
 new_string[f[u[1]][0]] = u[2]
 new_string[f[u[2]][0]] = u[0]
 for y in range(0, len(x)):
 if old_string[y] == new_string[y]:
 print("Uh oh, failure at letter", y)
 print(old_string[y])
 print(new_string[y])
 sys.exit()
 if new_string[y] == '-':
 print("Uh oh, blank letter at", y)
 sys.exit()
 return ''.join(new_string)
def show_results(q, result_string = "has this anagram with no letters in common:"):
 temp = find_nomatch_anagram(q)
 if not temp: return
 print(q, result_string, temp)
if len(sys.argv) > 1:
 for q in sys.argv[1:]:
 if q == 's1': shift_1_on_no_repeat = True #this works for one option, but what if there are several?
 elif q == 'tr': try_rotating_first = True #this works for one option, but what if there are several?
 for q in sys.argv[1:]:
 if q != 's1' and q != 'tr': show_results(q, "<=>") # this feels like a real hack, again. I want to process meta commands before any results, though.
else: #these are just general test cases
 show_results("aabbb") #throw error
 show_results("stroll")
 show_results("aaabbbc")
 show_results("aaabbcc")
 show_results("basically")
 show_results("TeTrIs")
 show_results("try this")
 show_results("")

What I have works. But I am wondering about a few things:

is there any way I can write the command line better? I am taking two passes through it right now, but this seems inefficient. I want to be able to give the user the option of trying the obvious anagrams (shift everything 1/2/3/etc. letters over until you find one)
While my algorithm seems to work provably, the code for it seems awkward. I plan (n/2) swaps where I match the 2 top remaining frequencies for unswapped letters, then take them, until I am at 3 or 0. Then I do a 3-way rotation for the final letters.

Question 2

Things that stand out as particularly good:

Test cases, including tests that unsolvable inputs get handled correctly.
Comments, including comments that describe the high level algorithm.
The algorithm description makes it clear how it's avoiding painting itself into a corner.
Testing, as early as possible, for inputs that break your function rather than discovering part way through is fantastic.

Possible room for improvement:

Test cases should specify their expected output. Ideally, you'd have a testing rig that automatically checks they give a satisfying output.
Think about how you handle failures. In python, I'd suggest that exceptions are the way to go.
Where you have multiple approaches to solving the same problem, that is a particularly good sign that it's time to split them out of a mega-function.
Many of your variable names could do with elaboration, especially the single letter ones.
Run the code through a Pep-8 style linter and ensure it's laid out in standard pythonic form. For example single line condition and response is discouraged.
This algorithm produces an output that is determinisitic and depending on its place in your game may be a little bit boring and predictable. Having the rotate mode mixes it up a bit, but that actually makes for an even more obvious pattern. It may be better to see whether you can introduce some random element. The easist option would be to do a random shuffle, and then swap out any letters that violate the rule.

Question 3

Thanks! I do tend to be lazy with variable names, and I am not good about using PEP8 editor. I like the randomization suggestion. I should maybe have mentioned that the first use would be for a test script, to make sure basic things work. But still, it would be nice to have a stress test of possible anagrams. I also agree the function can be broken up. I really appreciate the points you made and the quick response. I take 24 hours before accepting an answer but anything that beats yours will have to be quite good indeed.

Question 4

Oh, one other thing. Maybe have a look at argparse for handling command line input (docs.python.org/3/howto/argparse.html)

Josiah Josiah 6,1062 gold badges18 silver badges39 bronze badges · Accepted Answer · 2019-06-29 12:31:58Z

Things that stand out as particularly good:

Test cases, including tests that unsolvable inputs get handled correctly.
Comments, including comments that describe the high level algorithm.
The algorithm description makes it clear how it's avoiding painting itself into a corner.
Testing, as early as possible, for inputs that break your function rather than discovering part way through is fantastic.

Possible room for improvement:

Test cases should specify their expected output. Ideally, you'd have a testing rig that automatically checks they give a satisfying output.
Think about how you handle failures. In python, I'd suggest that exceptions are the way to go.
Where you have multiple approaches to solving the same problem, that is a particularly good sign that it's time to split them out of a mega-function.
Many of your variable names could do with elaboration, especially the single letter ones.
Run the code through a Pep-8 style linter and ensure it's laid out in standard pythonic form. For example single line condition and response is discouraged.
This algorithm produces an output that is determinisitic and depending on its place in your game may be a little bit boring and predictable. Having the rotate mode mixes it up a bit, but that actually makes for an even more obvious pattern. It may be better to see whether you can introduce some random element. The easist option would be to do a random shuffle, and then swap out any letters that violate the rule.

Thanks! I do tend to be lazy with variable names, and I am not good about using PEP8 editor. I like the randomization suggestion. I should maybe have mentioned that the first use would be for a test script, to make sure basic things work. But still, it would be nice to have a stress test of possible anagrams. I also agree the function can be broken up. I really appreciate the points you made and the quick response. I take 24 hours before accepting an answer but anything that beats yours will have to be quite good indeed.
Oh, one other thing. Maybe have a look at argparse for handling command line input (docs.python.org/3/howto/argparse.html)

Stack Exchange Network

Creating "completely different" anagrams in python

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Creating "completely different" anagrams in python

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions