5
\$\begingroup\$

I am having a problem with my program: it does as required but largest_blanagram() is too slow. I'm not sure whether the best way to speed it up is (1) try to speed up blanagram(), which is being called a lot (2) figure out a way to call blanagram() less. I'm not sure how to go about either of these things particularly.

Things I've been thinking: is there a way to use another dictionary more efficiently, such as using map()?

# CW Q2 blanagram.py
import string
from Q1_anagram import readfile, make_anagram_dict # change
def blanagram(word, anagramdict): # split up into smaller functions?
 """
 """
 word = sorted(word) # sort alphabetically
 alphabet = list(string.ascii_lowercase)
 index = 0
 blanagrams = []
 for char in word:
 original_char = char # save original char, so can revert back to it later
 for letter in alphabet:
 word[index] = letter
 word_copy = word # as order messes up when order alphabetically, so save current state and revert back to it after the iteration
 word = ''.join(sorted(word)) # list -> string to be able to check dict and sort alphabetically
 if word in anagramdict.keys() and anagramdict[word] not in blanagrams:
 for anagramdict[word] in anagramdict[word]:
 blanagrams.append(anagramdict[word])# change name?
 word = word_copy
 # next character in word 
 word = list(word) # string -> list in order to index 
 word[index] = original_char # set back to original char, as only 1 char can be different from original word
 index += 1 # to change following char 
 return blanagrams # get rid of anagrams?
def largest_blanagram(words, word_length): # greatest number of blanagram variants
 """
 """
 blanagrams = {}
 length = 0
 for word in words:
 if len(word) == word_length:
 variants = blanagram(word, make_anagram_dict(words))
 #print(variants)
 if len(variants) > length:
 blanagrams[word] = variants
 length = len(variants)
 return blanagrams
if __name__ == "__main__":
 # print(blanagram('ghost', make_anagram_dict(readfile('words.txt'))))
 import sys
 print(largest_blanagram(readfile('words.txt'), 13)) #sys.argv[1]

Imports -

def readfile(filename):
"""
"""
file = open(filename,'r').readlines() # look into WITH open file
words = []
for line in file:
 line = line.lower().split() # orang and Orang? unique? or is asteer another answer
 for word in line:
 words.append(word)
return words
def make_anagram_dict(words):
"""
"""
assert len(words) != 0
anagram_dict = {}
for word in words:
 alpha = ''.join(sorted(word)) # Order word alphabetically
 # Key is alphabetically ordered word, value is list of anagrams of the key
 anagram_dict.setdefault(alpha, []).append(word) # So not appending NoneType (as list.append returns None)
return anagram_dict

And here is a pastebin equivalent (pastebin has file size limit) link to the words.txt file that I use.

Perry
3653 silver badges13 bronze badges
asked Dec 24, 2016 at 18:21
\$\endgroup\$
6
  • 2
    \$\begingroup\$ Can you give inputs / outputs to play with? \$\endgroup\$ Commented Dec 24, 2016 at 18:31
  • 3
    \$\begingroup\$ Welcome to CodeReview.SE! Your question title (and your comment) should tell us more about the problem you are trying to solve. Also, it seems like we are missing some code from a different file... \$\endgroup\$ Commented Dec 24, 2016 at 18:32
  • \$\begingroup\$ You have lots of loops on the data. So some options to speed it up:1) use pandas and use column-based formulas (Will use fast C code of pandas and numpy) 2) using Cython can increase the loop speed about 100 times. \$\endgroup\$ Commented Dec 24, 2016 at 18:55
  • 1
    \$\begingroup\$ The indentation in the second piece of code seems wrong. Please have a look at at. \$\endgroup\$ Commented Dec 24, 2016 at 20:47
  • 1
    \$\begingroup\$ So, what exactly is a blanagram? \$\endgroup\$ Commented Dec 24, 2016 at 20:52

1 Answer 1

8
\$\begingroup\$

Do not compute things more often than you need to

The point of make_anagram_dict is to have a cache to perform some anagram check very efficiently. However, you rebuild it for each and every words which misses the point of having a cache.

By writing:

ana_dict = make_anagram_dict(words)
length = 0
for word in words:
 if len(word) == word_length:
 variants = blanagram(word, ana_dict)

things should be much faster already.

Strange loop

I must confess that for anagramdict[word] in anagramdict[word] really surprised me. What are you expecting from this loop?

answered Dec 24, 2016 at 21:02
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.