4
\$\begingroup\$

I wanted to write a function to generate only unpronounceable words. I realised afterwards that the same thing could probably be achieved by just taking out the vowels. Regardless, I feel that there is probably a better way to iterate through the syllable checking.

function unpronouncable(length) {
 var twoLetters = ['aa','ae','ag','ai','ar','ao','as','aw','ay','ba','da','de','ed','ef','eh','ei','em','en','er','et','ex','fa','fe','fu','gi','go','hm','ho','so','id','ie','jo','ka','ki','ku','lo','mi','mm','mu','na','ne','nu','od','oe','oi','om','op','os','os','oy','pe','pi','qi','re','se','ta','ti','ut','wo','xi','xi','ya','yu','za','zo','co','my','qu','ge','ro','ra','xa','wa'];
 var threeLetters = ['thm','css','dvd','buh','bad','rue','cun','pub','puk','pul'];
 var letters = "abcdefghijklmnopqrstuvwxyz";
 var string = "";
 for (var i = 0; i < length; i += 1) {
 string += letters[Math.floor(Math.random() * letters.length)];
 }
 function generate() {
 for (var j = 0; j < string.length; j += 1) {
 var test = string.slice(j,j+2);
 for (var k = 0; k < twoLetters.length; k += 1) {
 if (twoLetters[k] === test) {
 console.log('Has two letter syllable');
 return unpronouncable(length);
 }
 }
 }
 for (var j = 0; j < string.length; j += 1) {
 var test = string.slice(j,j+3);
 for (var k = 0; k < twoLetters.length; k += 1) {
 if (threeLetters[k] === test) {
 console.log('Has three letter syllable');
 return unpronouncable(length);
 }
 } 
 }
 return string;
 }
 return generate();
}
console.log(unpronouncable(8));
200_success
146k22 gold badges190 silver badges478 bronze badges
asked Aug 29, 2014 at 17:13
\$\endgroup\$
1
  • \$\begingroup\$ Why not generate words based on the Sonority hierarchy? Reverse the hierarchy and they'll be unpronounceable in most langauges. \$\endgroup\$ Commented Aug 30, 2014 at 2:53

3 Answers 3

10
\$\begingroup\$

Copy Pasting

If you are copy-pasting code, it is a sign that you should write it differently (your two for loops are quite obviously copy pasted).

Bug

I'm pretty sure that here:

 for (var k = 0; k < twoLetters.length; k += 1) {
 if (threeLetters[k] === test) {

you meant to change it to k < threeLetters.length;.

Solution

Instead of copy-pasting the for loop code, extract it to a function:

function generateHelper(string, array, charCount) {
 for (var j = 0; j < string.length; j += 1) {
 var test = string.slice(j,j+charCount);
 for (var k = 0; k < array.length; k += 1) {
 if (array[k] === test) {
 console.log('Has ' + charCount + ' letter syllable');
 return unpronouncable(length);
 }
 }
 }
}

Coding Style

i++

It is customary to write j++ instead of j += 1 in loops.

loop variable

It is also customary to use i as primary loop variable and j as secondary variable in case the loop is nested inside another loop. As your j-loops in generate are not nested, I would use i for them, and then j instead of k for the nested loop.

Call of return generate();

return should generally be at the end of a function, but in this case, I would pull it up in front of the generate function definition. It is easier to overlook at the bottom.

But the whole generate method seems unnecessary. Couldn't you just remove these lines:

function generate() {
}
return generate();

I don't mean the code inside the function, just the function definition and the function call.

General Approach

You already mentioned that this is probably not the best approach, so I'm not going to say too much about it. Generally, I would probably go with an iterative solution as most of the time they are easier to understand and faster as well.

An example of an iterative approach might look like this:

function unpronouncable(length) {
 // filter these out
 var twoLetters = ['aa', 'ae', 'ag', 'ai', 'ar', 'ao', 'as', 'aw', 'ay', 'ba', 'da', 'de', 'ed', 'ef', 'eh', 'ei', 'em', 'en', 'er', 'et', 'ex', 'fa', 'fe', 'fu', 'gi', 'go', 'hm', 'ho', 'so', 'id', 'ie', 'jo', 'ka', 'ki', 'ku', 'lo', 'mi', 'mm', 'mu', 'na', 'ne', 'nu', 'od', 'oe', 'oi', 'om', 'op', 'os', 'os', 'oy', 'pe', 'pi', 'qi', 're', 'se', 'ta', 'ti', 'ut', 'wo', 'xi', 'xi', 'ya', 'yu', 'za', 'zo', 'co', 'my', 'qu', 'ge', 'ro', 'ra', 'xa', 'wa'];
 var threeLetters = ['thm', 'css', 'dvd', 'buh', 'bad', 'rue', 'cun', 'pub', 'puk', 'pul'];
 // generate a new string until one is found that does not contain the filters
 var generatedString = "";
 var done = false;
 while (!done) {
 generatedString = generate(length);
 done = !contains(generatedString, twoLetters, 2) && !contains(generatedString, threeLetters, 3);
 }
 return generatedString;
 // generates a random string with given length
 function generate(length) {
 var letters = "abcdefghijklmnopqrstuvwxyz";
 var string = "";
 for (var i = 0; i < length; i++) {
 string += letters[Math.floor(Math.random() * letters.length)];
 }
 return string;
 }
 // returns true if string contains any characters in the stringArray.
 function contains(string, stringArray, charCount) {
 for (var k = 0; k < stringArray.length; k++) {
 if (string.indexOf(stringArray[k]) > -1) {
 console.log('Has ' + charCount + ' letter syllable');
 return true;
 }
 }
 return false;
 }
}
console.log(unpronouncable(8));

You could also replace the done = [...] assignment with two if statements, which might be easier to read.

Also: your approach with iterating over the string and using splice is quite complicated. Just use string.indexOf(stringArray[k]) > -1; to check if a string contains another string.

And just out of curiosity: how would you pronounce css or dvd?

answered Aug 29, 2014 at 17:36
\$\endgroup\$
4
  • \$\begingroup\$ I was thinking of those acronyms as things that aid memory, so avoiding them would make the output as random as possible. My idea was to come up with a name that nobody would ever be able to say or remember. Could you comment on how I might use an iterative approach? \$\endgroup\$ Commented Aug 29, 2014 at 17:56
  • \$\begingroup\$ @user759 ok, so it's more unrememberable than unpronouncable. And I updated my answer for an iterative approach. \$\endgroup\$ Commented Aug 29, 2014 at 18:12
  • \$\begingroup\$ Loop variables don't have to be i then j. Use whatever letter feel rights. Use r and c when dealing with rows and columns or x and y if they have to do with coordinates. Try to use the same index variables on the same collection. \$\endgroup\$ Commented Aug 29, 2014 at 21:30
  • \$\begingroup\$ @FlorianF well, obviously they don't have to be. but if there isn't a good reason (like with your two examples), i and j are standard, and using j and k without any reason is odd and can lead to confusion and harder to read code (to some degree; I'm not saying that it is a major mistake, just that it is against common style). \$\endgroup\$ Commented Aug 29, 2014 at 21:41
2
\$\begingroup\$

An efficient approach could be to build a data structure which will let you look at each letter in turn and test if following letters make the word pronounceable.

That could get complex, but fortunately you can get JavaScript to do this (or something like it) for you, using a regular expression. I'd write it like this:

var twoLetters = ['aa','ae','ag','ai','ar','ao','as','aw','ay','ba','da','de','ed','ef','eh','ei','em','en','er','et','ex','fa','fe','fu','gi','go','hm','ho','so','id','ie','jo','ka','ki','ku','lo','mi','mm','mu','na','ne','nu','od','oe','oi','om','op','os','os','oy','pe','pi','qi','re','se','ta','ti','ut','wo','xi','xi','ya','yu','za','zo','co','my','qu','ge','ro','ra','xa','wa'];
var threeLetters = ['thm','css','dvd','buh','bad','rue','cun','pub','puk','pul'];
var pronounceable = new RegExp(twoLetters.concat(threeLetters).join('|'));
function random_word(length) {
 var letters = "abcdefghijklmnopqrstuvwxyz";
 var string = "";
 for (var i = 0; i < length; i++) {
 string += letters[Math.floor(Math.random() * letters.length)];
 }
 return string;
}
function generate_unpronounceable(length) {
 var word;
 while (true) {
 word = random_word(length);
 if (! pronounceable.test(word)) {
 return word;
 }
 }
}
answered Aug 29, 2014 at 18:51
\$\endgroup\$
2
\$\begingroup\$

To start with, unpronouncable should be spelled unpronounceable, or better yet, generate_unpronounceable().

I'm not convinced that this is a valid strategy. If it is a valid approach, then your lists of pronounceable syllables is incomplete. For example, unpronouncable(4) is capable of returning "test", which I consider to be a pronounceable word.

Your two-letter combinations are listed almost alphabetically, which makes the list hard to maintain. For example, "so" is out of place, "os" and "xi" are listed twice, and eight elements are haphazardly added at the end. The three-letter list doesn't appear to be in any recognizable order.

When you reject a candidate word, you use recursion to retry. In JavaScript, you should avoid using recursion when a simple while loop will do — especially when there is no bound to the depth of the recursion. The code should be reorganized along these lines:

function generate_unpronounceable(length) {
 var LETTERS = "abcdefghijklmnopqrstuvwxyz";
 do {
 var word = "";
 for (var i = 0; i < length; i++) {
 word += LETTERS[Math.floor(Math.random() * LETTERS.length)];
 }
 } while (is_pronounceable(word));
 return word;
}
function is_prounounceable(word) {
 ...
}

Decomposing the problem into generate_unpronounceable() and is_pronounceable() is also better because is_pronounceable() can be unit-tested more easily.

answered Aug 29, 2014 at 19:46
\$\endgroup\$
1
  • \$\begingroup\$ I was originally using recursion, but then if there is a match and I break out of the test loop, I'm still stuck in the outer for loop and the next three letter syllable one. Is there a way to avoid that? \$\endgroup\$ Commented Aug 29, 2014 at 20:24

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.