Checking if a given string is in alphabetical order

Question 1

I wrote this function to check if a given string is in alphabetic order. How can I improve?

def alphacheck(x):
 '''This function checks if a given string is in Alphabetical order'''
 # Creating a variable to allow me to efficiently index a string of any length
 i = len(x) - 1
 # Using a recursive formula and checking for the base case when the string is of less than one character
 if len(x) <= 1:
 return True
 # This the recursive portion of the formula
 else:
 # This code checks if the last two characters are in alphabetical order
 # By utilizing the property that the letters of the alphabet increase in "value" accroding to python
 if x[i - 1] <= x[i]:
 # I remove the last letter of the string to reduce the probelm size
 x = x[:i]
 # I then repeat this on the smaller string until it reaches the base case or
 # Finds out the string is not in alphabetical order and returns False
 return alphacheck(x)
 else:
 return False

Question 2

I am not 100% sure of this, but your code will probably return unexpected results if you use it with accented characters. accènt will probably return false because è is later in the extended ASCII table than n, but humans will view it as alphabetical because they view è the same as e.

There are libraries that can help you with this. https://github.com/kmike/text-unidecode has a more permissive license. https://pypi.python.org/pypi/Unidecode gives slightly better results and performance, but has a GPL license which is harder to use.

Question 3

Collation depends on locale ( language + location) and even then conventions can vary greatly by context and be undefined. There are few true standards that are universally used and mixed language text will have issues.

Question 4

Sorting is idempotent, sorting an already sorted list leaves it unchanged.

So checking if a list is ordered equals checking if it equals itself when sorted:

def is_sorted_alphabetically(string):
 return string == ''.join(sorted(string))

If you do not want to have a \$O(NlogN)\$ runtime of sorting or you want to implement this yourself without using the sorted built-in, I suggest you avoid recursion and use the all and pairwise helper functions:

def is_sorted(iterable):
 return all(x1 <= x2 for x1, x2 in pairwise(iterable))

This is general and works for any iterable such as string but also lists of numbers or lists of anything of comparable (the type is Iterable[Comparable] -> bool)

Question 5

This would work assuming the input is only lowercase or only uppercase.

alphacheck('abc')

would return True, while

alphacheck('aBc')

would return False (because B < a < c)

You can solve this by converting the input string to lower or upper, like this:

x = x.lower()

It's good that you're trying to make the code readable, but IMO you're using too much comments and not expressive enough naming. It's often too easy to write comments that are just plain wrong, e.g. like here

# Using a recursive formula and checking for the base case when the string is of less than one character
if len(x) <= 1:

you're checking if x's length is less than or equal to 1, but in the comment you're saying that the check is for length < 1. It's a pretty unimportant mistake here, but it's just one of the many ways that excessive commenting can be bad. I would like less comments, and more expressive names -- something like this:

def is_alphabetical_ordered(input_string):
 '''This function checks if a given string is in Alphabetical order'''
 input_string = input_string.lower()
 last_index = len(input_string) - 1
 if len(input_string) <= 1:
 return True
 elif input_string[last_index - 1] <= input_string[last_index]:
 input_string = input_string[:last_index]
 return is_alphabetical_ordered(input_string)
 else:
 return False

More on readability

For a really concise and insightful explanation of the purpose of comments, read rolfl's answer to this codereview question.

I also highly recommend reading Robert Martin's Clean Code, as it goes more deeply into readability (among other things) — beyond comments and naming.

Question 6

+1 for mentioning the too many comments. Using this kind of comments, while well-intentioned, is a cop-out from writing readable code instead.

Question 7

In general, with Python, after first understanding the problem, it is often useful to attempt a pythonic implementation. There is a certain pattern to pythonic solutions, and while I don't know of any strict definition, a pythonic solution will often be concise and make full use of Python's built-in facilities. Also, it will often perform better and be more legible than a naïve implementation.

def is_alphabetical(string):
 '''Checks if a given string is in alphabetical order
 is_alphabetical() is case-insensitive, and will treat all strings
 as lowercase.
 Args:
 string: a string of characters.
 Returns:
 True if the string is alphabetical.
 '''
 string = string.lower()
 return not any((m >= n for m, n in zip(string, string[1:])))

Question 8

The function name

Your function is called a boolean function (a function that returns whether True or False). Hence alphacheck isn't a good name, I suggest re-naming it as is_abecedarian, which sounds like a yes/no question (which suits a boolean function).

Variables names

Your variables names aren't very descriptive such as x and i. It's better to re-name x to string, and i to index.

Documentation

You can re-write your function docstring a better way as:

"""Returns whether a string appears in an alphabetical order. """

it's conventional to use triple double quotes (""") instead of triple single quotes (''') as observed by me (not sure if it's the best way).

Here's the code after all of this:

def is_abecedarian(string):
 """Returns whether a string appears in an alphabetical order. """
 index = len(string) - 1
 string = string.lower()
 if len(string) <= 1:
 return True
 else: 
 if string[index-1] <= string[index]:
 return is_abecedarian(string[:index])
 return False

Notes:

I've made string lowercase according to the first answer (a good one), I didn't catch this, and eliminated comments according to it.
It's more concise and readable to remove the nested else in the second branch, because if string[index - 1] <= string[index] isn't True then the interpreter will execute (automatically) the next statement(s), and that we need, and here there's no use for else.
return is_abecedarian(string[:index]) is better, as you don't need to reassign string the new value or define a new variable, just pass the new value (string[:index]) to the function without using a variable, as variables are evaluated first to find values assigned to them, so it's the same to pass the value directly.

Edit: None isn't a boolean value, as I've mentioned. It belongs to NoneType class, not to bool class. Sorry for this misconception.

Question 9

This fails with the input is_abecedarian('a' * 10000), also this has \$O(n^2)\$ memory usage. Two things that are very simple to fix.

Question 10

Thank you! According to what I know, it fails because it exceeds Python maximum recursion depth, I just focused on making the code more concise and readable, but leaving the code as it is. I've wrote it, before without using recursion. I'm hobbyist and beginner, and I haven't come across big O notation yet, so I don't understand your second note . Feel free to edit the answer!

Question 11

The error is due to the recursion, since you state you intentionally keep it don't worry, :) Also it's \$O(n^2)\$ as if you see the line return is_abecedarian(string[:index]) it'll make a new list n-1 the size of the last. So if you have a string 5 long it'll allocate 5 + 4 + 3 + 2 + 1 memory addresses for this algorithm, and sum(range(x+1)) is x(x+1)/2 which is roughly x^2 so \$O(n^2)\$. So just ignore my comment, :)

Nzall Nzall 1,5389 silver badges23 bronze badges · Accepted Answer · 2016-09-19 09:30:29Z

I am not 100% sure of this, but your code will probably return unexpected results if you use it with accented characters. accènt will probably return false because è is later in the extended ASCII table than n, but humans will view it as alphabetical because they view è the same as e.

There are libraries that can help you with this. https://github.com/kmike/text-unidecode has a more permissive license. https://pypi.python.org/pypi/Unidecode gives slightly better results and performance, but has a GPL license which is harder to use.

Collation depends on locale ( language + location) and even then conventions can vary greatly by context and be undefined. There are few true standards that are universally used and mixed language text will have issues.

Stack Exchange Network

Checking if a given string is in alphabetical order

5 Answers 5

More on readability

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Checking if a given string is in alphabetical order

5 Answers 5

More on readability

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions