Convert integer to English words

Question 1

I was recently asked an interview question with regards to algorithm design. The challenge is as follows:

Given a 6 or less digits positive integer (0 - 999999 inclusive), write a function englishify(number: int) that returns the full English equivalent of that number. Here are some samples of the structure you are expected to generate:

1 - One

222 - Two Hundred And Twenty two

1234 - One Thousand, Two Hundred and Thirty Four

31337 - Thirty One Thousand, Three Hundred And Thirty Seven

100100 - One Hundred Thousand And One Hundred

200111 - Two Hundred Thousand, One Hundred And Eleven

As you may be able to see, there is a key requirement when it comes to formatting this:

There should be a comma, not 'And', after the thousands and before the hundreds, if both exist.

I have attempted this challenge with the code below:

def englishify(number):
 #Numbers 0-19 (unique numbers)
 OneToNine= 'One Two Three Four Five Six Seven Eight Nine'.split()
 TenToNineteen = 'Ten Eleven Twelve Thirteen Fourteen Fifteen Sixteen Seventeen Eighteen Nineteen'.split()
 ZeroToNineteen = [''] + OneToNine + TenToNineteen
 #Numbers >= 20 at intervals of 10
 Tens = 'Twenty Thirty Forty Fifty Sixty Seventy Eighty Ninety'.split()
 #Additional function for ease of processing of numbers
 def englishifyHundreds(number):
 #Special case: Number = 0
 if number == 0:
 return 'Zero'
 #1. Number from 1-19
 if number < 20:
 return ZeroToNineteen[number]
 #2. Number from 20-99
 if number >= 20 and number < 100:
 result = Tens[int(number/10)-2] + ' ' + ZeroToNineteen[int(number%10)]
 return result.rstrip()
 #3. Number from 100-999
 if number >= 100:
 #Separating hundreds digit and tens digit
 tens = number - ((number//100)*100)
 #Accounting for edges = 0 (number = 100, 200, ...)
 if number%100 == 0:
 return ZeroToNineteen[int(number/100)] + ' Hundred'
 else:
 return ZeroToNineteen[int(number/100)] + ' Hundred And ' + englishifyHundreds(tens)
 #Actual processing of number
 if len(str(number)) <= 3:
 return englishifyHundreds(number)
 else:
 #Splitting number into 'thousands' digits and 'hundreds' digits
 thousands = int(str(number)[:-3])
 hundreds = int(str(number)[-3:])
 #Accounting for edges = 0 (thousands = 1000, 2000, ...)
 if thousands % 1000 == 0:
 return englishifyHundreds(thousands) + ' Thousand'
 else:
 #Accounting for if hundreds == 0:
 if hundreds == 0:
 return englishifyHundreds(thousands) + ' Thousand'
 #Accounting for cases where comma is not necessary
 elif hundreds % 100 == 0 or hundreds < 100:
 return englishifyHundreds(thousands) + ' Thousand And ' + englishifyHundreds(hundreds)
 #Remaining cases implementing comma
 else:
 return englishifyHundreds(thousands) + ' Thousand, ' + englishifyHundreds(hundreds)

I am currently trying to rack my brains thinking of ways to optimize this, but with my limited knowledge of recursions and algorithms this is the best I can churn out for now. Hopefully I can seek some opinions from some of the more experienced programmers around here.

Question 2

Did you write englishify(number)?

Question 3

Sorry, I'm not quite sure what you meant. If you were asking whether I wrote this function englishify(number) myself, the answer is yes.

Question 4

RE this: tens = number - ((number//100)*100), you can use %100 to get the remainder, and I'm not sure that tens is the right name for a variable that represents the tens and units. Also you should try to be a bit more consistent between using integer division(number//100 for example) and fractional division cast to integer(int(number/100)). They give the same results so you should try to avoid mixing them to minimise potential confusion.

Question 5

Ahh! Yes! The tens variable. Not quite sure what I was thinking there... must have messed up my operators in the midst of it. Will work on it. Nevertheless, thank you! Your points are deeply appreciated.

Question 6

The basic approach doesn't look bad. You can optimize by working more with arrays that won't require any offset-handling.

For example:

You can combine 'Zero' and the arrays OneToTen and TenToNineteen to one direct-initialized array.
The array Ten could also have two empty values in the first two entries, making 'Twenty' available at index 2.
The Thousands, Millions, etc. could also be stored inside an array.

This would look somewhat like this:

 #Numbers 0-19 (unique numbers)
 ZeroToNineteen = [
 'Zero',
 'One',
 'Two',
 'Three',
 'Four',
 'Five',
 'Six',
 'Seven',
 'Eight',
 'Nine',
 'Ten',
 'Eleven',
 'Twelve',
 'Thirteen',
 'Fourteen',
 'Fifteen',
 'Sixteen',
 'Seventeen',
 'Eighteen',
 'Nineteen']
 #Numbers at intervals of 10
 Tens = [
 '',
 '',
 'Twenty',
 'Thirty',
 'Forty',
 'Fifty',
 'Sixty',
 'Seventy',
 'Eighty',
 'Ninety']
 #Numbers at intervals of 1000
 Thousands = [
 '',
 'Thousand',
 'Million',
 'Billion',
 'Trillion',
 'Quadrillion',
 'Quintillion',
 'Sextillion',
 'Septillion'
 ]

Doing so allows for a slightly shorter and simpler version of your englishifyHundreds function:

 #Additional function for ease of processing of numbers
 def englishifyHundreds(number):
 #1. Number from 0-19
 if number < 20:
 return ZeroToNineteen[int(number)]
 #2. Number from 20-99
 elif number < 100:
 return Tens[int(number/10)] + ' ' + ZeroToNineteen[int(number%10)]
 #3. Number from 100-999
 else:
 #Accounting for edges = 0 (number = 100, 200, ...)
 remainder = int(number) % 100
 if remainder == 0:
 return ZeroToNineteen[int(number/100)] + ' Hundred'
 else:
 return ZeroToNineteen[int(number/100)] + ' Hundred And ' + englishifyHundreds(remainder)

The assembly can than be achieved by first splitting the number into number representing 3 digits each:

 parts=[]
 iterations = int((len(str(number))-1)/3) + 1
 iteration = int(0)
 while iteration < iterations:
 part = int(number % 1000)
 number = int(number / 1000)
 parts.append(part)
 iteration += 1
 parts.reverse()

And then reassembling it according to your rules:

 numberString=''
 for i, part in enumerate(parts):
 if part == 0:
 continue
 thousandsIndex = len(parts) - (i+1)
 separatorString = ('' if i == 0 else ' And ' if (part < 100 or part % 100 == 0) else ', ')
 partString = englishifyHundreds(part)
 thousandString = (' ' + Thousands[thousandsIndex]) if thousandsIndex > 0 else ''
 numberString += separatorString + partString + thousandString
 return numberString

I have tried the code using Repl.it (Link) with the following test cases:

# Test cases
print(englishify(1)) # - One
print(englishify(222)) # - Two Hundred And Twenty two
print(englishify(1234)) # - One Thousand, Two Hundred and Thirty Four
print(englishify(31337)) # - Thirty One Thousand, Three Hundred And Thirty Seven
print(englishify(100100)) # - One Hundred Thousand And One Hundred
print(englishify(200111)) # - Two Hundred Thousand, One Hundred And Eleven
# Custom
print(englishify(10000000025))

Result:

One
Two Hundred And Twenty Two
One Thousand, Two Hundred And Thirty Four
Thirty One Thousand, Three Hundred And Thirty Seven
One Hundred Thousand And One Hundred
Two Hundred Thousand, One Hundred And Eleven
Ten Billion And Twenty Five

Question 7

The test case comment says the same as the block quote: 100100 - One Hundred Thousand And One Hundred, the question claiming There should be a comma, not 'And', after the thousands and before the hundreds, if both exist a key requirement in the very next paragraph.

Question 8

I like the point on creating the empty values in my 'Ten' array, it would help a lot with keeping some of my indexing consistent. Also, even though it is true that one special case was left out (One Hundred Thousand And One Hundred instead of One Hundred Thousand, One Hundred), it seems like your approach could be used for values greater than 999999 as well. Thank you!

Question 9

@greybeard Oh yes, I missed that one. Can be easily fixed by replacing separatorString = ('' if i == 0 else ' And ' if part < 100 else ', ') with separatorString = ('' if i == 0 else ' And ' if (part < 100 or part % 100 == 0) else ', ')

Question 10

Adjusted the answer. There are for sure some cases that don't quite work as expected, but I leave these for you to solve. :-)

Question 11

Your answer is right on. The adjustment is much appreciated!

Question 12

It's often best to write this kind of function along with its tests. Let's start with a simple test:

import doctest;
def englishify(number):
 """Format NUMBER into standard English form.
 NUMBER must be in range [0..999999]
 >>> englishify(0)
 'Zero'
 """
 return 'Zero'
if __name__ == '__main__':
 doctest.testmod()

Now, start adding more tests and make each one pass before moving on to the next.

You'll find that when you reach the bigger numbers, there's a useful recursive property. We can split off the thousands and the hundreds, format each non-zero part separately, and then join using one of the techniques from Joining words together with a comma, and "and". For example, consider these inputs:

123456 ⟶ 123 Thousand , 4 Hundred and 56
123056 ⟶ 123 Thousand and 56
123400 ⟶ 123 Thousand and 4 Hundred

Now those individual numbers can be formatted into words (and the last case will have two "and"s in normal English: "One hundred and twenty-three thousand, four hundred and fifty-six.'

Modified code

import doctest;
def englishify(number):
 """Format NUMBER into standard English form.
 NUMBER must be in range [0..999999]
 >>> englishify(0)
 'Zero'
 >>> englishify(10)
 'Ten'
 >>> englishify(20)
 'Twenty'
 >>> englishify(99)
 'Ninety Nine'
 >>> englishify(100)
 'One Hundred'
 >>> englishify(101)
 'One Hundred and One'
 >>> englishify(1001)
 'One Thousand and One'
 >>> englishify(1201)
 'One Thousand, Two Hundred and One'
 >>> englishify(123201)
 'One Hundred and Twenty Three Thousand, Two Hundred and One'
 """
 Units = [None, 'One', 'Two', 'Three', 'Four', 'Five',
 'Six', 'Seven', 'Eight', 'Nine',
 'Ten', 'Eleven', 'Twelve', 'Thirteen', 'Fourteen', 'Fifteen',
 'Sixteen', 'Seventeen', 'Eighteen', 'Nineteen']
 Tens = [None, None, 'Twenty', 'Thirty', 'Forty', 'Fifty',
 'Sixty', 'Seventy', 'Eighty', 'Ninety']
 if number < 20:
 return Units[number] or 'Zero'
 if number < 100:
 return ' '.join(filter(None, [Tens[number//10], Units[number%10]]))
 # Larger numbers - break down and englishify each part
 parts = list(filter(None,
 map(lambda quantity, number:
 englishify(number) + quantity if number else None,
 [' Thousand', ' Hundred', ''],
 [number // 1000, number // 100 % 10, number % 100])))
 if len(parts) == 1:
 return parts[0]
 return ' and '.join([', '.join(parts[:-1]), parts[-1]])
if __name__ == '__main__':
 doctest.testmod()

The extension to support millions and more should now be obvious - just add the unit and its extraction to the list passed to map().

Question 13

Yes! That was exactly the kind of approach I was going for. Hence the reason why I created the englishifyHundreds function to allow me to pass both thousands and hundreds through that function for separate processing, before adding in the 'And' myself. Nevertheless, thank you for the useful tip regarding tests! Will try to implement them in future when I am trying to design other algorithms for interview/personal purposes.

Question 14

I didn't explain myself very well, so I've added a worked example that simplifies the joining of the thousands, hundreds and units. The key is to filter out the empty components before joining the remaining ones.

Question 15

Wow, that is much neater indeed. Looks like I should brush up on my usage of lists and maps/filters. I also love that your solution is scalable to numbers in the millions/billions. Thank you!

Simon Kraemer Simon Kraemer 3892 silver badges10 bronze badges · Answer 1 · 2020-01-21 17:25:43Z

The basic approach doesn't look bad. You can optimize by working more with arrays that won't require any offset-handling.

For example:

You can combine 'Zero' and the arrays OneToTen and TenToNineteen to one direct-initialized array.
The array Ten could also have two empty values in the first two entries, making 'Twenty' available at index 2.
The Thousands, Millions, etc. could also be stored inside an array.

This would look somewhat like this:

 #Numbers 0-19 (unique numbers)
 ZeroToNineteen = [
 'Zero',
 'One',
 'Two',
 'Three',
 'Four',
 'Five',
 'Six',
 'Seven',
 'Eight',
 'Nine',
 'Ten',
 'Eleven',
 'Twelve',
 'Thirteen',
 'Fourteen',
 'Fifteen',
 'Sixteen',
 'Seventeen',
 'Eighteen',
 'Nineteen']
 #Numbers at intervals of 10
 Tens = [
 '',
 '',
 'Twenty',
 'Thirty',
 'Forty',
 'Fifty',
 'Sixty',
 'Seventy',
 'Eighty',
 'Ninety']
 #Numbers at intervals of 1000
 Thousands = [
 '',
 'Thousand',
 'Million',
 'Billion',
 'Trillion',
 'Quadrillion',
 'Quintillion',
 'Sextillion',
 'Septillion'
 ]

Doing so allows for a slightly shorter and simpler version of your englishifyHundreds function:

 #Additional function for ease of processing of numbers
 def englishifyHundreds(number):
 #1. Number from 0-19
 if number < 20:
 return ZeroToNineteen[int(number)]
 #2. Number from 20-99
 elif number < 100:
 return Tens[int(number/10)] + ' ' + ZeroToNineteen[int(number%10)]
 #3. Number from 100-999
 else:
 #Accounting for edges = 0 (number = 100, 200, ...)
 remainder = int(number) % 100
 if remainder == 0:
 return ZeroToNineteen[int(number/100)] + ' Hundred'
 else:
 return ZeroToNineteen[int(number/100)] + ' Hundred And ' + englishifyHundreds(remainder)

The assembly can than be achieved by first splitting the number into number representing 3 digits each:

 parts=[]
 iterations = int((len(str(number))-1)/3) + 1
 iteration = int(0)
 while iteration < iterations:
 part = int(number % 1000)
 number = int(number / 1000)
 parts.append(part)
 iteration += 1
 parts.reverse()

And then reassembling it according to your rules:

 numberString=''
 for i, part in enumerate(parts):
 if part == 0:
 continue
 thousandsIndex = len(parts) - (i+1)
 separatorString = ('' if i == 0 else ' And ' if (part < 100 or part % 100 == 0) else ', ')
 partString = englishifyHundreds(part)
 thousandString = (' ' + Thousands[thousandsIndex]) if thousandsIndex > 0 else ''
 numberString += separatorString + partString + thousandString
 return numberString

I have tried the code using Repl.it (Link) with the following test cases:

# Test cases
print(englishify(1)) # - One
print(englishify(222)) # - Two Hundred And Twenty two
print(englishify(1234)) # - One Thousand, Two Hundred and Thirty Four
print(englishify(31337)) # - Thirty One Thousand, Three Hundred And Thirty Seven
print(englishify(100100)) # - One Hundred Thousand And One Hundred
print(englishify(200111)) # - Two Hundred Thousand, One Hundred And Eleven
# Custom
print(englishify(10000000025))

Result:

One
Two Hundred And Twenty Two
One Thousand, Two Hundred And Thirty Four
Thirty One Thousand, Three Hundred And Thirty Seven
One Hundred Thousand And One Hundred
Two Hundred Thousand, One Hundred And Eleven
Ten Billion And Twenty Five

The test case comment says the same as the block quote: 100100 - One Hundred Thousand And One Hundred, the question claiming There should be a comma, not 'And', after the thousands and before the hundreds, if both exist a key requirement in the very next paragraph.
I like the point on creating the empty values in my 'Ten' array, it would help a lot with keeping some of my indexing consistent. Also, even though it is true that one special case was left out (One Hundred Thousand And One Hundred instead of One Hundred Thousand, One Hundred), it seems like your approach could be used for values greater than 999999 as well. Thank you!
@greybeard Oh yes, I missed that one. Can be easily fixed by replacing separatorString = ('' if i == 0 else ' And ' if part < 100 else ', ') with separatorString = ('' if i == 0 else ' And ' if (part < 100 or part % 100 == 0) else ', ')
Adjusted the answer. There are for sure some cases that don't quite work as expected, but I leave these for you to solve. :-)
Your answer is right on. The adjustment is much appreciated!

Toby Speight Toby Speight 87.2k14 gold badges104 silver badges322 bronze badges · Answer 2 · 2020-01-21 15:50:52Z

It's often best to write this kind of function along with its tests. Let's start with a simple test:

import doctest;
def englishify(number):
 """Format NUMBER into standard English form.
 NUMBER must be in range [0..999999]
 >>> englishify(0)
 'Zero'
 """
 return 'Zero'
if __name__ == '__main__':
 doctest.testmod()

Now, start adding more tests and make each one pass before moving on to the next.

You'll find that when you reach the bigger numbers, there's a useful recursive property. We can split off the thousands and the hundreds, format each non-zero part separately, and then join using one of the techniques from Joining words together with a comma, and "and". For example, consider these inputs:

123456 ⟶ 123 Thousand , 4 Hundred and 56
123056 ⟶ 123 Thousand and 56
123400 ⟶ 123 Thousand and 4 Hundred

Now those individual numbers can be formatted into words (and the last case will have two "and"s in normal English: "One hundred and twenty-three thousand, four hundred and fifty-six.'

Modified code

import doctest;
def englishify(number):
 """Format NUMBER into standard English form.
 NUMBER must be in range [0..999999]
 >>> englishify(0)
 'Zero'
 >>> englishify(10)
 'Ten'
 >>> englishify(20)
 'Twenty'
 >>> englishify(99)
 'Ninety Nine'
 >>> englishify(100)
 'One Hundred'
 >>> englishify(101)
 'One Hundred and One'
 >>> englishify(1001)
 'One Thousand and One'
 >>> englishify(1201)
 'One Thousand, Two Hundred and One'
 >>> englishify(123201)
 'One Hundred and Twenty Three Thousand, Two Hundred and One'
 """
 Units = [None, 'One', 'Two', 'Three', 'Four', 'Five',
 'Six', 'Seven', 'Eight', 'Nine',
 'Ten', 'Eleven', 'Twelve', 'Thirteen', 'Fourteen', 'Fifteen',
 'Sixteen', 'Seventeen', 'Eighteen', 'Nineteen']
 Tens = [None, None, 'Twenty', 'Thirty', 'Forty', 'Fifty',
 'Sixty', 'Seventy', 'Eighty', 'Ninety']
 if number < 20:
 return Units[number] or 'Zero'
 if number < 100:
 return ' '.join(filter(None, [Tens[number//10], Units[number%10]]))
 # Larger numbers - break down and englishify each part
 parts = list(filter(None,
 map(lambda quantity, number:
 englishify(number) + quantity if number else None,
 [' Thousand', ' Hundred', ''],
 [number // 1000, number // 100 % 10, number % 100])))
 if len(parts) == 1:
 return parts[0]
 return ' and '.join([', '.join(parts[:-1]), parts[-1]])
if __name__ == '__main__':
 doctest.testmod()

The extension to support millions and more should now be obvious - just add the unit and its extraction to the list passed to map().

Yes! That was exactly the kind of approach I was going for. Hence the reason why I created the englishifyHundreds function to allow me to pass both thousands and hundreds through that function for separate processing, before adding in the 'And' myself. Nevertheless, thank you for the useful tip regarding tests! Will try to implement them in future when I am trying to design other algorithms for interview/personal purposes.
I didn't explain myself very well, so I've added a worked example that simplifies the joining of the thousands, hundreds and units. The key is to filter out the empty components before joining the remaining ones.
Wow, that is much neater indeed. Looks like I should brush up on my usage of lists and maps/filters. I also love that your solution is scalable to numbers in the millions/billions. Thank you!

Stack Exchange Network

Convert integer to English words

2 Answers 2

Modified code

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Convert integer to English words

2 Answers 2

Modified code

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions