2

I have written this code to convert a string in a format such as "0(532) 222 22 22" to an integer such as 05322222222.

class Phone():
    def __init__(self, input):
        self.phone = input

    def __str__(self):
        return self.phone

    # convert to integer.
    def to_int(self):
        return int((self.phone).replace(" ","").replace("(","").replace(")",""))

test = Phone("0(532) 222 22 22")
print test.to_int()

It feels very clumsy to chain three replace() calls to solve this. Is there a better solution?

asked Mar 23, 2010 at 12:53
  • Why? A phone number is not an integer - it is a string of digits and sometimes other characters. Converting it to an int serves no useful purpose, and will lose information such as leading zeros. Don't do it. Commented Mar 25, 2010 at 21:01
  • Cutting off leading zeros doesn't matter if the phone number length always stays the same. Commented Jul 26, 2012 at 15:56
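
To make the trade-off in these two comments concrete, here is a small sketch in Python 3 syntax. It assumes every number has a fixed length of 11 digits, which is taken from the example in the question and is not a general rule; `PHONE_LENGTH` is a name introduced only for illustration.

```python
# Hypothetical fixed length, taken from the question's example number.
PHONE_LENGTH = 11

digits = "05322222222"

# Converting to int drops the leading zero.
as_int = int(digits)
print(as_int)  # 5322222222

# With a known fixed length, the zero can be restored by padding.
restored = str(as_int).zfill(PHONE_LENGTH)
print(restored)  # 05322222222
```

The round trip only works while the length really is constant; numbers of varying length cannot be recovered this way.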

4 Answers

9
p = "0(532) 222 22 22"
print ''.join([x for x in p if x.isdigit()])

Note that you'll "lose" the leading zero if you convert it to int (as you suggested in the title). If you want to do that, just wrap the above in an int() call. A telephone number does make more sense as a string though (in my opinion).
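
For readers on Python 3, the same idea might look like the sketch below. The function name `digits_only` is made up for this example; the snippet above is the answer's actual code.

```python
def digits_only(text):
    """Return only the digit characters of text, in their original order."""
    # A generator expression is enough here; no intermediate list is built.
    return "".join(ch for ch in text if ch.isdigit())

p = "0(532) 222 22 22"
print(digits_only(p))       # 05322222222
print(int(digits_only(p)))  # 5322222222 (the leading zero is gone)
```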

answered Mar 23, 2010 at 13:00

4 Comments

Feel free to ditch the square brackets.
The square brackets are only optional in newer versions of Python.
@gabe If by "newer" you mean "2.4 or higher", which is 5 years old by now?
I work on plenty of systems that are stuck on 2.3, and I doubt I'm the only one.
6

In Python 2.6 or 2.7,
(self.phone).translate(None,' ()')
will remove any spaces or ( or ) from the phone string. See Python 2.6 doc on str.translate for details.

In Python 3.x, str.translate() takes a mapping (rather than two strings as shown above). The corresponding snippet therefore is something like the following, using str.maketrans() to produce the mapping.
(self.phone).translate(str.maketrans('', '', '()-/ '))
See Python 3.1 doc on str.translate for details.
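
As a self-contained illustration of the Python 3 form, here is a minimal sketch. `STRIP_TABLE` and `strip_punctuation` are names chosen for this example only, and the second sample number is invented to exercise the `/` and `-` characters.

```python
# Build the deletion table once; the third argument of str.maketrans
# lists the characters that translate() should drop.
STRIP_TABLE = str.maketrans("", "", "()-/ ")

def strip_punctuation(phone):
    """Remove parentheses, dashes, slashes and spaces from phone."""
    return phone.translate(STRIP_TABLE)

print(strip_punctuation("0(532) 222 22 22"))  # 05322222222
print(strip_punctuation("0532/222-22-22"))    # 05322222222
```

Building the table once and reusing it avoids recreating the mapping on every call.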

mjv
answered Mar 23, 2010 at 13:01

2 Comments

great as well, sucks that debian uses 2.5 for stable :(
@Hellnar: I'm often limited to 2.4 on CentOS 5. I actually kind of like ChristopheD's answer better than mine anyway.
1

How about just using regular expressions?

Example:

>>> import re
>>> num = '0(532) 222 22 22'
>>> re.sub(r'[\D]', '', num) # Match all non-digits ([\D]) in the `num` variable and replace them with the empty string.
'05322222222'

The suggestion made by ChristopheD will work just fine, but is not as efficient.

The following is a test program to demonstrate this using the dis module (see Doug Hellmann's PyMOTW article on the module for more detailed info).

import re
from dis import dis

TEST_PHONE_NUM = '0(532) 222 22 22'

def replace_method():
    print (TEST_PHONE_NUM).replace(" ","").replace("(","").replace(")","")

def list_comp_is_digit_method():
    print ''.join([x for x in TEST_PHONE_NUM if x.isdigit()])

def translate_method():
    print (TEST_PHONE_NUM).translate(None,' ()')

def regex_method():
    print re.sub(r'[\D]', '', TEST_PHONE_NUM)

if __name__ == '__main__':
    # Disassemble each candidate to compare the bytecode it compiles to.
    print 'replace_method:'
    dis(replace_method)
    print
    print
    print 'list_comp_is_digit_method:'
    dis(list_comp_is_digit_method)
    print
    print
    print 'translate_method:'
    dis(translate_method)
    print
    print
    print "regex_method:"
    dis(regex_method)
    print

Output:

replace_method:
  5      0 LOAD_GLOBAL     0 (TEST_PHONE_NUM)
         3 LOAD_ATTR       1 (replace)
         6 LOAD_CONST      1 (' ')
         9 LOAD_CONST      2 ('')
        12 CALL_FUNCTION   2
        15 LOAD_ATTR       1 (replace)
        18 LOAD_CONST      3 ('(')
        21 LOAD_CONST      2 ('')
        24 CALL_FUNCTION   2
        27 LOAD_ATTR       1 (replace)
        30 LOAD_CONST      4 (')')
        33 LOAD_CONST      2 ('')
        36 CALL_FUNCTION   2
        39 PRINT_ITEM
        40 PRINT_NEWLINE
        41 LOAD_CONST      0 (None)
        44 RETURN_VALUE

phone_digit_strip_list_comp:
  3      0 LOAD_CONST      1 ('0(532) 222 22 22')
         3 STORE_FAST      0 (phone)
  4      6 LOAD_CONST      2 ('')
         9 LOAD_ATTR       0 (join)
        12 BUILD_LIST      0
        15 DUP_TOP
        16 STORE_FAST      1 (_[1])
        19 LOAD_GLOBAL     1 (test_phone_num)
        22 GET_ITER
        23 FOR_ITER       30 (to 56)
        26 STORE_FAST      2 (x)
        29 LOAD_FAST       2 (x)
        32 LOAD_ATTR       2 (isdigit)
        35 CALL_FUNCTION   0
        38 JUMP_IF_FALSE  11 (to 52)
        41 POP_TOP
        42 LOAD_FAST       1 (_[1])
        45 LOAD_FAST       2 (x)
        48 LIST_APPEND
        49 JUMP_ABSOLUTE  23
        52 POP_TOP
        53 JUMP_ABSOLUTE  23
        56 DELETE_FAST     1 (_[1])
        59 CALL_FUNCTION   1
        62 PRINT_ITEM
        63 PRINT_NEWLINE
        64 LOAD_CONST      0 (None)
        67 RETURN_VALUE

translate_method:
 11      0 LOAD_GLOBAL     0 (TEST_PHONE_NUM)
         3 LOAD_ATTR       1 (translate)
         6 LOAD_CONST      0 (None)
         9 LOAD_CONST      1 (' ()')
        12 CALL_FUNCTION   2
        15 PRINT_ITEM
        16 PRINT_NEWLINE
        17 LOAD_CONST      0 (None)
        20 RETURN_VALUE

phone_digit_strip_regex:
  8      0 LOAD_CONST      1 ('0(532) 222 22 22')
         3 STORE_FAST      0 (phone)
  9      6 LOAD_GLOBAL     0 (re)
         9 LOAD_ATTR       1 (sub)
        12 LOAD_CONST      2 ('[\\D]')
        15 LOAD_CONST      3 ('')
        18 LOAD_GLOBAL     2 (test_phone_num)
        21 CALL_FUNCTION   3
        24 PRINT_ITEM
        25 PRINT_NEWLINE
        26 LOAD_CONST      0 (None)
        29 RETURN_VALUE

Going by bytecode instruction count, the translate method will be the most efficient, though it relies on Python 2.6+. The regex is slightly less efficient, but more compatible (which I see is a requirement for you). The original replace method adds 4 instructions per replacement (LOAD_ATTR, two LOAD_CONST and CALL_FUNCTION in the output above), while all of the others stay constant.
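
Instruction counts say nothing about how long each call takes, so a timing run is the more direct check. The sketch below is one possible way to do that with the timeit module, written for Python 3; the function names mirror the test program, `STRIP_TABLE`, `NON_DIGITS` and `METHODS` are names introduced here, and the functions return their result instead of printing it so that output does not dominate the measurement. No timings are quoted because they depend on the interpreter and machine.

```python
import re
import timeit

TEST_PHONE_NUM = "0(532) 222 22 22"
STRIP_TABLE = str.maketrans("", "", " ()")  # Python 3 form of translate(None, ' ()')
NON_DIGITS = re.compile(r"\D+")             # precompiled; + matches runs of non-digits

def replace_method():
    return TEST_PHONE_NUM.replace(" ", "").replace("(", "").replace(")", "")

def isdigit_method():
    return "".join(ch for ch in TEST_PHONE_NUM if ch.isdigit())

def translate_method():
    return TEST_PHONE_NUM.translate(STRIP_TABLE)

def regex_method():
    return NON_DIGITS.sub("", TEST_PHONE_NUM)

METHODS = [replace_method, isdigit_method, translate_method, regex_method]

if __name__ == "__main__":
    for method in METHODS:
        # timeit.timeit accepts a zero-argument callable as the statement.
        seconds = timeit.timeit(method, number=100000)
        print("%-18s %.4f s" % (method.__name__, seconds))
```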

On a side note, store your phone numbers as strings to deal with leading zeros, and use a phone formatter where needed. Trust me, it's bitten me before.
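
A formatter for display could be as small as the sketch below (Python 3). `format_phone` and the 11-digit layout are assumptions based on the example number in the question, not a general phone-number format.

```python
def format_phone(digits):
    """Format an 11-digit string as 0(532) 222 22 22 (hypothetical layout)."""
    if len(digits) != 11 or not digits.isdigit():
        raise ValueError("expected exactly 11 digits: %r" % (digits,))
    return "%s(%s) %s %s %s" % (
        digits[0], digits[1:4], digits[4:7], digits[7:9], digits[9:11])

print(format_phone("05322222222"))  # 0(532) 222 22 22
```

Storing the bare digit string and formatting only at the edges keeps the leading zero intact and leaves one place to change the display layout.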

answered Mar 24, 2010 at 2:38

2 Comments

You don't need a character class there; also, it would be slightly more efficient to use the + quantifier. The question I have is: "Since when does the output of dis.dis demonstrate efficiency?"
Yes, you're correct. dis.dis will just give insight into what the compiler is doing, not actual speed of execution. I'll update the tests to use the timeit module. Nonetheless, would you agree that re is the best way to do this in a cross-version compatible manner?
-1

SilentGhost: dis.dis does demonstrate underlying conceptual / executional complexity. after all, the OP complained about the original replacement chain being too ‘clumsy’, not too ‘slow’.

i recommend against using regular expressions where not inevitable; they just add conceptual overhead and a speed penalty otherwise. to use translate() here is IMHO just the wrong tool, and nowhere as conceptually simple and generic as the original replacement chain.

so you say tamaytoes, and i say tomahtoes: the original solution is quite good in terms of clarity and genericity. it is not clumsy at all. in order to make it a little denser and more parametrized, consider changing it to

phone_nr_translations = [
    ( ' ', '', ),
    ( '(', '', ),
    ( ')', '', ), ]

def sanitize_phone_nr( phone_nr ):
    R = phone_nr
    for probe, replacement in phone_nr_translations:
        R = R.replace( probe, replacement )
    return R

in this special application, of course, what you really want to do is just cancelling out any unwanted characters, so you can simplify this:

probes = ' ()'

def sanitize_phone_nr( phone_nr ):
    R = phone_nr
    for probe in probes:
        R = R.replace( probe, '' )
    return R

coming to think of it, it is not quite clear to me why you want to turn a phone nr into an integer—that is simply the wrong data type. this can be demonstrated by the fact that at least in mobile nets, + and # and maybe more are valid characters in a dial string (dial, string—see?).

but apart from that, sanitizing a user input phone nr to get out a normalized and safe representation is a very, very valid concern—only i feel that your methodology is too specific. why not re-write the sanitizing method to something very generic without becoming more complex? after all, how can you be sure your users never input other deviant characters in that web form field?

so what you want is really not to dis-allow specific characters (there are about a hundred thousand defined codepoints in unicode 5.1, so how do you catch up with those?), but to allow those very characters that are deemed legal in dial strings. and you can do that with a regular expression...

from re import compile as _new_regex
illegal_phone_nr_chrs_re = _new_regex( r"[^0-9#+]" )

def sanitize_phone_nr( phone_nr ):
    return illegal_phone_nr_chrs_re.sub( '', phone_nr )

...or with a set:

legal_phone_nr_chrs = set( '0123456789#+' )

def sanitize_phone_nr( phone_nr ):
    return ''.join(
        chr for chr in phone_nr
        if chr in legal_phone_nr_chrs )

that last stanza could well be written on a single line. the disadvantage of this solution would be that you iterate over the input characters from within Python, not making use of the potentially speedier C traversal as offered by str.replace() or even a regular expression. however, performance would in any case be dependent on the expected usage pattern (i am sure you truncate your phone nrs first thing, right? so those would be many small strings to be processed, not few big ones).

notice a few points here: i strive for clarity, which is why i try to avoid over-using abbreviations. chr for character, nr for number and R for the return value (more likely to be, ugh, retval where used in the standard library) are in my style book. programming is about getting things understood and done, not about programmers writing code that approaches the spatial efficiency of gzip. now look, the last solution does fairly much what the OP managed to get done (and more), in...

legal_phone_nr_chrs = set( '0123456789#+' )
def sanitize_phone_nr( phone_nr ): return ''.join( chr for chr in phone_nr if chr in legal_phone_nr_chrs )

...two lines of code if need be, whereas the OP’s code...

class Phone():
    def __init__ ( self, input ): self.phone = self._sanitize( input )
    def __str__ ( self ): return self.phone
    def _sanitize ( self, input ): return input.replace( ' ', '' ).replace( '(', '' ).replace( ')', '' )

...can hardly be compressed below four lines. see what additional baggage that strictly-OOP solution gives you? i believe it can be left out of the picture most of the time.
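
For comparison, the two whitelist variants from this answer can be run side by side. The sketch below is a Python 3 rendering with names invented for the example (`LEGAL_CHARS`, `ILLEGAL_CHARS_RE`, `sanitize_with_set`, `sanitize_with_regex`) and a made-up sample number; it assumes that digits, `#` and `+` are the only legal dial-string characters, as the answer does.

```python
import re

# Characters allowed in a dial string, as assumed in the answer above.
LEGAL_CHARS = frozenset("0123456789#+")
ILLEGAL_CHARS_RE = re.compile(r"[^0-9#+]")

def sanitize_with_set(phone_nr):
    """Keep only whitelisted characters, iterating in Python."""
    return "".join(ch for ch in phone_nr if ch in LEGAL_CHARS)

def sanitize_with_regex(phone_nr):
    """Delete every character that is not whitelisted, using a regex."""
    return ILLEGAL_CHARS_RE.sub("", phone_nr)

sample = "+90 (532) 222-22 22"
print(sanitize_with_set(sample))    # +905322222222
print(sanitize_with_regex(sample))  # +905322222222
```

Both functions give the same result for any input, since each keeps exactly the same character set.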

answered Mar 25, 2010 at 19:30
