\$\begingroup\$

The following is my solution to Java vs C++. I think the way I have used the re library is inefficient, and possibly erroneous, as I am getting TLE (time limit exceeded).

import sys
import re
cpp = re.compile("^[a-z]([a-z_]*[a-z])*$")
java= re.compile("^[a-z][a-zA-Z]*$")
r = re.compile("[A-Z][a-z]*")
lines = [line for line in sys.stdin.read().splitlines() if line!=""]
for line in lines:
    if cpp.match(line) :
        line = line.split("_")
        for i in range(1,len(line)):
            line[i] = line[i].capitalize()
        print "".join(line)
    elif java.match(line):
        namelist = r.findall(line)
        for name in namelist:
            line = line.replace(name , name.replace(name[0],"_"+name[0].lower()))
        print line
    else : print "Error!"

For instance, is there a better way to replace string components in place, instead of creating a new string and copying as I have done here:

line = line.replace(name , name.replace(name[0],"_"+name[0].lower()))

Or is there an entirely different approach I should take?

Jamal
asked May 25, 2014 at 7:50
\$\endgroup\$

2 Answers

\$\begingroup\$

When using the re module, write patterns as raw strings (string literals with a leading r), as I do in the code below. This tells the interpreter not to process backslash escape sequences in the literal, so they reach the regex engine unchanged.
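A small illustration of the difference, as a sketch in Python 3 (the sample strings are made up for this example): in a normal literal, \b is a single backspace character, while in a raw literal it stays a backslash followed by b, which the regex engine reads as a word boundary.

```python
import re

# In a normal literal, "\b" is the backspace character (one character).
normal = "\bcat\b"
# In a raw literal, r"\b" is a backslash followed by "b" (two characters),
# which the regex engine interprets as a word-boundary assertion.
raw = r"\bcat\b"

print(len(normal), len(raw))             # 5 7
print(re.findall(normal, "cat concat"))  # []
print(re.findall(raw, "cat concat"))     # ['cat']
```

The patterns in this problem contain no backslashes, so raw strings change nothing here; the habit matters once a pattern uses escapes such as \b or \d.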

By also using findall to test for matches, we end up with more succinct code that still does exactly what we want in a clear fashion.

By putting the logic in a function you get the ability to write unit tests or do something other than print the result.

Using an exception for the failure cases makes the code clear and is more Pythonic in style.

Here is how I would have coded it.

import re
import sys
cpp = re.compile(r"(_[a-z]*)")
java= re.compile(r"([A-Z]+[a-z]*)")
class InvalidStyle(Exception):
    pass
def style_transform(input):
    if input[0].isupper():
        raise InvalidStyle(input)
    m = cpp.findall(input)
    if m:
        # import pdb; pdb.set_trace()
        if any(re.match(r"^[A-Z_]$", w) for w in m):
            raise InvalidStyle(input)
        pos = input.find(m[0])
        # Drop each leading underscore before capitalizing;
        # "_and".capitalize() would leave the word unchanged.
        return input[:pos] + "".join(w[1:].capitalize() for w in m)
    m = java.findall(input)
    if m:
        # import pdb; pdb.set_trace()
        pos = input.find(m[0])
        words = [input[:pos]] + [w.lower() for w in m]
        return "_".join(words)
    if input.lower() == input:
        return input
    else:
        raise InvalidStyle(input)
if __name__ == "__main__":
    if len(sys.argv) == 2: # allows for debugging via pdb
        fp = open(sys.argv[1])
    else:
        fp = sys.stdin
    for line in fp.readlines():
        line = line.strip()
        if not line:
            continue
        try:
            print style_transform(line)
        except InvalidStyle as e:
            # print e, uncomment to see errors
            print "Error!"
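As a sketch of the unit testing that a function raising an exception makes possible, here is a Python 3 example using the standard unittest module. The convert function is a simplified, hypothetical stand-in written for this example, not the code above, and the identifiers in the tests are made up.

```python
import re
import unittest


class InvalidStyle(Exception):
    """Raised when an identifier is neither C++ style nor Java style."""


# Hypothetical stand-in patterns for the function under test.
CPP = re.compile(r"[a-z]+(?:_[a-z]+)*")
JAVA = re.compile(r"[a-z][a-zA-Z]*")


def convert(name):
    """Convert between snake_case and camelCase, or raise InvalidStyle."""
    if CPP.fullmatch(name):
        # Replace every "_x" with "X".
        return re.sub(r"_([a-z])", lambda m: m.group(1).upper(), name)
    if JAVA.fullmatch(name):
        # Replace every "X" with "_x".
        return re.sub(r"[A-Z]", lambda m: "_" + m.group(0).lower(), name)
    raise InvalidStyle(name)


class ConvertTest(unittest.TestCase):
    def test_cpp_to_java(self):
        self.assertEqual(convert("long_and_mnemonic"), "longAndMnemonic")

    def test_java_to_cpp(self):
        self.assertEqual(convert("anotherExample"), "another_example")

    def test_single_word_is_unchanged(self):
        self.assertEqual(convert("name"), "name")

    def test_invalid_raises(self):
        for bad in ("Bad_Style", "_leading", "trailing_", "a__b"):
            with self.assertRaises(InvalidStyle):
                convert(bad)
```

Saved to a file, the tests can be run with python -m unittest followed by the file name. Because failures are signalled by an exception instead of a printed string, the tests can assert on them directly.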
answered May 28, 2014 at 18:51
\$\endgroup\$
\$\begingroup\$
  • The online judge probably rejects your solution because this regular expression causes catastrophic backtracking: ^[a-z]([a-z_]*[a-z])*$. Trying to match a string of 24 lowercase letters followed by a non-matching character takes two seconds on my computer. Using this instead takes only 6 microseconds:

    ^[a-z]+(_[a-z]+)*$
    
  • To simplify generating the underscore-separated string, make the r regex also recognize the first word, which does not begin with an upper-case letter:

    r = re.compile("[A-Z]?[a-z]*")
    

    Then use "_".join to construct the result. I added if s because the regex now also matches an empty string at the end.

    print "_".join(s.lower() for s in r.findall(line) if s)
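Putting both suggestions together, a minimal Python 3 sketch might look like the following. The function names and sample identifiers are illustrative assumptions, and the timing loop is only meant to show how the running time of the original pattern grows with input length; the absolute numbers depend on the machine.

```python
import re
import time

SLOW_CPP = re.compile(r"^[a-z]([a-z_]*[a-z])*$")  # pattern from the question
FAST_CPP = re.compile(r"^[a-z]+(_[a-z]+)*$")      # suggested replacement
JAVA = re.compile(r"^[a-z][a-zA-Z]*$")
WORD = re.compile(r"[A-Z]?[a-z]*")                # also matches the first word


def java_to_cpp(name):
    # findall also yields an empty match at the end, hence "if s".
    return "_".join(s.lower() for s in WORD.findall(name) if s)


def cpp_to_java(name):
    first, *rest = name.split("_")
    return first + "".join(word.capitalize() for word in rest)


def convert(name):
    if FAST_CPP.match(name):
        return cpp_to_java(name)
    if JAVA.match(name):
        return java_to_cpp(name)
    return "Error!"


def time_match(pattern, text):
    """Return the seconds spent on one match attempt."""
    start = time.perf_counter()
    pattern.match(text)
    return time.perf_counter() - start


if __name__ == "__main__":
    print(convert("long_and_mnemonic"))  # longAndMnemonic
    print(convert("anotherExample"))     # another_example
    print(convert("bad_Style"))          # Error!
    # Lowercase letters followed by a non-matching character.
    for n in (12, 14, 16, 18):
        text = "a" * n + "!"
        print(n, time_match(SLOW_CPP, text), time_match(FAST_CPP, text))
```

In the original pattern, both [a-z_]* and the group around it can absorb the same letters, so a failing match has to try every way of splitting the string between them. The replacement leaves only one way to split an identifier into words, so a failing match gives up after a single pass.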
    
answered May 29, 2014 at 17:49
\$\endgroup\$
