The following is my solution to Java vs C++. I think the way I have used the re library is inefficient, and possibly erroneous, as I am getting TLE (time limit exceeded).
import sys
import re
cpp = re.compile("^[a-z]([a-z_]*[a-z])*$")
java= re.compile("^[a-z][a-zA-Z]*$")
r = re.compile("[A-Z][a-z]*")
lines = [line for line in sys.stdin.read().splitlines() if line!=""]
for line in lines:
    if cpp.match(line) :
        line = line.split("_")
        for i in range(1,len(line)):
            line[i] = line[i].capitalize()
        print "".join(line)
    elif java.match(line):
        namelist = r.findall(line)
        for name in namelist:
            line = line.replace(name , name.replace(name[0],"_"+name[0].lower()))
        print line
    else : print "Error!"
For instance, is there a better way to replace string components in place, instead of creating a new string and copying as I have done here:
line = line.replace(name , name.replace(name[0],"_"+name[0].lower()))
or is there a way that is entirely different from my approach?
2 Answers
When using the re module, make sure to use raw strings (marked with a leading 'r'), as I do in the code below. A raw string tells the interpreter not to process backslash escape sequences, so the pattern reaches the regex engine unchanged.
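A minimal sketch of what the 'r' prefix changes. The `\b` pattern and the sample text are illustrative choices, not taken from the code in this thread; the patterns used here contain no backslashes, so for them the prefix is a matter of habit rather than a behavioural fix.

```python
import re

# In a normal string, "\b" is the single backspace character (ASCII 8).
plain = "\bcat\b"
# In a raw string the backslash is kept, so the regex engine receives \b
# and interprets it as a word boundary.
raw = r"\bcat\b"

print(len(plain), len(raw))  # the raw string is two characters longer

# Only the raw pattern finds the standalone words; the plain pattern
# looks for literal backspace characters, which the text does not contain.
print(re.findall(raw, "cat concat cat"))
print(re.findall(plain, "cat concat cat"))
```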
By also using findall to test for matches, we end up with more succinct code that still does exactly what we want in a clear fashion.
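As a small illustration of that point: findall returns an empty list when nothing matches, so the same call can serve as both the test and the extraction. The pattern and the helper name `underscore_words` are hypothetical, chosen only for this sketch.

```python
import re

# Hypothetical pattern for this sketch: an underscore followed by a lowercase word.
word = re.compile(r"_[a-z]+")


def underscore_words(name):
    """Return the '_word' pieces of name, or an empty list if there are none."""
    return word.findall(name)


for name in ("long_and_mnemonic", "plain"):
    pieces = underscore_words(name)
    if pieces:  # a non-empty list means the pattern matched at least once
        print(name, "->", pieces)
    else:
        print(name, "-> no match")
```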
By putting the logic in a function you get the ability to write unit tests or do something other than print the result.
Using an exception for the failure cases makes the code clear and is more Pythonic in style.
Here is how I would have coded it.
import re
import sys

cpp = re.compile(r"(_[a-z]*)")
java= re.compile(r"([A-Z]+[a-z]*)")

class InvalidStyle(Exception):
    pass

def style_transform(input):
    if input[0].isupper():
        raise InvalidStyle(input)
    m = cpp.findall(input)
    if m:
        # import pdb; pdb.set_trace()
        # a bare "_" means a doubled or trailing underscore
        if any(re.match(r"^[A-Z_]$", w) for w in m):
            raise InvalidStyle(input)
        pos = input.find(m[0])
        # w[1:] drops the leading underscore before capitalizing the word
        return input[:pos] + "".join(w[1:].capitalize() for w in m)
    m = java.findall(input)
    if m:
        # import pdb; pdb.set_trace()
        pos = input.find(m[0])
        words = [input[:pos]] + [w.lower() for w in m]
        return "_".join(words)
    if input.lower() == input:
        return input
    else:
        raise InvalidStyle(input)

if __name__ == "__main__":
    if len(sys.argv) == 2: # allows for debugging via pdb
        fp = open(sys.argv[1])
    else:
        fp = sys.stdin
    for line in fp.readlines():
        line = line.strip()
        if not line:
            continue
        try:
            print style_transform(line)
        except InvalidStyle as e:
            # print e, uncomment to see errors
            print "Error!"
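To make the point about unit tests and exceptions concrete, here is a rough sketch. It is a simplified stand-in, not the code above: the function name `convert`, the test class, and the sample names are assumptions made for illustration, and the two validation patterns are borrowed from elsewhere in this thread. It is written to run on Python 3.

```python
import re
import unittest


class InvalidStyle(Exception):
    """Raised when a name is neither valid C++ style nor valid Java style."""


# Validation patterns assumed for this sketch.
CPP_NAME = re.compile(r"^[a-z]+(_[a-z]+)*$")
JAVA_NAME = re.compile(r"^[a-z][a-zA-Z]*$")


def convert(name):
    """Translate between underscore_style and camelStyle, or raise InvalidStyle."""
    if CPP_NAME.match(name):
        parts = name.split("_")
        return parts[0] + "".join(word.capitalize() for word in parts[1:])
    if JAVA_NAME.match(name):
        # re.sub with a function rewrites every capital letter in one pass.
        return re.sub(r"[A-Z]", lambda m: "_" + m.group(0).lower(), name)
    raise InvalidStyle(name)


class ConvertTest(unittest.TestCase):
    def test_cpp_to_java(self):
        self.assertEqual(convert("long_and_mnemonic_identifier"),
                         "longAndMnemonicIdentifier")

    def test_java_to_cpp(self):
        self.assertEqual(convert("anotherExample"), "another_example")

    def test_mixed_style_is_rejected(self):
        with self.assertRaises(InvalidStyle):
            convert("bad_Style")


# Run the tests explicitly so the script does not exit when they finish.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ConvertTest)
result = unittest.TextTestRunner().run(suite)
```

Because the failure case is an exception rather than a printed string, the caller decides what to do with it: the command-line driver can print "Error!", while a test can simply assert that the exception is raised.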
The online judge probably rejects your solution because this regular expression causes catastrophic backtracking:
^[a-z]([a-z_]*[a-z])*$
Trying to match a string of 24 lowercase letters followed by a non-matching character takes two seconds on my computer. Using this instead takes only 6 microseconds:
^[a-z]+(_[a-z]+)*$
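The difference can be reproduced with a short timing sketch. The input length of 20 letters and the helper name `time_match` are arbitrary choices for this example; absolute timings depend on the machine and the Python version, so no particular numbers are claimed here.

```python
import re
import timeit

slow = re.compile(r"^[a-z]([a-z_]*[a-z])*$")  # pattern from the question
fast = re.compile(r"^[a-z]+(_[a-z]+)*$")      # replacement pattern

# Lowercase letters followed by a character that neither pattern accepts.
# Each extra letter roughly doubles the work for the slow pattern.
text = "a" * 20 + "!"


def time_match(pattern, subject):
    """Return (match result, elapsed seconds) for a single match attempt."""
    start = timeit.default_timer()
    outcome = pattern.match(subject)
    return outcome, timeit.default_timer() - start


for label, pattern in (("slow", slow), ("fast", fast)):
    outcome, seconds = time_match(pattern, text)
    print(label, "matched:", outcome is not None, "seconds:", seconds)
```

Note that the two patterns are not strictly equivalent: the faster one also rejects names with consecutive underscores such as `a__b`, which the original pattern accepts.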
To simplify the generation of the underscore-separated string, make the r regex also recognize the first word, which does not begin with an upper-case letter:
r = re.compile("[A-Z]?[a-z]*")
Then use "_".join to construct the result. I added if s because the regex now also matches an empty string at the end.
print "_".join(s.lower() for s in r.findall(line) if s)
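Wrapped in a function, the same idea might look like the sketch below. The function name `to_underscore` is hypothetical, and the input is assumed to have already passed the Java-style validation.

```python
import re

# Optional capital letter followed by lowercase letters.
word = re.compile(r"[A-Z]?[a-z]*")


def to_underscore(name):
    """Convert an already validated camelCase name to underscore_style."""
    # findall also yields an empty string at the end of the input,
    # so the "if s" filter drops it before joining.
    return "_".join(s.lower() for s in word.findall(name) if s)


print(to_underscore("longAndMnemonicIdentifier"))
```

This builds the result in a single pass over the matches, instead of calling replace once per capitalized word.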