Question on List processing

Steven D'Aprano steve at pearwood.info
Tue Apr 26 12:29:33 EDT 2016


On Wed, 27 Apr 2016 01:38 am, subhabangalore at gmail.com wrote:
> I am trying to send you a revised example.
> list1=[u"('koteeswaram/BHPERSN engaged/NA ','class1')",
> u"('koteeswaram/BHPERSN is/NA ','class1')"]

Please don't use generic, meaningless names like "list1". We can see it
is a list, but what is it for? Use a name that describes the purpose
of the list. Even "input" and "output" are better names.

> [('koteeswaram/BHPERSN engaged/NA ','class1'),
> ('koteeswaram/BHPERSN is/NA ','class1')]

What is this? The output? Don't make us guess what things are.

My *guess* is that you have a list of Unicode strings that look like this:

    u"('aaa/TAG bbb/TAG ','class1')"

and you want to do six things:

- normalise the string;
- convert the Unicode string to ASCII, ignoring anything that isn't ASCII;
- delete the parentheses in the string;
- delete the leading and trailing single quotes;
- split the string on the comma;
- combine them into a tuple.

So let's make some functions:

# Untested
import unicodedata

def remove_parentheses(string):
    if string.startswith("(") and string.endswith(")"):
        string = string[1:-1]
    return string

def remove_single_quotes(string):
    if string.startswith("'") and string.endswith("'"):
        string = string[1:-1]
    return string

def convert(string):
    if not isinstance(string, unicode):
        raise TypeError("expected unicode, but got %s"
                        % type(string).__name__)
    string = unicodedata.normalize('NFKD', string)
    string = string.encode('ascii', 'ignore')
    string = remove_parentheses(string)
    # Assumes exactly one comma; anything else raises ValueError here.
    first_part, second_part = string.split(",")
    first_part = remove_single_quotes(first_part)
    second_part = remove_single_quotes(second_part)
    return (first_part, second_part)

input = [ ... ] # your input strings
output = []
for string in input:
    output.append(convert(string))

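
For anyone on Python 3, where the `unicode` type no longer exists, the same six steps might look roughly like the sketch below. Treat it as an illustration rather than a drop-in replacement: the names `tagged_strings` and `pairs` are made up for the example, and decoding back to `str` after the ASCII encode is an assumption about what you want (without it you would get `bytes`).

```python
import unicodedata


def remove_parentheses(string):
    # Strip one pair of enclosing parentheses, if present.
    if string.startswith("(") and string.endswith(")"):
        string = string[1:-1]
    return string


def remove_single_quotes(string):
    # Strip one pair of enclosing single quotes, if present.
    if string.startswith("'") and string.endswith("'"):
        string = string[1:-1]
    return string


def convert(string):
    if not isinstance(string, str):
        raise TypeError("expected str, but got %s" % type(string).__name__)
    string = unicodedata.normalize('NFKD', string)
    # Drop anything that is not ASCII, then decode back to str
    # (assumption: the caller wants text, not bytes).
    string = string.encode('ascii', 'ignore').decode('ascii')
    string = remove_parentheses(string)
    # Assumes exactly one comma; anything else raises ValueError.
    first_part, second_part = string.split(",")
    return (remove_single_quotes(first_part),
            remove_single_quotes(second_part))


# Hypothetical input, taken from the two sample strings quoted above.
tagged_strings = [
    "('koteeswaram/BHPERSN engaged/NA ','class1')",
    "('koteeswaram/BHPERSN is/NA ','class1')",
]
pairs = [convert(s) for s in tagged_strings]
print(pairs)
```

Note that a tagged string containing an extra comma would make the `split` fail with a ValueError, so that case would need its own handling.
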
-- 
Steven

