edited tags

edited Aug 25, 2009 at 12:05

jelovirt


improved formatting

edited Aug 25, 2009 at 11:49

codeape


Hi, I have a problem in Python. I'll try to explain it with an example.

I have this string:

>>> string = 'ÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿÀÁÂÃ'
>>> print string
ÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿÀÁÂÃ

and I want, for example, to replace every character other than Ñ, Ã, ï with "" (that is, remove it).

I have tried:

>>> rePat = re.compile('[^ÑÃï]',re.UNICODE)
>>> print rePat.sub("",string)
�Ñ�����������������������������ï�������������������Ã

The output is full of � characters. I think this happens because each of these characters is represented in Python by two positions in the string (two bytes): for example, \xc3\x91 = Ñ. Because of this, when the regular expression is applied, none of the \xc3 bytes are substituted. How can I do this kind of substitution?

Thanks, Franco
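
The two-byte explanation above can be checked with a small sketch. It is written for Python 3, not the Python 2 session shown in the question, and the names `sample`, `raw`, `byte_pat` and `text_pat` are illustrative assumptions. It assumes the string is UTF-8 encoded, as the \xc3\x91 = Ñ example suggests, and uses a shortened stand-in for the original string.

```python
import re

# Shortened stand-in for the string in the question (assumption).
sample = "ÐÑÒïðÀÃ"

# UTF-8 bytes: roughly what a Python 2 byte-string literal would hold.
raw = sample.encode("utf-8")

# Byte-level pattern: the class contains the single bytes
# \xc3, \x91, \x83 and \xaf, not the three characters.
byte_pat = re.compile("[^ÑÃï]".encode("utf-8"))
broken = byte_pat.sub(b"", raw)

# Text-level pattern: decode first, so each character is one unit.
text_pat = re.compile("[^ÑÃï]")
fixed = text_pat.sub("", raw.decode("utf-8"))

print(broken)  # stray \xc3 lead bytes remain around the kept characters
print(fixed)   # only the three wanted characters remain
```

Every character in this range starts with the byte \xc3 in UTF-8, and \xc3 is itself in the byte-level class, so it is never removed. That is consistent with the stray � in the output above. Matching on decoded text avoids the problem because the class then holds whole characters.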

asked Aug 25, 2009 at 11:40

Franco

Python - problems with regular expressions and Unicode

python regex
