Python regular expression with UTF-8 issue

I have a file that contains many lines of plain UTF-8 text, such as the one below. The text is Chinese.

PROCESS:类型:关爱积分[NOTIFY] 交易号:2012022900000109 订单号:W12022910079166 交易金额:0.01元 交易状态:true 2012年2月29日 10:13:08

The file itself is saved in UTF-8 format, and its name is xx.txt.

Here is my Python code. The environment is Python 2.7.

#coding: utf-8
import re
pattern = re.compile(r'交易金额:(\d+)元')
for line in open('xx.txt'):
    match = pattern.match(line.decode('utf-8'))
    if match:
        print match.group()

The problem is that I get no results.

I want to extract the decimal string from 交易金额:0.01元, which in this case is 0.01.

Why doesn't this code work? Can anyone explain it to me? I have no idea what is wrong.

Still not working. Can you provide your code to accomplish this little task? Much appreciated. Commented May 11, 2012 at 6:36
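For what it's worth, here is a minimal sketch of one way the task could be done. It rests on three guesses about why the posted code finds nothing: `match()` only tries the very start of the line (while `search()` scans the whole line), the pattern is a byte string while the decoded line is unicode, and `\d+` cannot match the dot in 0.01. The helper name `extract_amounts` is made up for illustration, and the pattern assumes the colon is the same character as in the sample line.

```python
# -*- coding: utf-8 -*-
import io
import re

# Unicode pattern (u'' works on Python 2.7 and 3.3+).
# \d+(?:\.\d+)? accepts decimals such as "0.01" as well as whole numbers.
AMOUNT_RE = re.compile(u'交易金额:(\\d+(?:\\.\\d+)?)元')


def extract_amounts(path):
    """Return every amount string found in the UTF-8 text file at `path`.

    Hypothetical helper for illustration, not part of any library.
    """
    amounts = []
    # io.open decodes each line to unicode, so no manual .decode() is needed.
    with io.open(path, encoding='utf-8') as handle:
        for line in handle:
            # search() scans the whole line; match() only tries position 0.
            found = AMOUNT_RE.search(line)
            if found:
                # group(1) is just the number, without the surrounding text.
                amounts.append(found.group(1))
    return amounts
```

Calling `extract_amounts('xx.txt')` on a file holding only the sample line above should give a list with the single string 0.01. Note that `group(1)` is used rather than `group()`, since the latter would return the whole matched text including 交易金额: and 元.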
