I retrieved a number of text records from my PostgreSQL database and intend to preprocess these text documents before analyzing them.
I want to tokenize the documents, but I ran into a problem while tokenizing:
# ...some other regex replacements above...
# toTokens is the text string being tokenized
toTokens = self.regexClitics1.sub(r" \1", toTokens)
toTokens = self.regexClitics2.sub(r" \1 \2", toTokens)
toTokens = str.strip(toTokens)  # this line raises the error
The error is: TypeError: descriptor 'strip' requires a 'str' object but received a 'unicode'. I'm curious: why does this error occur when the encoding of the database is UTF-8?
asked Jun 23, 2011 at 6:59
goh
1 Answer
Why don't you just use toTokens.strip()? There is no need to call str.strip directly.
There are two string types in Python 2, str and unicode. Look at this for an explanation.
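To see the distinction in runnable form, here is a minimal sketch. It uses Python 3's bytes/str pair as a stand-in for Python 2's str/unicode, since the same rule applies: a method accessed through a type only accepts instances of that exact type, while a method called on the object itself dispatches correctly.

```python
# In Python 2, str.strip is an unbound method of the str type, so it only
# accepts str instances -- passing a unicode object raises the TypeError
# from the question. Python 3's bytes type shows the same rule:
data = b"  hello  "            # stands in for the "other" string type

# Calling the method on the instance dispatches to the right type:
print(data.strip())            # b'hello'

# Calling it through the wrong type's descriptor fails:
try:
    str.strip(data)
except TypeError as err:
    print("TypeError:", err)
```

The fix suggested above, toTokens.strip(), works for both string types because the method is looked up on the object itself rather than on one specific type.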
answered Jun 23, 2011 at 7:13
Samuel
3 Comments
Eric O. Lebigot
+1. A shorter explanation can be found on StackOverflow: stackoverflow.com/questions/4545661/… (shameless plug). :)
goh
Does that mean that the strings I get from my queries are unicode? Why is that so?
Samuel
@amateur It seems so. It's strange, because AFAIK psycopg returns str objects unless instructed to do otherwise, but I can't know without more information about your setup.
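One common way psycopg2 ends up returning unicode on Python 2 is that the unicode type adapters were registered somewhere in the codebase. A minimal configuration sketch, assuming psycopg2 and a placeholder DSN (the connection string and column names here are hypothetical):

```python
import psycopg2
import psycopg2.extensions

# Registering these adapters makes psycopg2 decode text columns to unicode;
# without them, psycopg2 on Python 2 returns str by default.
psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
psycopg2.extensions.register_type(psycopg2.extensions.UNICODEARRAY)

# "dbname=test user=me" is a placeholder DSN, not from the question
conn = psycopg2.connect("dbname=test user=me")
cur = conn.cursor()
cur.execute("SELECT some_text_column FROM documents")
row = cur.fetchone()   # row[0] would now be a unicode object
```

Checking whether such a register_type call exists in the application (or in a framework it uses) would explain why the query results are unicode rather than str.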