How to convert a string containing Unicode characters to UTF-8 in Python?

Asked 6 years, 6 months ago

Viewed 2k times

I have a string that contains Unicode characters and I want to convert it to UTF-8 in Python.

s = '\u0628\u06cc\u0633\u06a9\u0648\u06cc\u062a'

I want to convert s to UTF-8.


asked Jul 2, 2019 at 12:12


Javad Karimi


Possible duplicate of How to convert a string to utf-8 in Python

– GadaaDhaariGeek

Commented Jul 2, 2019 at 12:26


2 Answers 2


Add u as a prefix to the string s (this is only needed in Python 2), then encode it to UTF-8 with .encode('utf-8').

Your code will look like this:

s = u'\u0628\u06cc\u0633\u06a9\u0648\u06cc\u062a'
s_encoded = s.encode('utf-8')
print(s_encoded)
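As a rough illustration of what the encode step produces, here is a minimal sketch assuming Python 3. The name `s_roundtrip` is only illustrative and is not part of the answer above.

```python
# The escape sequences describe 7 code points; s is already Unicode text (str).
s = '\u0628\u06cc\u0633\u06a9\u0648\u06cc\u062a'

# .encode('utf-8') returns a bytes object. Each of these code points
# takes two bytes in UTF-8, so 7 characters become 14 bytes.
s_encoded = s.encode('utf-8')
print(type(s_encoded).__name__, len(s), len(s_encoded))  # bytes 7 14

# Decoding with the same codec gives back the original text.
s_roundtrip = s_encoded.decode('utf-8')
print(s_roundtrip == s)  # True
```

Printing `s_encoded` directly shows the `b'\xd8\xa8...'` representation of the bytes, not the readable text; print `s` itself to see the characters.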

I hope this helps.


answered Jul 2, 2019 at 12:25


GadaaDhaariGeek


1 Comment


lenz Over a year ago

If the OP is using Python 3 (it seems so), then the u prefix isn't necessary. But the .encode('utf8') is definitely right.

2019-07-02T17:59:56.273Z
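The point in the comment above can be checked with a short sketch, assuming Python 3. The variable names `plain` and `prefixed` are made up for this example.

```python
# In Python 3 every string literal is Unicode text, with or without the u prefix.
plain = '\u0628\u06cc\u0633\u06a9\u0648\u06cc\u062a'
prefixed = u'\u0628\u06cc\u0633\u06a9\u0648\u06cc\u062a'

print(plain == prefixed)                     # True
print(type(plain) is type(prefixed) is str)  # True

# 'utf8' and 'utf-8' are aliases for the same codec.
print(plain.encode('utf8') == plain.encode('utf-8'))  # True
```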

Add the line below at the top of your .py file.

# -*- coding: utf-8 -*-

It declares the encoding of the source file, so you can write strings directly in your Python script, like this:

# -*- coding: utf-8 -*-
s = '\u0628\u06cc\u0633\u06a9\u0648\u06cc\u062a'
print(s)

Output:

بیسکویت


answered Jul 2, 2019 at 12:19


Usman


2 Comments


lenz Over a year ago

The source encoding declaration doesn't really apply here, because the string is entered with ASCII-only characters. It would be different if the string literal was actually composed of Arabic letters (not escape sequences).

2019-07-02T17:55:14.14Z


Mark Tolonen Over a year ago

A coding line declares the encoding of the source file only. If you have only ASCII characters in the source (as above) it does nothing. In fact, in Python 3, UTF-8 is the default source encoding if undeclared.

2019-07-03T06:06:16.587Z
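One way to see why the coding line makes no difference here is the sketch below, assuming Python 3. It treats the literal's source text as data; the names `source_text` and `parsed` are hypothetical and chosen only for this example.

```python
import ast

# The source text of the literal, exactly as typed in the .py file.
# The raw string keeps the backslashes, so this is ASCII-only text.
source_text = r"'\u0628\u06cc\u0633\u06a9\u0648\u06cc\u062a'"

# Saved as ASCII, UTF-8, or Latin-1, the file bytes are identical,
# so a source encoding declaration cannot change how they are read.
as_ascii = source_text.encode('ascii')
as_utf8 = source_text.encode('utf-8')
as_latin1 = source_text.encode('latin-1')
print(as_ascii == as_utf8 == as_latin1)  # True

# The escape sequences are only turned into code points when the
# literal is parsed, which gives the 7-character string.
parsed = ast.literal_eval(source_text)
print(len(parsed))  # 7
```

The declaration would matter only if the literal contained the Arabic-script characters themselves and the file were saved in an encoding other than UTF-8.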
