Error in writing unicode to file python

Question 1

This is my first program in Python, and I need some help writing UTF-8 data to a file.

The intention is to read data from an Excel file and write comma-separated data to a text file. Below is the code I am running, followed by the error it gives.

import xlrd
import csv
import codecs
wb = xlrd.open_workbook('/etl/dev/input/CustList.xls')
sh = wb.sheet_by_index(1)
file_output = codecs.open('/etl/dev/input/CustList.csv', 'w', 'utf-8')
for rownum in xrange(sh.nrows):
 file_output.write(sh.row_values(rownum))
file_output.close()

and here is the error

Traceback (most recent call last):
 File "TestXls2Csv.py", line 20, in <module>
 file_output.write(sh.row_values(rownum))
 File "/fstools/gptools/ext/python/lib/python2.6/codecs.py", line 686, in write
 return self.writer.write(data)
 File "/fstools/gptools/ext/python/lib/python2.6/codecs.py", line 351, in write
 data, consumed = self.encode(object, self.errors)
TypeError: coercing to Unicode: need string or buffer, list found

Any help is highly appreciated.

Thanks Zulfi

Tried the below

 row_values = [str(val) for val in sh.row_values(rownum)]
 file_output.write(",".join(row_values) + "\n")

It seems to work fine for one sheet of the Excel file but gives the error below for the other sheet.

Traceback (most recent call last):
 File "TestXls2Csv.py", line 12, in <module>
 file_output.write(",".join(sh.row_values(rownum)) + "\n")
TypeError: sequence item 8: expected string or Unicode, float found

I had initially tried using csv.writer, but there is a \xa0 character in one of the cells which was causing a lot of trouble, so I switched to codecs and have been battling to get it to work.

Below is info on the excel document if that gives any insight

=== File: CustList.xls ===
Open took 3.03 seconds

BIFF version: 8; datemode: 0
codepage: 1200 (encoding: utf_16_le); countries: (1, 1)
Last saved by: u'Rajesh, Vatha'
Number of data sheets: 2
Use mmap: 1; Formatting: 0; On demand: 0
Ragged rows: 0
Load time: 0.01 seconds (stage 1) 1.86 seconds (stage 2)

sheet 0: name = u'MEMBER'; nrows = 29966; ncols = 11

sheet 1: name = u'PHYSICANS'; nrows = 1619; ncols = 19

command took 0.20 seconds

Please suggest.

Thanks Zulfi

Question 2

I think the problem is that sh.row_values(rownum) is a list (row_values "returns a slice of the values of the cells in the given row") and not a string, so you can't pass it to write(). If you want to write the whole row, you should iterate over each cell.

Question 3

What should the output file look like? Please give an example.

Question 4

The output should contain the data in each Excel row as comma-separated columns in a text file.

Question 5

You mixed up your question update. Did you try my new answer?

Chan 113 bronze badges · Answer 1 · 2014-06-04 08:24:28Z

dciriello was right: file_output.write() takes a string as its argument, but sh.row_values(rownum) returns a list. That is the main cause of the error.

Here is what to do if you want to copy a sheet from xls to csv.

import xlrd
import csv
import codecs

# Open the workbook and pick the second sheet (index 1)
wb = xlrd.open_workbook('/etl/dev/input/CustList.xls')
table = wb.sheet_by_index(1)
nrows = table.nrows

# Let csv.writer build each comma-separated line from the row's cell values
with codecs.open('/etl/dev/input/CustList.csv', 'w', 'utf-8') as file_output:
    spamwriter = csv.writer(file_output)
    for i in range(nrows):
        spamwriter.writerow(table.row_values(i))

Many thanks for your reply. I was trying to avoid csv.writer because I was getting the error below with it:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 10: ordinal not in range(128)
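As far as I know, the csv module in Python 2 does not support unicode input and ends up converting each cell with str(), which is where the 'ascii' codec error comes from. A minimal sketch of one common workaround is to open the output in binary mode and encode every unicode cell to UTF-8 before handing the row to csv.writer. The names write_csv, sample_rows and demo_path are made up for illustration, and the sample row is invented data standing in for table.row_values(i). The Python 3 branch is only there so the same snippet runs on both versions.

```python
import csv
import io
import os
import sys
import tempfile

PY2 = sys.version_info[0] == 2


def write_csv(rows, path):
    """Write rows (lists of text/float cells) to a UTF-8 encoded CSV file."""
    if PY2:
        # Python 2: csv works on byte strings, so open the file in binary
        # mode and encode unicode cells ourselves. The name `unicode` only
        # exists on Python 2; this branch never runs on Python 3.
        with open(path, "wb") as out:
            writer = csv.writer(out)
            for row in rows:
                writer.writerow([
                    cell.encode("utf-8") if isinstance(cell, unicode) else cell
                    for cell in row
                ])
    else:
        # Python 3: csv works on text, and the file object does the encoding.
        with io.open(path, "w", encoding="utf-8", newline="") as out:
            csv.writer(out).writerows(rows)


# Invented sample row: a no-break space, a float, and a cell with a comma
sample_rows = [[u"Dr.\xa0Smith", 42.0, u"Main St, Apt 4"]]
demo_path = os.path.join(tempfile.mkdtemp(), "CustList.csv")
write_csv(sample_rows, demo_path)

# Read the file back as text to inspect what was written
with io.open(demo_path, encoding="utf-8", newline="") as f:
    written = f.read()
```

One reason to stay with csv.writer rather than joining by hand is that it quotes cells that themselves contain commas, as the third cell in the sample row does.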

miindlek 3,553, 16 silver badges, 25 bronze badges · Answer 2 · 2014-06-04 08:32:40Z


If you want the values to be comma-separated in your output file, you can simply change your write command to join the list of values into a comma-separated string.

But first you have to convert every value in the list to a string, because row_values() returns a list of string and floating-point values.

...
row_values = [str(val) for val in sh.row_values(rownum)]
file_output.write(",".join(row_values) + "\n")
...


edited Jun 5, 2014 at 8:17


2 Comments

zulfi123786 · Comment · 2014-06-11 19:38:37Z

Thanks for the reply. I tried the above, and this time I guess the issue pops up while converting the data to a string. This u'\xa0' is bugging me a lot :( I wish there were a simple way to dump whatever character this is from Excel to the text file.

Traceback (most recent call last):
 File "TestXls2Csv.py", line 12, in <module>
 row_values = [str(val) for val in sh.row_values(rownum)]
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 10: ordinal not in range(128)
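For what it's worth, the UnicodeEncodeError in this traceback comes from str() on Python 2, which tries to encode the unicode cell as ASCII. A minimal sketch of one way around it, assuming the rows look like what sh.row_values() returns (unicode text plus floats), is to convert each cell to unicode text instead and let a UTF-8 file object do the encoding. The names cell_to_text, write_rows, sample_rows and demo_path are hypothetical, and the sample row is invented rather than taken from CustList.xls.

```python
import io
import os
import tempfile

try:
    text_type = unicode  # Python 2: the unicode text type
except NameError:
    text_type = str      # Python 3: str is already unicode text


def cell_to_text(value):
    """Return a cell value as unicode text without going through ASCII."""
    if isinstance(value, text_type):
        return value
    return text_type(value)  # floats and other non-text cells


def write_rows(rows, path):
    """Write each row as one comma-joined line to a UTF-8 text file."""
    with io.open(path, "w", encoding="utf-8") as out:
        for row in rows:
            out.write(u",".join(cell_to_text(cell) for cell in row) + u"\n")


# Invented sample data standing in for sh.row_values(rownum)
sample_rows = [[u"Dr.\xa0Smith", 42.0, u"Caf\xe9"]]
demo_path = os.path.join(tempfile.mkdtemp(), "CustList.csv")
write_rows(sample_rows, demo_path)

# Read the file back as text to inspect what was written
with io.open(demo_path, encoding="utf-8") as f:
    written = f.read()
```

Note that a plain join does no quoting, so a cell that itself contains a comma would shift the columns in the output.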

zulfi123786 · Comment · 2014-06-12 12:39:19Z

I changed str(val) to repr(val) without having any idea what the difference is, and it worked, except that the strings were single-quoted and prefixed with 'u'.
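On the difference: repr() returns a debugging representation of the value, that is, the source-code literal with quotes and with characters like \xa0 written as escape sequences. That is why it never raises an encoding error but also why the quotes (and, on Python 2, the u prefix) end up in the file. Encoding the text to UTF-8 gives the actual bytes that belong in the output. A small sketch of the contrast, using an invented cell value:

```python
# Hypothetical cell value containing a no-break space (U+00A0)
cell = u"Dr.\xa0Smith"

# Debugging form: a quoted literal with the no-break space shown as \xa0
as_repr = repr(cell)

# The bytes that should actually land in a UTF-8 encoded file
as_bytes = cell.encode("utf-8")

print(as_repr)
print(as_bytes.decode("utf-8") == cell)
```

The escaped form is useful for spotting invisible characters while debugging, but it is not the data itself.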
