I am connecting to a MS SQL Server database through SQLAlchemy, using the pyodbc module. Everything appeared to be working fine until I started having problems with encodings: some of the non-ASCII characters are being replaced with '?'.

The DB has the collation 'Latin1_General_CI_AS' (I've also checked the specific fields, and they keep the same collation). I started passing the encoding 'latin1' in the call to create_engine, and that appears to work for Western European characters (like the French or Spanish é), but not for Eastern European characters. Specifically, I have a problem with the character ć.
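As a quick sanity check, independent of the database: ć (U+0107) simply has no code point in latin1 or cp1252, so any layer that encodes with a replacement fallback will turn it into '?', while é survives. A minimal sketch:

```python
# ć (U+0107) is unmappable in latin1 and cp1252, so a driver that
# substitutes unmappable characters produces '?':
print('ć'.encode('latin1', errors='replace'))   # b'?'
print('ć'.encode('cp1252', errors='replace'))   # b'?'

# é (U+00E9) exists in both encodings, so it survives:
print('é'.encode('latin1'))                     # b'\xe9'
```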

I have been trying other encodings listed in the Python documentation, specifically the Microsoft ones like cp1250 and cp1252, but I keep facing the same problem.

Does anyone know how to resolve these differences? Does the collation 'Latin1_General_CI_AS' have an equivalent among Python's encodings?
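For what it's worth, Python's latin1 (ISO-8859-1) and cp1252 are not the same codec: cp1252 reassigns the 0x80–0x9F range to printable characters. A small sketch illustrating the difference:

```python
# 0x9C is an unprintable control character in ISO-8859-1 (latin1),
# but the letter 'œ' in Windows-1252:
print(b'\x9c'.decode('cp1252'))        # œ
print(repr(b'\x9c'.decode('latin1')))  # '\x9c' (control character)
```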

The code for my current connection is the following:

import pyodbc
from sqlalchemy import create_engine

def connect():
    return pyodbc.connect('DSN=database;UID=uid;PWD=password')

engine = create_engine('mssql://', creator=connect, encoding='latin1')
connection = engine.connect()

Clarifications and comments:

  • This problem happens when retrieving information from the DB; I don't need to store anything.
  • At the beginning I didn't specify an encoding, and the result was that whenever a non-ASCII character was encountered in the DB, pyodbc raised a UnicodeDecodeError. I worked around that by using 'latin1' as the encoding, but that doesn't solve the problem for all characters.
  • I admit that the server is not on latin1; that comment was incorrect. I have checked both the database collation and the specific fields' collations, and everything appears to be 'Latin1_General_CI_AS'. How, then, can ć be stored at all? Maybe I'm not understanding collations correctly.
  • I corrected the question a little: specifically, I have tried more encodings than latin1, including cp1250 and cp1252 (which apparently is the one used by 'Latin1_General_CI_AS', according to MSDN).
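To illustrate the points above with a minimal sketch: the same raw byte decodes to different characters depending on the code page, which is why guessing latin1 (or cp1252) fails for ć while cp1250, the Central European code page, can represent it. The byte value here is just an illustrative example, not actual driver output:

```python
raw = b'\xe6'  # one example byte, as it might come back from the driver

print(raw.decode('cp1250'))  # ć  (Central European code page)
print(raw.decode('latin1'))  # æ
print(raw.decode('cp1252'))  # æ
```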

UPDATE:

OK, following these steps, I found that the encoding used by the DB appears to be cp1252: http://bytes.com/topic/sql-server/answers/142972-characters-encoding Anyway, that appears to have been a bad assumption, as reflected in the answers.

UPDATE 2: After configuring the ODBC driver properly, I don't need to specify the encoding in the Python code at all.
