I am working with a database table that has a VARBINARY column. I want to read the table via SQLAlchemy into a pandas dataframe. Going the usual way
df = pandas.read_sql_query("select key from xxx", engine)
I get unreadable memoryview objects as the column's data type. I can convert them with a lambda function
df.key.apply(lambda x: x.tobytes().hex())
into the desired readable format. But I would like to know whether the cast can also be done directly in the pandas.read_sql_query() call:
- via numpy dtypes, or maybe
- directly in the SQL query
Many thanks and best regards
1 Answer
I am not sure if this will help you, but thanks to @JGFMK's response I was able to come up with something similar in my program:
- Defined the property using Mapped to convert the value from the DATA_TYPE it has in SQL Server.
- Used a decode method on the property, according to the collation my SQL Server is using (found under Properties > General on my database).
This was the result:
# `db_sql.VARBINARY` informs the `DATA_TYPE` from the table field.
# Mapped does the conversion from table type to python type.
order_description_binary: Mapped[bytes] = db_sql.Column('TJ_OBSERVA', db_sql.VARBINARY, nullable=True, default=None)

@property
def ordem_description(self):
    if self.order_description_binary is not None:
        try:
            description_text = self.order_description_binary.decode('latin1')
            description_text = description_text.replace('\x00', '')
            return description_text
        except UnicodeDecodeError as e:
            print(f"Error decoding order_description_binary: {e}")
            return None
    else:
        return None
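The decoding logic of the property above can also be tried out on its own, without a model class. The following is only a minimal sketch: the helper name `decode_varbinary` and the sample values are hypothetical, and `latin1` is assumed as the encoding because that is what the property uses.

```python
from typing import Optional, Union


def decode_varbinary(
    value: Optional[Union[bytes, bytearray, memoryview]],
    encoding: str = "latin1",
) -> Optional[str]:
    """Decode a binary value to text and strip NUL bytes (hypothetical helper)."""
    if value is None:
        return None
    try:
        # bytes() accepts bytes, bytearray and memoryview alike.
        text = bytes(value).decode(encoding)
    except UnicodeDecodeError as exc:
        print(f"Error decoding binary value: {exc}")
        return None
    return text.replace("\x00", "")


# Made-up sample values, only for illustration.
print(decode_varbinary(memoryview(b"caf\xe9\x00")))  # latin1 maps every byte to a character
print(decode_varbinary(b"\xe7abc", "utf-8"))         # not valid UTF-8, so None is returned
```

Note that `latin1` maps every possible byte to a character, so the `except` branch can only trigger for other encodings such as `utf-8`.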
So, in your case, you can try using decode together with replace on the dataframe column that has the VARBINARY type.
Example:
df = pandas.read_sql_query("select key from xxx", engine)
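# A Series has no decode method of its own, so apply the conversion to each value.
df['key'] = df['key'].apply(lambda b: bytes(b).decode('latin1').replace('\x00', ''))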
I believe something like that would work.
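The question also asks whether the conversion can happen in the SQL query itself. Here is a rough sketch of that idea, using an in-memory SQLite database from the standard library as a stand-in for the real server (an assumption, as is the sample value). SQLite's `hex()` function is used here; other databases have their own functions for this, so the query would need adapting.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption);
# the table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE xxx ("key" BLOB)')
conn.execute('INSERT INTO xxx ("key") VALUES (?)', (b"\xde\xad\xbe\xef",))

# Option 1: let the database render the binary value as a hex string.
sql_hex = conn.execute('SELECT hex("key") FROM xxx').fetchone()[0]

# Option 2: fetch the raw value and convert on the Python side;
# bytes() accepts both bytes and memoryview objects.
raw = conn.execute('SELECT "key" FROM xxx').fetchone()[0]
py_hex = bytes(memoryview(raw)).hex()

print(sql_hex)
print(py_hex)
conn.close()
```

SQLite returns upper-case hex digits while `bytes.hex()` returns lower-case ones, so normalise the case before comparing the two. With the first option, the query string passed to pandas.read_sql_query already yields plain strings, and no lambda is needed afterwards.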
Comments
I tried the cast function but I received a UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 6: invalid continuation byte error. I then used Latin1 and it worked. I did texto_observacao = self.ordem_observacao_binario.decode('latin1') using a @property decorator from SQLAlchemy on the following binary info: ordem_observacao_binario: Mapped[bytes] = db_sql.Column('TJ_OBSERVA', db_sql.VARBINARY, nullable=True, default=None). Thanks for the tip.