0

I am working on a base-datatable with a VARBINARY variable. Now I want to read the table via SQLAlchemy into a pandas dataframe. Going the usual way

df = pandas.read_sql_query("select key from xxx", engine)

I get an uninterpretable memoryview as data type. I can convert this via lambda function

df.key.apply(lambda x: x.tobytes().hex())

into the desired readable format. But I would like to know if the casting can also be placed directly into the pandas.read_sql_query()-statement:

  • via numpy dtypes or maybe

  • directly into the SQL-query

Many greetings and best thanks

asked Mar 23, 2023 at 18:11
5
  • SELECT CONVERT(varchar(5000), yourvarbincolumn, 0) seemed to do the trick within MS-SQL Server. I am guessing call native SQL within SQLAlchemy.. will investigate further. See here: stackoverflow.com/questions/3289988/… geeksforgeeks.org/how-to-execute-raw-sql-in-sqlalchemy Commented Jul 27, 2023 at 13:27
  • @JGFMK So there is no way to do this using built-in functions from SQLAlchemy? I just tried doing exactly this with a cast function but I received UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 6: invalid continuation byte error. Commented Mar 5, 2024 at 18:05
  • 1
    @RaulChiarella That might be down to something like the collation of the database. You didn't tag the question with a specific underlying DB, but for SQL Server you can right click on a DB from the Object Explorer in SSMS (SQL Server Management Studio) and Under the General Tab you see the Collation under Maintenance subgroup. On one I use it's SQL_Latin1_General_CP1_CI_AS. Commented Mar 5, 2024 at 18:20
  • stackoverflow.com/questions/11972363/… Commented Mar 5, 2024 at 18:23
  • I see. Thank you. I used Latin1 and it worked, did texto_observacao = self.ordem_observacao_binario.decode('latin1') using a @property decorator from SQLAlchemy on the following binary info: ordem_observacao_binario: Mapped[bytes] = db_sql.Column('TJ_OBSERVA', db_sql.VARBINARY, nullable=True, default=None). Thanks for the tip. Commented Mar 5, 2024 at 18:57

1 Answer 1

0

I am not sure if this will help you but, thanks to @JGFMK response, I was able to come up with something similar on my program:

  1. Defined the property using Mapped to convert the value that has a DATA_TYPE from the SQL Server.
  2. Used a decode method on the property, according to the collation my SQL Server is using, going to Properties > General on my Database.

This was the result:

 # `db_sql.VARBINARY` informs the `DATA_TYPE` from the table field.
 # Mapped does the conversion from table type to python type.
 order_description_binary: Mapped[bytes] = db_sql.Column('TJ_OBSERVA', db_sql.VARBINARY, nullable=True, default=None)
 @property
 def ordem_description(self):
 if self.order_description_binary is not None:
 try:
 description_text = self.order_description_binary .decode('latin1')
 description_text = description_text.replace('\x00', '')
 return description_text
 except UnicodeDecodeError as e:
 print(f"Error decoding order_description_binary: {e}")
 return None
 else:
 return None

So, in your case you can try using decode alongside replace on the specific data on your dataframe that has VARBINARY type.

Example:

df = pandas.read_sql_query("select key from xxx", engine)
df.specific_data.decode('latin1').replace('\x00', '')

I believe something like that would work.

answered Mar 5, 2024 at 19:05
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.