I have some Python code that runs as part of an Azure Synapse Analytics Apache Spark notebook (Synapse notebook) and I would like to add effective error handling. The code simply executes a given SQL script against the database. It runs, but I sometimes see errors like "attempt to use closed connection". I would like to do the following:
- Improve the code I wrote through peer review.
- Can I improve the error handling? E.g., in pseudo-code: if the connection is still open, close the connection. A rough sketch of what I mean follows below.
- The code using SQL auth works; I would like to authenticate as the Managed Identity instead. I've tried using the object id of the MI in the connection string with Authentication=ActiveDirectoryMsi, but it didn't work.
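To make that pseudo-code concrete, this is roughly the pattern I have in mind (a sketch only; connection_string and content stand in for the values built in the cells below, not my actual notebook code):

import pyodbc

conn = None
try:
    # connection_string is a placeholder for the real connection string
    conn = pyodbc.connect(connection_string, autocommit=True)
    # content is a placeholder for the script text read from storage
    for batch in content.split("GO"):
        conn.execute(batch)
finally:
    # Close only if the connection was actually opened; touching a
    # connection after close() is what raises the closed-connection error.
    if conn is not None:
        conn.close()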
Cell 1 - parameters
pAccountName = 'someStorageAccount'
pContainerName = 'someContainer'
pRelativePath = '/raw/sql/some_sql_files/'
pFileName = 'someSQLscript.sql'
Cell 2 - main
import pyodbc
from pyspark import SparkFiles

try:
    # Add the file to the cluster so we can view it
    sqlFilepath = f"""abfss://{pContainerName}""" + "@" + f"""{pAccountName}.dfs.core.windows.net{pRelativePath}{pFileName}"""
    sc.addFile(sqlFilepath, False)

    # Open the file for reading
    with open(SparkFiles.getRootDirectory() + f'/{pFileName}', 'r') as f:
        lines = f.read()  # read all text from the file into a string

    # Open the database connection
    conn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};'
                          'SERVER=someServer.sql.azuresynapse.net;'
                          'DATABASE=someDatabase;UID=batchOnlyUser;'
                          'PWD=youWish;', autocommit=True)

    # Split the script into batches separated by "GO"
    for batch in lines.split("GO"):
        conn.execute(batch)  # execute the SQL statement
except:
    raise
finally:
    # Tidy up
    conn.close()
1 Answer
Ignoring your closed-connection and managed-identity questions, since (a) I don't know how to answer them and (b) on their own they're off topic, there is still review to be done:
- A Pythonic way to tell the standard library to format your URI instead of doing it yourself is to call into urlunparse.
- Consider using pathlib.
- connect accepts kwargs as an alternative to the conn string, and the former will lay out your connection parameters more nicely.
- except / raise is redundant and should be deleted.
- Your try starts too early; it should only start after the connection has been established.
- You're not reading a lines list; you're reading content, which would be a better variable name.
Suggested
from pathlib import Path
from urllib.parse import urlunparse, ParseResult

import pyodbc
from pyspark import SparkFiles

pAccountName = 'someStorageAccount'
pContainerName = 'someContainer'
pRelativePath = '/raw/sql/some_sql_files/'
pFileName = 'someSQLscript.sql'

# Add the file to the cluster so we can view it
sql_filename = urlunparse(
    ParseResult(
        scheme='abfss',
        netloc=f'{pContainerName}@{pAccountName}.dfs.core.windows.net',
        path=f'{pRelativePath}{pFileName}',
        params=None, query=None, fragment=None,
    )
)
sc.addFile(sql_filename, False)

# Open the file for reading
sql_file_path = Path(SparkFiles.getRootDirectory()) / pFileName
with sql_file_path.open() as f:
    content = f.read()

# Open the database connection
conn = pyodbc.connect(
    DRIVER='{ODBC Driver 17 for SQL Server}',
    SERVER='someServer.sql.azuresynapse.net',
    DATABASE='someDatabase',
    UID='batchOnlyUser',
    PWD='youWish',
    autocommit=True,
)
try:
    # Split the script into batches separated by "GO"
    for batch in content.split("GO"):
        conn.execute(batch)  # execute the SQL statement
finally:
    conn.close()
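One caveat the review above doesn't address: splitting on the bare substring "GO" will also split any batch that merely contains those two letters, e.g. inside an identifier like CATEGORY, a string literal, or a comment. T-SQL batch separators must sit on a line of their own, so a line-anchored regex is safer. A sketch of that alternative (re and GO_PATTERN are my own additions, not part of the suggested code above; it would replace the split inside the try block):

import re

# Match lines consisting solely of GO (case-insensitive), the form
# sqlcmd/SSMS expect for a batch separator, rather than the raw substring.
GO_PATTERN = re.compile(r'^\s*GO\s*$', flags=re.IGNORECASE | re.MULTILINE)

for batch in GO_PATTERN.split(content):
    if batch.strip():  # skip empty fragments around separators
        conn.execute(batch)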