4
\$\begingroup\$

I've written a script which creates a new database, a table within the database, insert some values into it and fetch the values in one go. The way I've created the script below to achieve what I just said seems not to be an ideal way as there are too many repetitions in there. To be specific this two functions create_database() and connect() are almost identical. Moreover, I had to use mycursor = conn.cursor() twice within main function.

import mysql.connector
def create_database():
 mydb = mysql.connector.connect(
 host="localhost",
 user="root",
 passwd = "",
 database=""
 )
 return mydb
def connect(databasename):
 mydb = mysql.connector.connect(
 host="localhost",
 user="root",
 passwd = "",
 database=databasename
 )
 return mydb
def store_data(item_name,item_link):
 mycursor.execute("INSERT INTO webdata (item_name,item_link) VALUES (%s,%s)", (item_name,item_link))
 conn.commit()
if __name__ == '__main__':
 db_name = "newdatabase"
 conn = create_database()
 mycursor = conn.cursor()
 try:
 mycursor.execute(f"CREATE DATABASE {db_name}")
 except mysql.connector.errors.DatabaseError:
 pass
 conn = connect(db_name)
 mycursor = conn.cursor()
 mycursor.execute("DROP TABLE if exists webdata")
 mycursor.execute("CREATE TABLE if not exists webdata (item_name VARCHAR(255), item_link VARCHAR(255))")
 store_data("all questions","https://stackoverflow.com/questions/tagged/web-scraping")
 mycursor.execute("SELECT * FROM webdata")
 for item in mycursor.fetchall():
 print(item)

What it prints (expected result):

('all questions', 'https://stackoverflow.com/questions/tagged/web-scraping')
asked Feb 28, 2021 at 7:50
\$\endgroup\$

2 Answers 2

4
\$\begingroup\$

From the title alone, this sounds like something that should not be done, particularly since this is not a lightweight database such as SQLite. Creating the database, its tables, columns, constraints etc. is not, and should not, be the job of the application - but rather an external setup script with administrative permissions. The application, then, should not have such permissions and should only be able to insert/select/update/delete.

For some reason I thought autocommit is enabled by default for the MySQL Python connector, but apparently it's not:

https://dev.mysql.com/doc/connector-python/en/connector-python-connectargs.html

So your explicit call to commit is fine.

I can't tell from the documentation whether the cursor and connection objects support context management. Try putting both of those in with statements, and if it complains that there's no __enter__, switch to a try/finally that guarantees cursor and connection closure.

answered Feb 28, 2021 at 15:55
\$\endgroup\$
1
  • 1
    \$\begingroup\$ It seems you concluded in 2020 that context management is not supported by default. According to the bugtracker, the functionality has been added since then. However, it might be a good idea to double-check that. \$\endgroup\$ Commented Feb 28, 2021 at 19:44
2
\$\begingroup\$

Python functions can have optional arguments with default values. If you adjust the signature of connect() to have a default, the create_database() function will be unnecessary.

def connect(database_name = ''):
 return mysql.connector.connect(
 host = 'localhost',
 user = 'root',
 passwd = '',
 database = database_name
 )

Contrary to your question text, you don't have a main function; you just have top-level code nested under an if-conditional. Top-level code (other than constants or imports) is generally a bad idea because it is inflexible and not easily tested or experimented with. Move that code into a proper main() function, and then just invoke it in the if-conditional. It's also not a bad idea to prepare for the future of your script by including the ability to handle simple command-line arguments -- often handy for debugging and experimentation even if they are never part of an intended use case of the script.

import sys
def main(args):
 ...
if __name__ == '__main__':
 main(sys.argv[1:])

After you make that change, the store_data() function will be broken, because it depends on having access to the global mycursor variable. That's another illustration of the problems with top-level code: it can camouflage dependencies and, in some cases, create situations that are difficult to debug and disentangle if the volume of such code grows large enough. Instead, the store_data() function should take the DB connection as an explicit argument.

A rigorous approach of putting all code inside of functions will seem like a small hassle at first; but in my experience, it nearly always pays off in the form of fewer bugs, greater flexibility, improved code readability, and various other benefits.

answered Feb 28, 2021 at 17:33
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.