2

The heading pretty much describes it all, here are some more information on the Amazon RDS I am using.

  • Engine: MySQL Community 8.0.28
  • Class db.r6g.large
  • Multi-AZ no. The data volume is very small, not even a few megs, as I am just doing some PoC to test the connectivity.
  • Latency from MySQL commandline: 10ms
  • Availability zone: ap-southeast-1
  • Set up the inbound traffic rule for port 3306 of the relevant security group.

These are the client details

  • Ubuntu 22.04
  • Python 3.10
  • SQL Alchemy 1.4.44
  • mysql Ver 8.0.32-0ubuntu0.22.04.2 for Linux on x86_64 ((Ubuntu))
  • Geographic location: Singapore

Here is a sample code snippet (hopefully, self-explanatory)

#!/usr/bin/env python3
# encoding: utf-8
uri:str = 'mysql+mysqlconnector://della:random_password@tensorflow-dump.apj-xjge.ap-southeast-1.rds.amazonaws.com:3306/convnet'
with create_engine(url=uri).connect() as out_client: # This line throws error
 logging.info(msg=f'Connected to {uri}.') 
 results.to_sql(name='machine_name', con=out_client,
 if_exists='append', index=False)
 logging.info(msg=f'Pushed data.')

So, I am encountering the operational error on a sporadic basis, once every 3-4 attempts of running the above code. It is not reproducible, but makes the database unsuitable for production usage. When I do get the error, the flow waits for about 12 minutes at the create_engine command before throwing the error.

Here is the stacktrace.

Traceback (most recent call last):
 File "/home/della/supply-chain-dev/clustering_results_postprocess/src/push.py", line 119, in <module>
 asyncio.run(main=main())
 File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
 return loop.run_until_complete(main)
 File "/usr/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
 return future.result()
 File "/home/della/supply-chain-dev/clustering_results_postprocess/src/push.py", line 111, in main
 with create_engine(url=out_dbase_uri).connect() as out_client:
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 3315, in connect
 return self._connection_cls(self, close_with_result=close_with_result)
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 96, in __init__
 else engine.raw_connection()
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 3394, in raw_connection
 return self._wrap_pool_connect(self.pool.connect, _connection)
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 3364, in _wrap_pool_connect
 Connection._handle_dbapi_exception_noconnection(
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2198, in _handle_dbapi_exception_noconnection
 util.raise_(
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 210, in raise_
 raise exception
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 3361, in _wrap_pool_connect
 return fn()
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 325, in connect
 return _ConnectionFairy._checkout(self)
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 888, in _checkout
 fairy = _ConnectionRecord.checkout(pool)
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 491, in checkout
 rec = pool._do_get()
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/pool/impl.py", line 145, in _do_get
 with util.safe_reraise():
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
 compat.raise_(
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 210, in raise_
 raise exception
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/pool/impl.py", line 143, in _do_get
 return self._create_connection()
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 271, in _create_connection
 return _ConnectionRecord(self)
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 386, in __init__
 self.__connect()
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 684, in __connect
 with util.safe_reraise():
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
 compat.raise_(
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 210, in raise_
 raise exception
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/pool/base.py", line 680, in __connect
 self.dbapi_connection = connection = pool._invoke_creator(self)
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/engine/create.py", line 578, in connect
 return dialect.connect(*cargs, **cparams)
 File "/home/della/.local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 598, in connect
 return self.dbapi.connect(*cargs, **cparams)
 File "/home/della/.local/lib/python3.10/site-packages/mysql/connector/pooling.py", line 293, in connect
 return CMySQLConnection(*args, **kwargs)
 File "/home/della/.local/lib/python3.10/site-packages/mysql/connector/connection_cext.py", line 118, in __init__
 self.connect(**kwargs)
 File "/home/della/.local/lib/python3.10/site-packages/mysql/connector/abstracts.py", line 1178, in connect
 self._open_connection()
 File "/home/della/.local/lib/python3.10/site-packages/mysql/connector/connection_cext.py", line 293, in _open_connection
 raise get_mysql_exception(
sqlalchemy.exc.OperationalError: (mysql.connector.errors.OperationalError) 2013 (HY000): Lost connection to MySQL server at 'reading authorization packet', system error: 104
(Background on this error at: https://sqlalche.me/e/14/e3q8)

So I am not sure if it is a

  • resource issue (need a machine with more RAM, or more powerful database instance)
  • something to do with credentials, connection string, port firewall etc. (in that case, the code would fail with a guarantee)
  • merely the internet connection at my work is not reliable (there is no other obvious sign of the connection being snappy, the usual apps, browsing etc. work very well).

When I try with mysql> command line, the connectivity works very well consistently.

asked Feb 28, 2023 at 2:27
7
  • Just checking, are you using replication at all? Commented Mar 6, 2023 at 10:43
  • @dbdemon I am not, is it something enabled by default, or should I enable it explicitly? Any how-to documentation? Commented Mar 6, 2023 at 10:59
  • I was just thinking that if replication is enabled, then it could explain the intermittent nature of the error. With replication there would presumably be some kind of load-balancer that would forward connections to different replicas. If one of the replicas is faulty but the load-balancer still forwards connections to it, then in some cases you will get a working connection and other times you will get an error. Commented Mar 7, 2023 at 11:15
  • @dbdemon thanks, it certainly is a possibility. In that case, it is entirely Amazon's internal issue, and basically I cannot do anything? Commented Mar 7, 2023 at 12:14
  • This looks like a routing issue where there is no route back from the database server over the load balancer(s) to the client host. Commented Mar 7, 2023 at 17:01

2 Answers 2

2

From the looks of the code you gave in the post, you did not set the connect_time or the pool_recycle value for the create_engine command.

Please note that the mysql command line is a different client-side platform. There are the variables interactive_timeout and wait_timeout, which are usually set very high by defaults (28800 (which is in seconds), is also 8 hours). So you would not get a session time out so intermittently.

RECOMMENDATION

You should login to the RDS Instance and run this

mysql> SELECT @@global.interactive_timeout,@@global.wait_timeout;

Then, set your pool_recycle to whichever value is smaller.

answered Mar 12, 2023 at 17:40
2
  • Hey @Della, please have a look at this answer !!! Commented Mar 12, 2023 at 19:02
  • FWIW, I don't bother with interactive_timeout because the only client that uses this is the mysql CLI. Commented Mar 12, 2023 at 19:51
0

This sounds like a networking/routing issue because of the system error 104.

answered Mar 1, 2023 at 3:17
2
  • Is the numbering scheme following the same system as HTTP error codes, as in 200 means success, 404 means resource missing etc.? Commented Mar 1, 2023 at 3:32
  • 2
    Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. Commented Mar 1, 2023 at 3:32

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.