I have a function that does multiple queries on the same dataset, and I want to ensure all the queries see exactly the same data.
In terms of SQL, this means the REPEATABLE READ isolation level for databases that support it. I don't mind a higher level, or even a complete lockdown, if the database isn't capable of that.
As far as I can see, this isn't the case by default. That is, if I run something like this code in one Python shell:
import time
from django.db import transaction
from myapp.models import MyModel  # model import path is illustrative

with transaction.atomic():
    for t in range(0, 60):
        print("{0}: {1}".format(t, MyModel.objects.count()))
        time.sleep(1)
As soon as I do MyModel.objects.create(...) in another shell, the value seen by the running loop increases immediately, which is exactly what I want to avoid. Further tests show the behavior matches the READ COMMITTED level, which is too lax for my taste.
I'd also like to stress that I want the stricter isolation level only for a single function, not for the whole project.
What are my best options to achieve this?
In my particular case, the only database I care about is PostgreSQL 9.3+, but I also want some compatibility with SQLite3, in which case even completely locking the whole database is okay with me. Still, the more general the solution, the better.
- Is caching this metadata an acceptable approach? – Jossef Harush Kadouri, Jun 20, 2015 at 15:35
- Unfortunately, no. The queries calculate various statistics over raw event data, and to have a consistent view I'd have to pull the whole dataset into memory, which is something I really don't want to do. – drdaeman, Jun 20, 2015 at 15:41
3 Answers
You're right, the default transaction isolation level in PostgreSQL is READ COMMITTED. You can easily change it in the settings to test whether it fits your needs: https://docs.djangoproject.com/en/1.8/ref/databases/#isolation-level
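For instance, going by that page, a project-wide level can be set through the psycopg2 option on the database entry. A minimal sketch, with the rest of the connection settings elided:

    # settings.py
    import psycopg2.extensions

    DATABASES = {
        'default': {
            # ... ENGINE, NAME, USER, PASSWORD, HOST, PORT as usual ...
            'OPTIONS': {
                # Every connection Django opens will use REPEATABLE READ
                'isolation_level': psycopg2.extensions.ISOLATION_LEVEL_REPEATABLE_READ,
            },
        },
    }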
I also doubt you will run into performance issues, because PostgreSQL handles transactions very efficiently, even in SERIALIZABLE mode. MySQL's default isolation level is REPEATABLE READ, and it doesn't noticeably hurt performance there either.
In any case, you can set the isolation level manually whenever you need it: http://initd.org/psycopg/docs/extensions.html#isolation-level-constants
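As a rough sketch of using those psycopg2 constants from Django code (assuming the default connection and that no transaction is currently open on it; ensure_connection() is only there to make sure the underlying psycopg2 connection exists):

    import psycopg2.extensions
    from django.db import connection

    # Make sure the raw psycopg2 connection has been established.
    connection.ensure_connection()

    # psycopg2 requires that no transaction is in progress here; the new
    # level applies to subsequent transactions on this connection.
    connection.connection.set_session(
        isolation_level=psycopg2.extensions.ISOLATION_LEVEL_REPEATABLE_READ
    )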
To set a custom transaction isolation level for a single block, you can try something like:
from django.db import connection, transaction

with transaction.atomic():
    cursor = connection.cursor()
    # Must run before any other query in this transaction
    cursor.execute('SET TRANSACTION ISOLATION LEVEL REPEATABLE READ')
    # ... the queries that need the consistent snapshot ...
I would also suggest changing the default mode in the settings first (if you can). Then, if it fits your needs, you can drop the global setting and change the code only in the special places.
6 Comments
- You have to run SET TRANSACTION ISOLATION LEVEL REPEATABLE READ before any query, and that's not simple to do in Django.

django-pgtransaction offers an extension of the django.db.transaction.atomic context manager/decorator which allows one to dynamically set the isolation level when opening a transaction, as well as to specify a retry policy for when an operation in that transaction results in a Postgres locking exception:
import pgtransaction

@pgtransaction.atomic(isolation_level=pgtransaction.SERIALIZABLE, retry=3)
def do_queries():
    # Do queries...
    ...
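Assuming the package follows the import above (it is presumably installed with pip install django-pgtransaction), the same construct should also work as a context manager, mirroring transaction.atomic. A sketch:

    import pgtransaction

    # Context-manager form of the decorator shown above; SERIALIZABLE is the
    # constant confirmed in the snippet, other levels are assumed analogous.
    with pgtransaction.atomic(isolation_level=pgtransaction.SERIALIZABLE):
        ...  # queries that need the stricter isolation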
There are 2 ways to set REPEATABLE READ in Django.
<The 1st way (My recommendation)>
You can run a raw query that sets the isolation level after the database settings in settings.py, as shown below. *If you run the raw query before the database settings, an error occurs:
# "settings.py"

from django.db import connection

# ...

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'postgres',
        'USER': 'postgres',
        'PASSWORD': 'admin',
        'HOST': 'localhost',
        'PORT': '5432',
    },
}

# ↓ ↓ ↓ Set isolation level ↓ ↓ ↓
cursor = connection.cursor()
query = """
ALTER DATABASE postgres
SET DEFAULT_TRANSACTION_ISOLATION
TO 'REPEATABLE READ';
"""
cursor.execute(query)

# ↓ ↓ ↓ Check isolation level ↓ ↓ ↓
cursor.execute('SHOW default_transaction_isolation;')
print(cursor.fetchone())  # ('repeatable read',)
*settings.py runs every time the Django server is started with the command below, and every time the server reloads after a code change, so the isolation level is set again on each start or reload:
python manage.py runserver 0.0.0.0:8000
<The 2nd way>
You can directly set REPEATABLE READ with psql as shown below:
postgres=# ALTER DATABASE postgres SET DEFAULT_TRANSACTION_ISOLATION TO 'REPEATABLE READ';
Actually, what the documentation explains, shown below, doesn't work for me with Django 3.2.16 on Windows 11. That's why I show the two ways above:
# "settings.py"

import psycopg2.extensions

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'postgres',
        'USER': 'postgres',
        'PASSWORD': 'admin',
        'HOST': 'localhost',
        'PORT': '5432',
    },
    'OPTIONS': {  # ↓ ↓ ↓ Doesn't work ↓ ↓ ↓
        'isolation_level': psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE,
    },
}