Below is a dummy example to demonstrate how the query results are different, the real query is more more complex so the query structure may seem overkill in this example. Set up a connection to a sqlite3 database and add these records to start with:
import sqlite3
connection = sqlite3.connect(
'file:test_database',
detect_types=sqlite3.PARSE_DECLTYPES,
isolation_level=None,
check_same_thread=False,
uri=True
)
cursor = connection.cursor()
tableA_records = [(1, 202003), (2, 202003), (3, 202003), (4, 202004), (5, 202004), (6, 202004), (7, 202004), (8, 202004), ]
tableB_records = [(1, 202004), (2, 202004), (3, 202004), (4, 202004), (5, 202004),]
tableA_ddl = """
create table tableA
(
ID int,
RunYearMonth int
);
"""
tableB_ddl = """
create table tableB
(
ID int,
RunYearMonth int
);
"""
cursor.execute(tableA_ddl)
cursor.execute(tableB_ddl)
cursor.executemany("INSERT INTO tableA VALUES (?, ?)", tableA_records)
cursor.executemany("INSERT INTO tableB VALUES (?, ?)", tableB_records)
Now we have two tables (A and B) with 8 and 5 records, respectively. I want to count records that have the same ID and date between the two when the date is 202004.
I have this query now:
SELECT COUNT(*)
FROM (
SELECT *
FROM `tableA`
WHERE `RunYearMonth` = 202004
) AS `A`
INNER JOIN (
SELECT *
FROM `tableB`
WHERE `RunYearMonth` = 202004
) AS `B`
ON `A`.`ID` = `B`.`ID`
AND `A`.`RunYearMonth` = `B`.`RunYearMonth`
This, as expected, returns 2 when run in a sqlite console.
When run in Python, though, you get a different result.
q = """
SELECT COUNT(*)
FROM (
SELECT *
FROM `tableA`
WHERE `RunYearMonth` = 202004
) AS `map1`
INNER JOIN (
SELECT *
FROM `tableB`
WHERE `RunYearMonth` = 202004
) AS `map2`
ON `map1`.`ID` = `map2`.`ID`
AND `map1`.`RunYearMonth` = `map2`.`RunYearMonth`
"""
cursor.execute(q)
print(cursor.fetchall())
This returns instead 5 which effectively ignores the WHERE clauses in the subqueries and the join condition they have the same RunYearMonth, there are records 1-5 in both.
What would cause this difference? Does Python not simply pass the query string through?
Pertinent versions:
sqlite3.version == 2.6.0
sqlite3.sqlite_version == 3.31.1
sys.version == 3.6.5
-
I'm getting 2 for the query run in Python. My SQLite version is 3.22.0 and my Python version is 3.8.0mechanical_meat– mechanical_meat2020年04月24日 03:34:58 +00:00Commented Apr 24, 2020 at 3:34
-
2@mechanical_meat The relevant bug was introduced in 3.31.Shawn– Shawn2020年04月24日 04:54:39 +00:00Commented Apr 24, 2020 at 4:54
1 Answer 1
I created a test database using your first script, and then opened it in a sqlite3 shell. Your query returns 5 in it, not the 2 you're getting. After changing it to show all the rows, not just the count, it results in:
ID RunYearMonth ID RunYearMonth
---------- ------------ ---------- ------------
1 202003 1 202004
2 202003 2 202004
3 202003 3 202004
4 202004 4 202004
5 202004 5 202004
I'm not sure why those rows from tableA with RunYearMonth of 202003 are getting included; I'd think they would be filtered out by the subquery's WHERE.
This appears to be a bug in Sqlite3 - using an older version (3.11.0) gives the expected results, and a slight tweak to the query to remove the AND map1.RunYearMonth = map2.RunYearMonth produces the correct results on 3.31.1.
Anyways, that query can be cleaned up significantly, like so:
SELECT count(*)
FROM tableA AS A
JOIN tableB AS B ON A.ID = B.ID
AND A.RunYearMonth = B.RunYearMonth
WHERE A.RunYearMonth = 202004;
which does return the expected count of 2.