-
Couldn't load subscription status.
- Fork 347
Querying table and selecting on a few of the columns included in a primary key is generating a DISTINCT result. #1484
-
All - could use some help trying to figure out why this is happening. I've written a PowerShell module that wraps various database providers to make it easier to interact with databases through PowerShell (the module is called SimplySql). Anyways, I did a major rewrite last year and in the process changed my MySql provider to MySqlConnector. One of my users has reported this issue.
Basically, if a table is created with a primary key with multiple columns, but a select statement with only primary key columns included (but not all of them), then a DISTINCT result is happening.
-- Here is the SQL to create the two tables -- Primary key CREATE TABLE test3 (colA VARCHAR(25), colB VARCHAR(25), colC VARCHAR(25), colD VARCHAR(25), PRIMARY KEY (colA, colB, colC)); INSERT INTO test3 (colA, colB, colC, colD) VALUES ('A', 'A', 'A', 'A'); INSERT INTO test3 (colA, colB, colC, colD) VALUES ('A', 'B', 'A', 'A'); INSERT INTO test3 (colA, colB, colC, colD) VALUES ('B', 'A', 'A', 'A'); INSERT INTO test3 (colA, colB, colC, colD) VALUES ('C', 'A', 'A', 'A'); -- No primary Key CREATE TABLE test4 (colA VARCHAR(25), colB VARCHAR(25), colC VARCHAR(25), colD VARCHAR(25)); INSERT INTO test4 (colA, colB, colC, colD) VALUES ('A', 'A', 'A', 'A'); INSERT INTO test4 (colA, colB, colC, colD) VALUES ('A', 'B', 'A', 'A'); INSERT INTO test4 (colA, colB, colC, colD) VALUES ('B', 'A', 'A', 'A'); INSERT INTO test4 (colA, colB, colC, colD) VALUES ('C', 'A', 'A', 'A');
Ok, now here is the PowerShell and the interesting results
<# Install-Module SimplySql #use this to install the module, make sure its version 2.0 or greater Open-MySqlConnection -Server <servername> -Database <dbname> -Credential (Get-Credential #> # Notice how the following statement will only return 3 rows Invoke-SqlQuery -Query "SELECT colA FROM test3" # This one returns 4 rows... the difference? test4 does not have a primary key Invoke-SqlQuery -Query "SELECT colA FROM test4"
Beta Was this translation helpful? Give feedback.
All reactions
Replies: 2 comments 5 replies
-
I can't reproduce this problem using standard C# to use the ADO.NET interfaces
var connectionString = "server=localhost;database=mysqltest;userid=root;password=pass"; using var mscConnection = new MySqlConnector.MySqlConnection(connectionString); mscConnection.Open(); using var msdConnection = new MySql.Data.MySqlClient.MySqlConnection(connectionString); msdConnection.Open(); using (var mscCommand = mscConnection.CreateCommand()) { mscCommand.CommandText = """ DROP TABLE IF EXISTS test3; DROP TABLE IF EXISTS test4; -- Here is the SQL to create the two tables -- Primary key CREATE TABLE test3 (colA VARCHAR(25), colB VARCHAR(25), colC VARCHAR(25), colD VARCHAR(25), PRIMARY KEY (colA, colB, colC)); INSERT INTO test3 (colA, colB, colC, colD) VALUES ('A', 'A', 'A', 'A'); INSERT INTO test3 (colA, colB, colC, colD) VALUES ('A', 'B', 'A', 'A'); INSERT INTO test3 (colA, colB, colC, colD) VALUES ('B', 'A', 'A', 'A'); INSERT INTO test3 (colA, colB, colC, colD) VALUES ('C', 'A', 'A', 'A'); -- No primary Key CREATE TABLE test4 (colA VARCHAR(25), colB VARCHAR(25), colC VARCHAR(25), colD VARCHAR(25)); INSERT INTO test4 (colA, colB, colC, colD) VALUES ('A', 'A', 'A', 'A'); INSERT INTO test4 (colA, colB, colC, colD) VALUES ('A', 'B', 'A', 'A'); INSERT INTO test4 (colA, colB, colC, colD) VALUES ('B', 'A', 'A', 'A'); INSERT INTO test4 (colA, colB, colC, colD) VALUES ('C', 'A', 'A', 'A'); """; mscCommand.ExecuteNonQuery(); } foreach (var connection in new DbConnection[] { mscConnection, msdConnection }) { foreach (var sql in new[] { "SELECT colA FROM test3", "SELECT colA FROM test4" }) { using var command = connection.CreateCommand(); command.CommandText = sql; Console.WriteLine("{0} --- {1}", command.CommandText, command.GetType().FullName); using var reader = command.ExecuteReader(); { while (reader.Read()) { Console.WriteLine(reader.GetValue(0)); } } Console.WriteLine(); } }
Output:
SELECT colA FROM test3 --- MySqlConnector.MySqlCommand
A
A
B
C
SELECT colA FROM test4 --- MySqlConnector.MySqlCommand
A
A
B
C
SELECT colA FROM test3 --- MySql.Data.MySqlClient.MySqlCommand
A
A
B
C
SELECT colA FROM test4 --- MySql.Data.MySqlClient.MySqlCommand
A
A
B
C
Beta Was this translation helpful? Give feedback.
All reactions
-
Are you sure the problem is in what MySqlConnector returns and not in the PowerShell that's consuming it?
Beta Was this translation helpful? Give feedback.
All reactions
-
It looks like you might be using MySqlDataAdapter: https://github.com/mithrandyr/SimplySql/blob/master/source/SimplySql.Engine/MySQL/MySqlProvider.vb#L33-L45
I rewrote the code to test that instead:
using var da = command is MySqlConnector.MySqlCommand ? (DbDataAdapter) new MySqlConnector.MySqlDataAdapter((MySqlConnector.MySqlCommand) command) : new MySql.Data.MySqlClient.MySqlDataAdapter((MySql.Data.MySqlClient.MySqlCommand) command); Console.WriteLine("{0} --- {1}", command.CommandText, da.GetType().FullName); var ds = new DataSet(); da.ReturnProviderSpecificTypes = true; da.Fill(ds); foreach (DataRow row in ds.Tables[0].Rows) Console.WriteLine(row[0]); Console.WriteLine();
Same output:
SELECT colA FROM test3 --- MySqlConnector.MySqlDataAdapter
A
A
B
C
SELECT colA FROM test4 --- MySqlConnector.MySqlDataAdapter
A
A
B
C
SELECT colA FROM test3 --- MySql.Data.MySqlClient.MySqlDataAdapter
A
A
B
C
SELECT colA FROM test4 --- MySql.Data.MySqlClient.MySqlDataAdapter
A
A
B
C
Beta Was this translation helpful? Give feedback.
All reactions
-
@bgrainger -- thanks for taking a look. I know my code is a bit confusing (there are multiple branches, supporting a base implementation that is provider agnostic that can be overridden if a provider needs special handling. Anyways, here is the code that is actually being executed.
Using cmd As IDbCommand = GetCommand(query, cmdTimeout, params) Try Dim ds As New DataSet Using dr As IDataReader = cmd.ExecuteReader Do Dim dt As New DataTable dt.Load(dr) If dt.Rows.Count > 0 Then ds.Tables.Add(dt) Loop While Not dr.IsClosed AndAlso dr.NextResult() End Using Return ds Catch ex As Exception ex.AddQueryDetails(query, params) Throw End Try End Using
When I debug and step through it, I noticed that after dt.Load(dr) -- the DataTable only holds 3 rows.
So now I wrote a sample console application to try and reproduce and i was able to. The key is, DataReader by itself seems to be find, but DataReader with DataTable.Load() seems to give an issue.
Using conn As New MySqlConnection(connSB.ToString) conn.Open() Console.WriteLine("==[select cola from test3]==") Using cmd As New MySqlCommand("select cola from test3", conn) Using dr = cmd.ExecuteReader While dr.Read Console.WriteLine(dr.GetString("cola")) End While End Using Dim dt As New Data.DataTable Using dr = cmd.ExecuteReader dt.Load(dr) Console.WriteLine($"DT Rows: {dt.Rows.Count}") End Using End Using Console.WriteLine() Console.WriteLine("==[select cola from test4]==") Using cmd As New MySqlCommand("select cola from test4", conn) Using dr = cmd.ExecuteReader While dr.Read Console.WriteLine(dr.GetString("cola")) End While End Using Dim dt As New Data.DataTable Using dr = cmd.ExecuteReader dt.Load(dr) Console.WriteLine($"DT Rows: {dt.Rows.Count}") End Using End Using End Using
Results
image
Beta Was this translation helpful? Give feedback.
All reactions
-
as an aside -- I've having the same results with SQLite databases.. but MSSQL, Oracle and Postgres are working fine.
Beta Was this translation helpful? Give feedback.
All reactions
-
One difference I can think of (between test3 and test4) is that MySqlDataReader.GetColumnSchema() returns true for DbColumn.IsKey. Perhaps this causes DataTable to de-duplicate rows when it loads?
I'm not sure if this is a bug or not. The column is certainly part of a key, but not every column in the PK is returned in the DbDataReader.
Beta Was this translation helpful? Give feedback.
All reactions
-
@bgrainger -- I think it might be a bug. When DataTable.Load() runs, it looks like for any column that has DbColumn.IsKey set to true then it sets the Unique property of DataColumn to true as well -- thus causing the deduplication.
I did a quick check with MSSQL, when query a table with a primary key, it does not set Unique to true for any DataColumn in the DataTable.
Beta Was this translation helpful? Give feedback.