Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Querying table and selecting on a few of the columns included in a primary key is generating a DISTINCT result. #1484

Unanswered
mithrandyr asked this question in Q&A
Discussion options

All - could use some help trying to figure out why this is happening. I've written a PowerShell module that wraps various database providers to make it easier to interact with databases through PowerShell (the module is called SimplySql). Anyways, I did a major rewrite last year and in the process changed my MySql provider to MySqlConnector. One of my users has reported this issue.

Basically, if a table is created with a primary key with multiple columns, but a select statement with only primary key columns included (but not all of them), then a DISTINCT result is happening.

-- Here is the SQL to create the two tables
-- Primary key
CREATE TABLE test3 (colA VARCHAR(25), colB VARCHAR(25), colC VARCHAR(25), colD VARCHAR(25), PRIMARY KEY (colA, colB, colC));
INSERT INTO test3 (colA, colB, colC, colD) VALUES ('A', 'A', 'A', 'A');
INSERT INTO test3 (colA, colB, colC, colD) VALUES ('A', 'B', 'A', 'A');
INSERT INTO test3 (colA, colB, colC, colD) VALUES ('B', 'A', 'A', 'A');
INSERT INTO test3 (colA, colB, colC, colD) VALUES ('C', 'A', 'A', 'A');
-- No primary Key
CREATE TABLE test4 (colA VARCHAR(25), colB VARCHAR(25), colC VARCHAR(25), colD VARCHAR(25));
INSERT INTO test4 (colA, colB, colC, colD) VALUES ('A', 'A', 'A', 'A');
INSERT INTO test4 (colA, colB, colC, colD) VALUES ('A', 'B', 'A', 'A');
INSERT INTO test4 (colA, colB, colC, colD) VALUES ('B', 'A', 'A', 'A');
INSERT INTO test4 (colA, colB, colC, colD) VALUES ('C', 'A', 'A', 'A');

Ok, now here is the PowerShell and the interesting results

<#
Install-Module SimplySql #use this to install the module, make sure its version 2.0 or greater
Open-MySqlConnection -Server <servername> -Database <dbname> -Credential (Get-Credential
#>
# Notice how the following statement will only return 3 rows
Invoke-SqlQuery -Query "SELECT colA FROM test3"
# This one returns 4 rows... the difference? test4 does not have a primary key
Invoke-SqlQuery -Query "SELECT colA FROM test4"

image

You must be logged in to vote

Replies: 2 comments 5 replies

Comment options

I can't reproduce this problem using standard C# to use the ADO.NET interfaces

var connectionString = "server=localhost;database=mysqltest;userid=root;password=pass";
using var mscConnection = new MySqlConnector.MySqlConnection(connectionString);
mscConnection.Open();
using var msdConnection = new MySql.Data.MySqlClient.MySqlConnection(connectionString);
msdConnection.Open();
using (var mscCommand = mscConnection.CreateCommand())
{
	mscCommand.CommandText = """
		DROP TABLE IF EXISTS test3;
		DROP TABLE IF EXISTS test4;
		-- Here is the SQL to create the two tables
		-- Primary key
		CREATE TABLE test3 (colA VARCHAR(25), colB VARCHAR(25), colC VARCHAR(25), colD VARCHAR(25), PRIMARY KEY (colA, colB, colC));

		INSERT INTO test3 (colA, colB, colC, colD) VALUES ('A', 'A', 'A', 'A');
		INSERT INTO test3 (colA, colB, colC, colD) VALUES ('A', 'B', 'A', 'A');
		INSERT INTO test3 (colA, colB, colC, colD) VALUES ('B', 'A', 'A', 'A');
		INSERT INTO test3 (colA, colB, colC, colD) VALUES ('C', 'A', 'A', 'A');

		-- No primary Key
		CREATE TABLE test4 (colA VARCHAR(25), colB VARCHAR(25), colC VARCHAR(25), colD VARCHAR(25));

		INSERT INTO test4 (colA, colB, colC, colD) VALUES ('A', 'A', 'A', 'A');
		INSERT INTO test4 (colA, colB, colC, colD) VALUES ('A', 'B', 'A', 'A');
		INSERT INTO test4 (colA, colB, colC, colD) VALUES ('B', 'A', 'A', 'A');
		INSERT INTO test4 (colA, colB, colC, colD) VALUES ('C', 'A', 'A', 'A');
		""";
	mscCommand.ExecuteNonQuery();
}
foreach (var connection in new DbConnection[] { mscConnection, msdConnection })
{
	foreach (var sql in new[] { "SELECT colA FROM test3", "SELECT colA FROM test4" })
	{
		using var command = connection.CreateCommand();
		command.CommandText = sql;
		Console.WriteLine("{0} --- {1}", command.CommandText, command.GetType().FullName);
		using var reader = command.ExecuteReader();
		{
			while (reader.Read())
			{
				Console.WriteLine(reader.GetValue(0));
			}
		}
		Console.WriteLine();
	}
}

Output:

SELECT colA FROM test3 --- MySqlConnector.MySqlCommand
A
A
B
C
SELECT colA FROM test4 --- MySqlConnector.MySqlCommand
A
A
B
C
SELECT colA FROM test3 --- MySql.Data.MySqlClient.MySqlCommand
A
A
B
C
SELECT colA FROM test4 --- MySql.Data.MySqlClient.MySqlCommand
A
A
B
C
You must be logged in to vote
1 reply
Comment options

Are you sure the problem is in what MySqlConnector returns and not in the PowerShell that's consuming it?

Comment options

It looks like you might be using MySqlDataAdapter: https://github.com/mithrandyr/SimplySql/blob/master/source/SimplySql.Engine/MySQL/MySqlProvider.vb#L33-L45

I rewrote the code to test that instead:

		using var da = command is MySqlConnector.MySqlCommand ?
			(DbDataAdapter) new MySqlConnector.MySqlDataAdapter((MySqlConnector.MySqlCommand) command) :
			new MySql.Data.MySqlClient.MySqlDataAdapter((MySql.Data.MySqlClient.MySqlCommand) command);
		Console.WriteLine("{0} --- {1}", command.CommandText, da.GetType().FullName);
		var ds = new DataSet();
		da.ReturnProviderSpecificTypes = true;
		da.Fill(ds);
		foreach (DataRow row in ds.Tables[0].Rows)
			Console.WriteLine(row[0]);
		Console.WriteLine();

Same output:

SELECT colA FROM test3 --- MySqlConnector.MySqlDataAdapter
A
A
B
C
SELECT colA FROM test4 --- MySqlConnector.MySqlDataAdapter
A
A
B
C
SELECT colA FROM test3 --- MySql.Data.MySqlClient.MySqlDataAdapter
A
A
B
C
SELECT colA FROM test4 --- MySql.Data.MySqlClient.MySqlDataAdapter
A
A
B
C
You must be logged in to vote
4 replies
Comment options

@bgrainger -- thanks for taking a look. I know my code is a bit confusing (there are multiple branches, supporting a base implementation that is provider agnostic that can be overridden if a provider needs special handling. Anyways, here is the code that is actually being executed.

 Using cmd As IDbCommand = GetCommand(query, cmdTimeout, params)
 Try
 Dim ds As New DataSet
 Using dr As IDataReader = cmd.ExecuteReader
 Do
 Dim dt As New DataTable
 dt.Load(dr)
 If dt.Rows.Count > 0 Then ds.Tables.Add(dt)
 Loop While Not dr.IsClosed AndAlso dr.NextResult()
 End Using
 Return ds
 Catch ex As Exception
 ex.AddQueryDetails(query, params)
 Throw
 End Try
 End Using

When I debug and step through it, I noticed that after dt.Load(dr) -- the DataTable only holds 3 rows.

So now I wrote a sample console application to try and reproduce and i was able to. The key is, DataReader by itself seems to be find, but DataReader with DataTable.Load() seems to give an issue.

 Using conn As New MySqlConnection(connSB.ToString)
 conn.Open()
 Console.WriteLine("==[select cola from test3]==")
 Using cmd As New MySqlCommand("select cola from test3", conn)
 Using dr = cmd.ExecuteReader
 While dr.Read
 Console.WriteLine(dr.GetString("cola"))
 End While
 End Using
 Dim dt As New Data.DataTable
 Using dr = cmd.ExecuteReader
 dt.Load(dr)
 Console.WriteLine($"DT Rows: {dt.Rows.Count}")
 End Using
 End Using
 Console.WriteLine()
 Console.WriteLine("==[select cola from test4]==")
 Using cmd As New MySqlCommand("select cola from test4", conn)
 Using dr = cmd.ExecuteReader
 While dr.Read
 Console.WriteLine(dr.GetString("cola"))
 End While
 End Using
 Dim dt As New Data.DataTable
 Using dr = cmd.ExecuteReader
 dt.Load(dr)
 Console.WriteLine($"DT Rows: {dt.Rows.Count}")
 End Using
 End Using
 End Using

Results
image

Comment options

as an aside -- I've having the same results with SQLite databases.. but MSSQL, Oracle and Postgres are working fine.

Comment options

One difference I can think of (between test3 and test4) is that MySqlDataReader.GetColumnSchema() returns true for DbColumn.IsKey. Perhaps this causes DataTable to de-duplicate rows when it loads?

I'm not sure if this is a bug or not. The column is certainly part of a key, but not every column in the PK is returned in the DbDataReader.

Comment options

@bgrainger -- I think it might be a bug. When DataTable.Load() runs, it looks like for any column that has DbColumn.IsKey set to true then it sets the Unique property of DataColumn to true as well -- thus causing the deduplication.

I did a quick check with MSSQL, when query a table with a primary key, it does not set Unique to true for any DataColumn in the DataTable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Category
Q&A
Labels
None yet

AltStyle によって変換されたページ (->オリジナル) /