I try to sync data from MS SQL to postgres with SSIS. I configure ODBC destination using psqlodbc_12_01 driver downloaded from here
My SSIS package has a single data flow task which just gets data from a single MS SQL table and transfer it to Postgres. Here is MS SQL table
CREATE TABLE [dbo].[DailyGgrNgr](
[Date] [DATETIME] NOT NULL,
[PlayerId] [INT] NOT NULL,
[GameId] [INT] NOT NULL,
[GGR] [DECIMAL](18, 4) NOT NULL,
[NGR] [DECIMAL](18, 4) NOT NULL,
[BetCount] [INT] NOT NULL,
[WinCount] [INT] NOT NULL,
[BetAmount] [DECIMAL](18, 4) NOT NULL,
[WinAmount] [DECIMAL](18, 4) NOT NULL
)
And this the PostgreSQl table
CREATE TABLE public."DailyGgrNgr" (
"Date" timestamp NOT NULL,
"PlayerId" int4 NOT NULL,
"GameId" int4 NOT NULL,
"GGR" numeric(18,4) NOT NULL,
"NGR" numeric(18,4) NOT NULL,
"BetCount" int4 NOT NULL,
"WinCount" int4 NOT NULL,
"BetAmount" numeric(18,4) NOT NULL,
"WinAmount" numeric(18,4) NOT NULL
);
The problem is that data sync is done row by row, instead of bulk as expected. I understand this by profiling my postgres server and there was only a single insert like this
INSERT INTO "public"."DailyGgrNgr" ( "Date","PlayerId","GameId","GGR","NGR","BetCount","WinCount","BetAmount","WinAmount") VALUES (1,ドル 2,ドル 3,ドル 4,ドル 5,ドル 6,ドル 7,ドル 8,ドル 9ドル)
SSIS Destination is configured to process rows in batches as you can see from the schreenshot below
However it still inserts data row by row. Below you can see my ODBC Data Source configs. enter image description here enter image description here enter image description here
I believe the problem is with driver because when I use other commercial drivers they works fine. Also according to Microsoft - "If you select Batch and the provider does not support this method, the ODBC destination automatically switches to the Row-by-row mode."
I wonder if there is a possibility to configure the driver to process the rows in batches instead of row by row or is there another free ODBC driver for postgres that allows batch processing.
My Postgres version is PostgreSQL 12.2, SQL Server 2016.
-
Does it have to be ODBC, or are you open to using a .NET script task?Mr.Brownstone– Mr.Brownstone2020年05月02日 17:58:48 +00:00Commented May 2, 2020 at 17:58
-
I am not familiar with .Net scripting and it will be much easier for me to perform this with ODBC destination.Artashes Khachatryan– Artashes Khachatryan2020年05月02日 18:49:38 +00:00Commented May 2, 2020 at 18:49
-
What is the data access mode set you on your connection manager?Mr.Brownstone– Mr.Brownstone2020年05月02日 19:38:18 +00:00Commented May 2, 2020 at 19:38
-
Data access mode is set to "batch". You can see "InsertMethod" param in the screenshoot attached (0 - row by row, 1 - batch).Artashes Khachatryan– Artashes Khachatryan2020年05月02日 19:53:42 +00:00Commented May 2, 2020 at 19:53
2 Answers 2
Isn't this a case for a tools rethink ? How about BCP from SQL Server to write out a csv text file & Bulk Load on the PostgresSQL side to do the load ?
-
1Disks are slow, memory is fast. When you are working with SSIS it fetches the data from Source to memory and then pass it to destination, so nothing is wrote to disk. Another possible problem might be if I want to do some transformation of data before writing it to destination. With tools like SSIS it is possible, with just writing data to csv it would be hard to implement.Artashes Khachatryan– Artashes Khachatryan2020年05月07日 06:17:51 +00:00Commented May 7, 2020 at 6:17
You are using psqlodbc_12_01, however, batch execution support was introduced by this commit and should be available in release 12_02.