I have created a simple SSIS package that queries a table and extracts data to a flat CSV file. In production this extract could be millions of rows and I want to split the flat file destinations into multiple files based on row count.
That is, create a new file each time we hit 100,000 rows, with filenames something like:
- SomeName_01_date.csv
- SomeName_02_date.csv
I have found a paid-for tool by ZappySys that will do this, but I cannot work out how to do it with just the standard SSIS toolbox. I may be missing something really simple. I have found other posts and videos, but some of them, such as those on the techbrothersit website, rely on additional code outside the standard tool set.
Edit:
After reading up, and from the comments, this looks to be harder than expected.
If I change the process to split the flat files based on a date column in the table, would that be more straightforward?
The table has a short date column in year-month-day format (for example 2020-07-30). Each CSV file would contain just one day's worth of extracted data (which could be 100K+ rows), and that data is then deleted from the table. The deletion will occur only after all data has been extracted.
I am trying to use a Foreach Loop or For Loop container but am struggling, as this is totally new to me. Any help would be appreciated.
- There are some interesting things here: stackoverflow.com/questions/1001776/… - it may be easier to write the file once out of SSIS, then split it using a simple C# program called in a script task. This is much easier in UNIX with split (kb.iu.edu/d/afar) - if you've installed a Bash environment you could conceivably do that. – user212533, Jul 29, 2021 at 16:21
- Thanks for the suggestion mate. I have edited the question now as this feels simpler; do you have any guidance? – Stockburn, Jul 30, 2021 at 2:03
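The "write the file once, then split it" idea from the first comment could look roughly like the sketch below. It is in Python rather than the C# the comment mentions, and it is only an illustration: the function name `split_csv`, the `SomeName` base name, the `%Y%m%d` date stamp, and the assumption that the first row of the extract is a header are all hypothetical choices, not anything from the package in the question.

```python
import csv
from datetime import date
from itertools import islice
from pathlib import Path


def split_csv(source, out_dir, base_name="SomeName", rows_per_file=100_000, stamp=None):
    """Split `source` into numbered CSV files of at most `rows_per_file` data rows."""
    stamp = stamp or date.today().strftime("%Y%m%d")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(source, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None)  # assumption: the first row is a header
        if header is None:
            return written  # empty input, nothing to split
        part = 1
        while True:
            # Pull the next block of rows without loading the whole file.
            chunk = list(islice(reader, rows_per_file))
            if not chunk:
                break
            target = out_dir / f"{base_name}_{part:02d}_{stamp}.csv"
            with open(target, "w", newline="") as dst:
                writer = csv.writer(dst)
                writer.writerow(header)  # repeat the header in every part
                writer.writerows(chunk)
            written.append(target)
            part += 1
    return written
```

Calling `split_csv("extract.csv", "out")` would produce `SomeName_01_<date>.csv`, `SomeName_02_<date>.csv`, and so on, with the last file holding whatever rows remain.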
1 Answer
Why not limit the data via the query that gets executed, either with the OFFSET and FETCH clauses or with a predicate whose window you slide forward on each iteration in SSIS?
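As a rough illustration of the OFFSET and FETCH option, the sketch below only builds the T-SQL text for each 100,000-row window; it does not run anything against SQL Server. The helper name `paged_queries`, the table `dbo.Extract`, and the ordering column `Id` are hypothetical. Two assumptions matter: the ORDER BY column must give a stable, unique ordering, otherwise rows can be repeated or skipped between windows, and the identifiers are pasted into the string unescaped, which is acceptable only for trusted, hard-coded names.

```python
import math


def paged_queries(table, order_by, total_rows, page_size=100_000):
    """Yield (file_number, T-SQL text) for each window of `page_size` rows."""
    pages = math.ceil(total_rows / page_size)
    for page in range(pages):
        offset = page * page_size
        sql = (
            f"SELECT * FROM {table} "
            f"ORDER BY {order_by} "  # assumption: unique, stable sort key
            f"OFFSET {offset} ROWS FETCH NEXT {page_size} ROWS ONLY;"
        )
        yield page + 1, sql


if __name__ == "__main__":
    # Hypothetical table and key column, 250,000 rows in total.
    for number, sql in paged_queries("dbo.Extract", "Id", 250_000):
        print(f"SomeName_{number:02d}", sql)
```

In a package, the same offset arithmetic would presumably live in a loop counter variable feeding the source query, with the file number feeding the flat file connection string.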
- Thanks for this mate. I have edited the question after thinking about it; do you have any advice? – Stockburn, Jul 30, 2021 at 2:01
- @Stockburn No problem, sounds like you're going with the predicate approach by using a date column to filter on. That will probably work for you. – J.D., Jul 30, 2021 at 3:29
- Just for those interested: I solved this by using a script task to obtain a list of archive dates from the table, dropping the results into a variable, and using a Foreach Loop container to loop through them and create the flat files. – Stockburn, Aug 17, 2021 at 22:39
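The shape of that last solution (fetch the list of dates, then loop and write one file per date) can be sketched outside SSIS as follows. This is not the poster's package: it is a Python stand-in that uses an SQLite connection in place of SQL Server, and the table `extract_rows`, its columns `id`, `archive_date`, and `payload`, and the function `export_by_date` are all hypothetical names chosen for the example.

```python
import csv
import sqlite3
from pathlib import Path


def export_by_date(conn, out_dir, base_name="SomeName"):
    """Write one CSV per distinct archive_date and return the file paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Step 1: list the archive dates (the script task's role in the package).
    dates = [row[0] for row in conn.execute(
        "SELECT DISTINCT archive_date FROM extract_rows ORDER BY archive_date"
    )]
    written = []
    # Step 2: one pass per date (the Foreach Loop container's role).
    for day in dates:
        cur = conn.execute(
            "SELECT id, archive_date, payload FROM extract_rows "
            "WHERE archive_date = ? ORDER BY id",
            (day,),
        )
        target = out_dir / f"{base_name}_{day}.csv"
        with open(target, "w", newline="") as dst:
            writer = csv.writer(dst)
            writer.writerow([col[0] for col in cur.description])  # header row
            writer.writerows(cur)  # stream the rows for this date only
        written.append(target)
    return written
```

The delete step is left out on purpose: as the question describes, it would run once, after every file has been written.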