2

I have n SQL Server databases with the same data schema.

On the other hand i have a web application that needs to retrieve, for some entities, the union of the information stored into the n databases.

A possible solution is to apply the ETL operations (Extract, transform and load) using SSIS (SQL Server Integration Services) and BIDS (Business Intelligence Development Studio).

With the application of the ETL operations I obtain a database (warehouse) that can be used from my web application. See the picture below for a basic sketch.

schema1

The problem is that the web application not only access the warehouse for read, but also for write and update information.

If an entity is updating into the warehouse, the same update need to be applied into the original source.

NB. the information into the source and warehouse can be updating frequently.

Can replication be a good approach to sync the update into warehouse to the right source or there is another tools that I can use?

Maybe there is another strategy instead of ETL and warehouse that I can consider?

marc_s
9,0626 gold badges46 silver badges52 bronze badges
asked Oct 26, 2013 at 7:57

4 Answers 4

1

If your update/insert always happens on a single one of the n databases AND you read a low number of rows on each request, you can use Linked Servers and Distributed Partitioned Views (DPV) to create the illusion of a single database: http://technet.microsoft.com/en-us/library/ms188299(v=sql.105).aspx. This can be done without introducing the complexity of replication.

DPV have several limitations and may give you poor performance if you are not careful. Without knowing more details of the queries you want to run - I can only advise that you try it and see if solves your problem.

answered Dec 28, 2013 at 15:06
0

One option would be to replace your n identical SQL Server databases with writable views against your data warehouse.

In this way, you would keep all of your data in one place, but you could use views to filter the data such that each subset is viewable as a distinct entity. Since you really only have one copy of the data, it won't matter whether the insert/update/delete is done to the filtered view or to the unfiltered table. Both sides will see it at the same time and there will never be a synchronization problem.

Note that in order for a view to be writable, certain conditions must be enforced. See here.

answered Oct 26, 2013 at 12:50
4
  • Sorry, but i don't understand very well what you say. Each Database needs to have updatable views in it? My web application needs to access in read/write to those views? Commented Oct 28, 2013 at 13:05
  • My suggestion is that you create views in the main database instead of using tables in multiple databases. This would allow many users to have distinct subsets of your master data without having two data sources to get out of sync. Commented Oct 28, 2013 at 22:30
  • so, if i have 2 database, for example A and B, i can create a View in B linked to a table in A (by linked server?) and i can read and write over that View? One of the pre-requisites of the writeble view isn't that needs to stay in the same database (A in this case)? Commented Oct 29, 2013 at 8:28
  • @ced - I'm suggesting you do away with database B and just keep one database (A - your data warehouse). Commented Oct 29, 2013 at 10:47
0

As far as I know, the update of the Warehouse data is not a good practice. But, if you really need to perform this action, you can make the following trick:

  1. Replication is a good idea in case when you can consent a delay between data changing and receiving the changed data in "select" query.
  2. The updates made by your application must be performed against the needed Data Source and, by replication mechanism, will be transferred to the Warehouse Data. This can be made by using mediator, main task of which will be the needed Data Source for change making definition.
  3. Transactional replication need to be used from the Data Sources to the Warehouse Data, which will synchronize you data in the Warehouse Data database.

In this case you can avoid the lock during the Warehouse Data updating.

answered Oct 27, 2013 at 15:08
2
  • The warehouse is only a possibile strategy. Any other possibile approaches is accepted! Commented Oct 28, 2013 at 15:12
  • Warehouse data is the name of your database. The main idea was to separate input and output threads. All input (changes) need to be followed to the Data Source database and the output need to be taken from the Warehouse Data. Transfer from the Data Source databases to the Warehouse Data will be made by replication mechanism. Commented Oct 28, 2013 at 15:35
0

Another option would be to leverage Change Data Capture to look for specific writes to the Data Warehouse that need to be written back to your source DB's. However, if you are using CDC on your source DB's as well, this can be tricky since each update will essentially be round tripped.

This approach would allow you to use SSIS as the method to do your updates both directions.

answered Nov 1, 2013 at 19:41

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.