
DB Replication or Table Replication via triggers?

Hello everyone,

I am involved in a scenario where there is a huge (SQL Server 2005)
production database containing tables that are updated multiple times
per second. End-user reports need to be generated against the data in
this database, and so the powers-that-be came to the conclusion that a
reporting database is necessary in order to offload report processing
from production; of course, this means that data will have to be
replicated to the reporting database. However, we do not need all of
the data in the production database, and perhaps filtering criteria
can be established so that only certain rows are replicated over to the
reporting database as they're inserted (and possibly updated/deleted).
The current thought process is that the programmers designing the
queries/reports will know exactly what data they need from production
and be able to modify the replication criteria as needed. For example,
programmer A might write a report where the data he needs can be
expressed as a simple replication criterion for table T where column X
= "WOOD" and column Y = "MAHOGANY". Programmer B might come along a
month later and write a report that relies on the same table T where
column X = "METAL" and column Z in (12, 24, 36). Programmer B will
have to modify Programmer A's replication criteria in such a way as to
accommodate both reports, in this case something like "Copy rows from
table T where (col X = "WOOD" and col Y = "MAHOGANY") or (col X =
"METAL" and col Z in (12, 24, 36))". The example I gave is really
trivial of course but is sufficient to give you an idea of what the
current thought-process is.
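
To make the current proposal concrete, here is a rough sketch of what one of
these triggers might look like; the table, column, server, and database names
are all made up for illustration, and a real version would also need companion
triggers (or equivalent logic) for updates and deletes:

-- Hypothetical trigger that copies only matching rows to the reporting
-- server as they are inserted (all names are illustrative).
CREATE TRIGGER dbo.trg_T_CopyToReporting
ON dbo.T
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- This WHERE clause is the "replication criteria" that report writers
    -- would have to keep extending as new reports come along.
    INSERT INTO REPORTSRV.ReportingDB.dbo.T (ID, X, Y, Z)
    SELECT i.ID, i.X, i.Y, i.Z
    FROM inserted AS i
    WHERE (i.X = 'WOOD'  AND i.Y = 'MAHOGANY')
       OR (i.X = 'METAL' AND i.Z IN (12, 24, 36));
END;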

I assume that this is a requirement that many of you may have
encountered in the past and I am wondering what solutions you were
able to come up with. Personally, I believe that the above method is
prone to error (in this case the use of triggers to specify
replication criteria) and I'd much rather use replication services to
copy tables in their entirety. However, this does not seem to be an
option in my case due to the sheer size of certain tables. Is there
anything out there that performs replication based on complex
programmer-defined criteria? Are triggers a viable alternative? Any
alternative out-of-the-box solutions?

Any feedback would be appreciated.

Regards!

Anthony

Apr 17 '07 #1
I'd recommend transactional replication with a nosync initialization. This
is where the initial setup on the reporting server is achieved by using a
restore of the database and after that, only subsequent changes are sent
down. If you are using SQL Server 2005, greater concurrency can be achieved
by using the read committed snapshot isolation level.
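
As a rough sketch of the pieces involved (the publication, server, and path
names below are placeholders, and the exact options should be checked in BOL
before relying on them):

-- At the publisher: mark the publication as initializable from a backup.
EXEC sp_addpublication
    @publication = N'ProdPub',
    @status = N'active',
    @allow_initialize_from_backup = N'true';

EXEC sp_addarticle
    @publication = N'ProdPub',
    @article = N'T',
    @source_owner = N'dbo',
    @source_object = N'T';

-- Still at the publisher: register the subscription against a restored
-- backup of the production database, so only later changes are sent down.
EXEC sp_addsubscription
    @publication = N'ProdPub',
    @subscriber = N'REPORTSRV',
    @destination_db = N'ReportingDB',
    @sync_type = N'initialize with backup',
    @backupdevicetype = N'disk',
    @backupdevicename = N'\\backupshare\ProdDB.bak';

-- On the reporting database: reduce blocking between the distribution
-- agent and the report queries.
ALTER DATABASE ReportingDB SET READ_COMMITTED_SNAPSHOT ON;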
Cheers,
Paul Ibison SQL Server MVP, www.replicationanswers.com
Apr 17 '07 #2
Hello Paul,

Thank you for your response. I will look into SQL Server 2005's
replication and static row-level filtering; are there any books or
websites you might recommend? I will need to be able to set up and
modify the row-filter criteria programmatically, and the reviews at
Barnes and Noble for "Pro SQL Server 2005 Replication" are pretty dismal.

Regards,

Anthony


Apr 17 '07 #3
Have a look at Hilary's book for snapshot and transactional, but if you're
after more merge info and don't like the Pro book then it's really BOL that
you need and then doing some scenarios for yourself to gain experience. As
for websites, I have some useful info on the site below and there are other
articles out on the various SQL Server sites you can get by googling, but
nothing I think specific to your requirements.
BTW this doesn't really lend itself to modifying the filters dynamically. At
least this is not as straightforward as you might think. Normally the
partitions are well designed to start with. If you want something more
dynamic, then I'd not filter at all in replication and I'd use filters on
the client application instead.
Cheers,
Paul Ibison SQL Server MVP, www.replicationanswers.com

Apr 17 '07 #4
Hello Paul,
On Apr 17, 6:26 pm, "Paul Ibison" <Paul.Ibi...@Pygmalion.Com> wrote:
Have a look at Hilary's book for snapshot and transactional, but if you're
after more merge info and don't like the Pro book then it's really BOL that
you need and then doing some scenarios for yourself to gain experience. As
for websites, I have some useful info on the site below and there are other
articles out on the various SQL Server sites you can get by googling, but
nothing I think specific to your requirements.
Will do.
BTW this doesn't really lend itself to modifying the filters dynamically. At
least this is not as straightforward as you might think. Normally the
partitions are well designed to start with. If you want something more
dynamic, then I'd not filter at all in replication and I'd use filters on
the client application instead.
Cheers,
Paul Ibison SQL Server MVP, www.replicationanswers.com
Ahhh... then that's a problem. I would definitely need the ability to
programmatically and dynamically change the filtering criteria as the
need arises, in this case every time a new report is requested that
needs a subset of data not being captured by the replication process.
You would think that this is such a common scenario... Also, filtering
on the client side is not an option either, since that would mean that
all of the data would get replicated to the reporting db. I could have
sworn that I read in MSDN that the filters could be changed via stored
procs though... I'll have to look that up.
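
If memory serves, the procedures involved would be sp_articlefilter and
sp_articleview; a rough sketch of what changing a filter might look like
(the names are placeholders, and I believe this invalidates the current
snapshot and can force subscribers to reinitialize):

-- Hypothetical change of an existing article's row filter.
EXEC sp_articlefilter
    @publication   = N'ProdPub',
    @article       = N'T',
    @filter_name   = N'FLTR_T_Reporting',
    @filter_clause = N'(X = ''WOOD'' AND Y = ''MAHOGANY'')
                       OR (X = ''METAL'' AND Z IN (12, 24, 36))',
    @force_invalidate_snapshot = 1,
    @force_reinit_subscription = 1;

-- Regenerate the synchronization view that defines the filtered article.
EXEC sp_articleview
    @publication   = N'ProdPub',
    @article       = N'T',
    @view_name     = N'SYNC_T_Reporting',
    @filter_clause = N'(X = ''WOOD'' AND Y = ''MAHOGANY'')
                       OR (X = ''METAL'' AND Z IN (12, 24, 36))',
    @force_invalidate_snapshot = 1,
    @force_reinit_subscription = 1;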

Thanks for your help Paul!

Anthony

Apr 17 '07 #5
Is it possible to create views, then configure things so that just
those views are replicated as tables on the second server?
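
Something along these lines, perhaps; all of the names are invented and I
haven't checked the indexed-view restrictions, so treat it as a sketch only:

-- Hypothetical schema-bound view over just the rows the reports need.
CREATE VIEW dbo.V_T_Reporting
WITH SCHEMABINDING
AS
SELECT ID, X, Y, Z
FROM dbo.T
WHERE (X = 'WOOD'  AND Y = 'MAHOGANY')
   OR (X = 'METAL' AND Z IN (12, 24, 36));
GO

-- A unique clustered index turns it into an indexed view.
CREATE UNIQUE CLUSTERED INDEX IX_V_T_Reporting
    ON dbo.V_T_Reporting (ID);
GO

-- Publish the indexed view so that it arrives as a table on the subscriber.
EXEC sp_addarticle
    @publication   = N'ProdPub',
    @article       = N'V_T_Reporting',
    @source_owner  = N'dbo',
    @source_object = N'V_T_Reporting',
    @type          = N'indexed view logbased';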
Apr 18 '07 #6
Hello Ed,

That's a very good idea. If it turns out that the filter isn't
dynamically configurable, then perhaps using a view as a filter and
replicating the view can compensate, since views can be modified at
any time. However, I doubt that replication can be done on a view
rather than on a table. I'll have to check it out!

Regards,

Anthony


Apr 18 '07 #7
I just finished reading up on using indexed views versus a filter for
replication, and it turns out that the view is much slower than a
filter (about 3x as slow) because the log reader has to log each
transaction twice: once for the view and once for the table. In my
case performance is of utmost concern, so the overhead involved is not
something they can live with. Sigh...

Apr 18 '07 #8
This may sound like a stupid question, but are you sure replicating
the whole database isn't an option? I know you've described the
database as huge, but one man's huge is another man's insignificant (or
the other way around).

It just sounds like you're putting in a lot of work when you may be
able to keep it simple. Apologies if this is a path you've already
worn smooth, just wondering what has made you sure that bog standard
replication isn't the way to go.

Damien

Apr 18 '07 #9
Hello Damien,

I would *love* to have a full replication going and not have to worry
about the added complexity of creating dynamic filters or triggers,
but the powers that be simply do not consider it an option. Since I'm
not the one that makes the decisions, I can only go by whatever
options are available. They want ONLY a subset of data to be captured,
nothing more. That would be fine with me if the filter were static (i.e.,
not subject to change every time a new report is requested), but given
the requirements I am in the same camp as you that a full replication
would be best.

Regards,

Anthony

Apr 18 '07 #10
The powers that be may start considering it an option if you give them a cost
analysis, depending on whether their previous motivation was "wouldn't
it be nice if" (yes, but) or "we think this is cheaper" (no it isn't)
or "this is required for security reasons" (ugh, okay) or whatever.
Apr 18 '07 #11
From my meager experience of replication, it seems clear that the database
has to be really huge - several terabytes - to make dynamic filtering
defensible from a cost perspective. It would be difficult to develop, and
difficult to maintain and manage.

The only serious option I see to full replication is a static subset.
That is, define what will be supported in replication V1; if a new
requirement comes along that is not covered, it has to wait for V2. The
idea would of course be to strip out only the really big stuff with a low
probability of being included.

And this is what you should tell the powers that be: replicating the entire
database will be far less expensive than changing what is replicated
dynamically.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Apr 19 '07 #12
