473,569 Members | 2,634 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Verifying Data Integrity

I'd like to write a program to verify the consistency between various
database replicas in our environment. The rules are as follows:

- I'm given two servers, each with a database
- I don't know the schema of the database
- The databases are large (some tables with 10,000,000+ rows)

What would be the best way to perform a pretty-good-guess analysis of
whether or not these two databases were in sync?

Currently I do the following:

- Give me a list of tables. If the table lists don't match, then call the
databases inconsistent.
- Give me a row count of each table. If the row counts don't match, call
the table inconsistent

What I'd like to do is actually do something with data, but these are large
databases and I don't think it's reasonable to compare each row. What do
you guys suggest?
Michael
Jul 20 '05 #1
4 2721
I suggest you use php or other languages, and look at the database
files. In mysql, each table consists of multiple physical files
(data, index, etc.). You could try to compare the size of files, and
this should give you a good idea.

I would guess that can catch most of the inconsistency problems. If
you want to go deeper, then we get to row by row comparsion IMHO.

steve

--
http://www.dbForumz.com/ This article was posted by author's request
Articles individually checked for conformance to usenet standards
Topic URL: http://www.dbForumz.com/mySQL-Verify...ict120306.html
Visit Topic URL to contact author (reg. req'd). Report abuse: http://www.dbForumz.com/eform.php?p=403043
Jul 20 '05 #2
"steve" <Us************ @dbForumz.com> wrote in message
I suggest you use php or other languages, and look at the database
files. In mysql, each table consists of multiple physical files
(data, index, etc.). You could try to compare the size of files, and
this should give you a good idea.

I would guess that can catch most of the inconsistency problems. If
you want to go deeper, then we get to row by row comparsion IMHO.


Hmm...thanks, Steve. I actually had a further question and wonder if anyone
here has the answer. I've tried doing the following:

- Grab the last row of table 1 (DBI, fetchrow())
- Grab the last row of table 2
- Do an MD5 hash on each (Digest::MD5, md5_base64)

I've found that for some reason, I get the same hash even though the data is
completely different. I'm hashing the output of fetchrow() rather than
parsing it first because I don't know the actual schema of the table that I
have to examine. Can someone tell me why my hashes are coming out the same?
Michael
Jul 20 '05 #3
Michael wrote:
"steve" <Us************ @dbForumz.com> wrote in message
I suggest you use php or other languages, and look at the database files. In mysql, each table consists of multiple physical files
(data, index, etc.). You could try to compare the size of files, and this should give you a good idea.

I would guess that can catch most of the inconsistency problems. If you want to go deeper, then we get to row by row comparsion IMHO.
Hmm...thanks, Steve. I actually had a further question and wonder

if anyone here has the answer. I’ve tried doing the following:

- Grab the last row of table 1 (DBI, fetchrow())
- Grab the last row of table 2
- Do an MD5 hash on each (Digest::MD5, md5_base64)

I’ve found that for some reason, I get the same hash even though the data is completely different. I’m hashing the output of fetchrow() rather than parsing it first because I don’t know the actual schema of the table that I have to examine. Can someone tell me why my hashes are coming out the same?

Michael

Michale,
I don’t know what your problem is, but I am sure someone else can
help.

Still, I think the easiest is to do an "ls" on the db directory, and
the resulting data set has to be exactly equivalent. I would use
"ls" without the parameters that would show dates, since dates would
be different.

If the other approach does not work, this would be fairly easy to do
with Perl or PHP.

--
http://www.dbForumz.com/ This article was posted by author's request
Articles individually checked for conformance to usenet standards
Topic URL: http://www.dbForumz.com/mySQL-Verify...ict120306.html
Visit Topic URL to contact author (reg. req'd). Report abuse: http://www.dbForumz.com/eform.php?p=406100
Jul 20 '05 #4
Sorry, I meant "Michael"!

Actually my approach may not work, if mysql does some kind of file
optimization automatically where the file sizes may then be different.
Please disregard.

Your approach is a good one. Are you using Perl? Then I would go to
a Perl discussion group to get a better response.

--
http://www.dbForumz.com/ This article was posted by author's request
Articles individually checked for conformance to usenet standards
Topic URL: http://www.dbForumz.com/mySQL-Verify...ict120306.html
Visit Topic URL to contact author (reg. req'd). Report abuse: http://www.dbForumz.com/eform.php?p=406101
Jul 20 '05 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

10
3550
by: Chris Sharman | last post by:
I'm doing a rough validation of an email address client-side (using js), but it's not enough - our customer service people are apparently incapable of typing in an email address without error - we get spaces, missing '@', multiple '@' which we can field, and simple mis-spellings, which we can't. I'd like to automate a 'mailto:' popup for...
0
1893
by: Magic1812 | last post by:
Magic Software invites you to join us this coming Tuesday (January 27th, 2004) at 12:00 EDT / 17:00 GMT for a FREE live Webinar: Title: Data Integrity Using eDeveloper Date: January 27, 2004 Time: 12:00 PM EST / 17:00 GMT Presenter: Yuval Asheri
4
1562
by: HD | last post by:
Hi, I was wondering if there is a way of verifying information that is in the registry of the user's computer??? Or is there a way of checking if a file is on the c:\ of the user's computer?? I want to make a certain verification of who is entering my website... but the IP address is never the same (since it is randomly given to the user)...
2
9967
by: Brice | last post by:
Hello, Sorry if this is a basic question but I can't seem to find the answer in the DB2 tutorial series or my DB2 manuals. How does one check the data integrity and referential integrity of an instance before running a backup? I see how one can run "db2ckbkp" after the backup has run, and I've seen some discussion of checksums in past...
5
1545
by: DS | last post by:
How does one set-up a macro to do the following. When you enter an Employee Number in a log-in box Access checks the Employee Table to see if thats a valid Employee number. if it is the Macro continues along its way, if not it gives a message box saying that "This is not a valid Employee ID" DS
41
4648
by: laimis | last post by:
Hey guys, I just recently got introduced to data mappers (DTO mapper). So now I have a SqlHelper being used by DTOMapper and then business layer is using DTOMapper when it needs to persist object to database or load them back. Everything is working nicely so far. My question is, is it OK practice to use DTOMapper rfom the presentation...
5
3997
by: Geisler, Jim | last post by:
So, as far as I know, PostgreSQL does not have any way of verifying the loss of referential integrity. Are there any recommended methods or utilities for checking referential integrity in a PostgreSQL database?
1
1948
by: foothills bhc | last post by:
I have a problem with verifying content of controls on a form before closing the form or moving to the next form "record" (i.e., when moving to the next row of my form's record source). HERE'S THE LONG EXPLANATION OF THE PROBLEM First, A Description of the Form: I have developed a form for entering survey questionnaire data. The data to...
16
5642
by: Brian Tkatch | last post by:
Is there a way to check the order in which SET INTEGRITY needs to be applied? This would be for a script with a dynamic list of TABLEs. B.
0
7701
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main...
0
7615
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language...
0
8130
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that...
1
7677
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...
0
7979
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the...
0
5219
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert...
0
3653
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in...
0
3643
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
0
940
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.