Paul -
The only information in the event logs is the text of the failed
assertion error itself. I have never seen any OS-reported problems
with the hardware.
I hate to seem stupid, but can you be more specific about what you
mean when you say "hardware diagnostics?" Are you talking about the
simple Windows CheckDisk utility or something more advanced? This is
the first time I've used a hardware RAID controller - is Windows even
capable of checking the hardware-controlled disk array, or do I need
to use a utility provided by the RAID controller manufacturer?
Or would you suggest some sort of third-party utility for "burning in"
the hardware? Would you suspect disk drives, memory, or what? Could
it be ANY of the hardware, or just specific things?
One last question: What's the most effective method for contacting
product support if I need to do so?
Thanks,
Morgan Leppink
"Paul S Randal [MS]" <prandal@online.microsoft.com> wrote in message news:<40b684c7$1@news.microsoft.com>...[color=blue]
> Hi Morgan,
>
> Have you actually checked the event logs and run hardware diagnostics on
> your IO system to see if there are hardware problems?
>
> If so, and there are no clues there, you should call Product Support to help
> you diagnose the problem.
>
> Regards.
>
> --
> Paul Randal
> Dev Lead, Microsoft SQL Server Storage Engine
>
> This posting is provided "AS IS" with no warranties, and confers no rights.
>
> "Morgan Leppink" <mleppink@hotmail.com> wrote in message
> news:806e6d7.0405271455.1bf6a2d4@posting.google.com...[color=green]
> > Hey all -
> >
> > We are running SQL 2000 with ALL available service packs, etc.
> > applied. We just built a brand new database server, which has dual
> > 2GHz Xeons, 2GB memory, and the following disk configuration:
> >
> > RAID 1 array (2 disks) Operating System Windows Server 2003
> > RAID 1 array (2 disks) Database Logs
> > RAID 10 array (4 disks) Database Data
> >
> > Disks are SATA, with a 3Ware hardware RAID controller. The machine
> > SCREAMS.
> >
> > We run 5 databases on this machine. 2 of these are fairly large (by
> > our standards, anyway). The second largest database (and the busiest
> > and most important) is consistently generating consistency errors that
> > bring many important queries down. These are almost ALWAYS in the
> > form of index corruption on one single table. The corruption does not
> > normally occur on other tables (although it DOES happen once in a
> > while - rarely - on one of the other tables), nor does it EVER occur
> > on any other databases on the server.
> >
> > The corruption seems to happen right in the neighborhood of midnight
> > ALMOST every day, give or take a few minutes, but does not seem
> > directly associated with any of our MANY scheduled database cleanup
> > tasks (believe me, we've tried desperately to find an association
> > using SQL profiler). At midnight, our database traffic is fairly low,
> > so it does not seem associated with a high traffic level.
> >
> > We are using the FULL recovery model, with log backups every 15
> > minutes, and full backups daily at 12:15am. However, the corruption
> > happens consistently BEFORE 12:15, like between 11:50pm and 12:10am.
> > The most frustrating thing is, the database can go WEEKS without any
> > corruption at all, and then it'll go 4 or 5 days in a row with this
> > strange corruption stuff.
> >
> > ***************************************************************************
> > Typical query errors when the corruption exists include:
> > ***************************************************************************
> >
> > SQL Server Assertion: File:
> > <p:\sql\ntdbms\storeng\drs\include\record.inl>, line=1447
> > Failed Assertion = 'm_SizeRec > 0 && m_SizeRec <= MAXDATAROW'.
> >
> >
> > SQL Server Assertion: File: <recbase.cpp>, line=1378
> > Failed Assertion = 'm_offBeginVar < m_SizeRec'.
> >
> >
> > Server: Msg 3624, Level 20, State 1, Line 7
> > Location: recbase.cpp:1374
> > Expression: m_nVars > 0
> >
> >
> > Connection Broken
> >
> > ***************************************************************************
> >
> > Most of the responses to this type of issue (failed assertions) on the
> > newsgroups appear to point to hardware failures. However, this is
> > brand new hardware, AND, it seems to us that if this was a hardware
> > issue, other databases, tables, and indexes would be affected
> > randomly. Isn't that a valid assumption (that if it was hardware,
> > particularly the RAID controller, the corruption would not be in such
> > a predictable place)? What if we moved the physical database files to
> > another location on the disk? Would/could that help?
> >
> > If anyone could offer some suggestions as to what may be causing this
> > corruption, we would be eternally grateful. It is getting to be a
> > real pain in the A*** to run DBCC CHECKDB with REPAIR_ALLOW_DATA_LOSS
> > every day or two (it always seems to solve the problem without data
> > loss, but still...).
> >
> > Again, thanks in advance for your response.
> >
> >
> > Sincerely,
> >
> >
> > Morgan Leppink
> >
> > mleppink@hotmail.com[/color][/color]