Bytes IT Community

Many data entry tables - which ones hold records?

We have an asp.net app with about 200 data entry forms. Customers may
enter data into any number of forms. Each form's data is persisted in
a corresponding sql table. When data entry is complete, it needs to be
processed. Here's where the questions start.

How can we easily determine in which tables a customer has data and how
best to select that data?

We're not opposed to putting all the data in a single table. This
table would wind up having ~15 million records and constantly have CRUD
operations performed against it by up to 5000 users simultaneously.
With sufficient hardware, is this too much to ask of the db?

Oct 23 '05 #1
10 Replies


heromull (he******@gmail.com) writes:
We have an asp.net app with about 200 data entry forms. Customers may
enter data into any number of forms. Each form's data is persisted in
a corresponding sql table. When data entry is complete, it needs to be
processed. Here's where the questions start.

How can we easily determine in which tables a customer has data and how
best to select that data?

We're not opposed to putting all the data in a single table. This
table would wind up having ~15 million records and constantly have CRUD
operations performed against it by up to 5000 users simultaneously.
With sufficient hardware, is this too much to ask of the db?


Whether the 200 tables should be 1, 10, 74, or 200 is impossible to tell
without knowledge of what's in them.

But from a performance point of view, it would not really matter whether
it's one table or two hundred, provided that the single table has proper
indexing.

There are a couple of ways to find data to process:

1) Timestamp column. A timestamp column is automatically updated by SQL
Server with a database-unique value that is monotonically increasing.
(Binary, completely unrelated to date and time). The process that looks
for data would keep track of the most recent timestamp per table, and
retrieve the rows with higher timestamp value. If the process updates the
rows itself, it needs to combine the lookup with a status column. The
drawback with this solution is that the timestamp column must be indexed,
and since it's updated each time the row is updated, there will be a lot
of shuffling around in that index.
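As a sketch of option 1, assuming a hypothetical Table1 with a timestamp column RowVer and a bookmark table ProcessBookmark (both names illustrative, not from the thread), the polling query might look like:

```sql
-- Hypothetical sketch: add a timestamp (rowversion) column and index it.
ALTER TABLE Table1 ADD RowVer timestamp NOT NULL
CREATE INDEX IX_Table1_RowVer ON Table1 (RowVer)

-- The processing job remembers the highest value it has seen per table
-- and fetches anything newer on the next run.
DECLARE @LastRowVer binary(8)
SELECT @LastRowVer = LastRowVer
FROM   ProcessBookmark
WHERE  TableName = 'Table1'

SELECT Table1Id, CustomerId, Value
FROM   Table1
WHERE  RowVer > @LastRowVer
ORDER BY RowVer
-- Afterwards, persist MAX(RowVer) of the fetched rows back to ProcessBookmark.
```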

2) IDENTITY column. All tables would have an identity column, and the
process would keep track of the most recently processed value. With
this solution you can only handle inserts, not updates to existing
data.
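A sketch of option 2, again with hypothetical names: the job reads the high-water mark it saved last time, processes everything above it, and saves the new mark.

```sql
-- Hypothetical sketch of option 2: process rows inserted since the last run.
-- @LastId is whatever value the processing job persisted after its previous run.
DECLARE @LastId int, @MaxId int
SELECT @LastId = 0  -- placeholder; in practice loaded from a bookmark table

SELECT @MaxId = MAX(Table1Id) FROM Table1

SELECT Table1Id, CustomerId, Value
FROM   Table1
WHERE  Table1Id > @LastId
  AND  Table1Id <= @MaxId
-- Afterwards, save @MaxId as the new @LastId.
```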

3) Triggers on the tables that enter data about rows to process into a
separate table. Again, you may need a mechanism to differentiate between
user-entered changes and changes from your processing.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Oct 23 '05 #2

Thanks, the data to select for processing is not time based, it's based
on whatever tables a customer has data in at the time of processing,
which is determined by the customer. So, option three would be the
best and is similar to our current process. Also, the data is not
updated by the processing, only selected. Even with a table that tells
us which tables have data, how would we build the select statement(s)
to only include those tables with data?

Oct 23 '05 #3

heromull (he******@gmail.com) writes:
Thanks, the data to select for processing is not time based, it's based
on whatever tables a customer has data in at the time of processing,
which is determined by the customer. So, option three would be the
best and is similar to our current process.
Not that I see what time has to do with it. I did say timestamp, but
the timestamp datatype has nothing to do with time. Then again, if you
already have a process similar to option three, then go with that.
Also, the data is not updated by the processing, only selected. Even
with a table that tells us which tables have data, how would we build
the select statement(s) to only include those tables with data?


I cannot say that. I don't know your tables, and I don't know in which
context this process runs, etc. Since there are 200 tables, I would assume
that you have a stored procedure or a SELECT statement per table, as I
would assume that the tables generally have different sets of columns.
If they all have the same schema, that is a strong indication that you
should have one single table.
--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Oct 23 '05 #4

"heromull" <he******@gmail.com> wrote in message
news:11*********************@g14g2000cwa.googlegroups.com...
We have an asp.net app with about 200 data entry forms. Customers may
enter data into any number of forms. Each form's data is persisted in
a corresponding sql table. When data entry is complete, it needs to be
processed. Here's where the questions start.

How can we easily determine in which tables a customer has data and how
best to select that data?

We're not opposed to putting all the data in a single table. This
table would wind up having ~15 million records and constantly have CRUD
operations performed against it by up to 5000 users simultaneously.
With sufficient hardware, is this too much to ask of the db?


Don't design a database schema around a user interface. Design your database
schema around your DATA and BUSINESS RULES, then build a data access layer
that supports the UI. You say "Each form's data is persisted in a
corresponding sql table". This makes little sense to me as a description of a
data model. I suspect (admittedly on the basis of too little information)
that with a better design your perceived problems would disappear. The
statement "We're not opposed to putting all the data in a single table"
also suggests a very arbitrary approach to database design. Is your current
design a normalized one? If so, I don't understand your confusion about "How
can we easily determine in which tables a customer has data". The answer is
presumably that you use the Customer key - whatever that is in your model.

--
David Portas
SQL Server MVP
--
Oct 23 '05 #5

Erland Sommarskog wrote:
...
If they all have the same schema, that is a strong indication that you
should have one single table.
...

Yes, this is the case. They share the same schema.
David Portas wrote:
Don't design a database schema around a user interface.
I think this is our first mistake.
Design your database
schema around your DATA and BUSINESS RULES, then build a data access layer
that supports the UI. You say "Each form's data is persisted in a
corresponding sql table". This make little sense to me as a description of a
data model. I suspect (admittedly on the basis of too little information)
that with a better design your perceived problems would disappear.
I think you're right. It's that "better design" that we're
researching.
The
stataement "We're not opposed to putting all the data in a single table"
also suggests a very arbitrary approach to database design.
It does and I say that only to mean that it's not too late for us to
implement a better design.
Is your current
design a normalized one? If so, I don't understand your confusion about "How
can we easily determine in which tables a customer has data". The answer is
presumably that you use the Customer key - whatever that is in your model.


Here's DDL that describes our current "model" (I hear you laughing
already). Notice that all 200 tables have the same schema. If we
determine a single-table approach would be a better design, what would
be some areas of concern? Obviously indexing is on the list.

CREATE TABLE [Customer] (
[CustomerId] [int] IDENTITY (1, 1) NOT NULL ,
[Name] [varchar] (100),
CONSTRAINT [PK_Customer] PRIMARY KEY NONCLUSTERED
(
[CustomerId]
)
)
--we have ~200 tables with this table's schema
CREATE TABLE [Table1] ( --Table1 thru Table200
[Table1Id] [int] IDENTITY (1, 1) NOT NULL ,
[CustomerId] [int] NOT NULL ,
[Value] [nchar] (1024),
CONSTRAINT [PK_Table1] PRIMARY KEY NONCLUSTERED
(
[Table1Id]
) ,
CONSTRAINT [FK_Table1_Customer] FOREIGN KEY
(
[CustomerId]
) REFERENCES [Customer] (
[CustomerId]
)
)
GO
--each of the 200 tables have a trigger
--similar to this (i'll omit the delete trigger from the DDL)
create trigger Table1CreateInterview
on Table1
for insert
as
begin
    declare @CustomerId int
    declare @RecordId int

    select @RecordId   = Table1Id,
           @CustomerId = CustomerId
    from   inserted

    insert into interview
    (
        CustomerId,
        TableId, --represents the table that caused the trigger, eg. 1 = Table1
        RecId    --the value of the primary key of the record causing the trigger
    )
    values
    (
        @CustomerId,
        1,
        @RecordId
    )
end
GO
--a record is inserted into this table each
--time a record is written to any of the 200 tables
--a record is deleted from this table each time
--a record is deleted from any of the 200 tables
CREATE TABLE [Interview] (
[InterviewId] [int] IDENTITY (1, 1) NOT NULL ,
[CustomerId] [int] NULL ,
[TableId] [int] NOT NULL ,
[RecId] [int] NULL ,
CONSTRAINT [PK_Interview] PRIMARY KEY NONCLUSTERED
(
[InterviewId]
),
CONSTRAINT [FK_Interview_Customer] FOREIGN KEY
(
[CustomerId]
) REFERENCES [Customer] (
[CustomerId]
)
)
GO
insert into customer(name) values ('some name')
insert into table1(customerid, value) values (1, 'my value in table1')

Oct 24 '05 #6

heromull (he******@gmail.com) writes:
Don't design a database schema around a user interface.
I think this is our first mistake.


An unusually candid confession!
If they all have the same schema, that is a strong
indication that you should have one single table.

...

Yes, this is the case. They share the same schema.


You should make them one table, adding one more column that specifies
the entity that is now hidden in the table name. That table would look
like:

CREATE TABLE [Answers] ( -- replaces Table1 thru Table200
   [CustomerId] [int] NOT NULL,
   TableNo int NOT NULL,
   RowNo smallint NOT NULL,
   [Value] [nchar] (1024),
   PRIMARY KEY (CustomerId, TableNo, RowNo)
)

I've added RowNo, because I don't know if one customer can add more
than one value in the same form. If he can't, RowNo should not be
there. I've very deliberately taken out the IDENTITY column, because
this table should have a composite key. My RowNo is indeed a surrogate,
but only within the sub-realm of (CustomerId, TableNo). (And TableNo is
just a name I use, in lack of knowledge about the business domain.)

I put the key on (CustomerId, TableNo), but depending on how you use the
table, you may also need an index on (TableNo, CustomerId).
CustomerId first is good for queries like "what is customer 1233 up to?",
but not for "what do we have in table 12?".
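Concretely, that extra index on the Answers table sketched above would be created like this (the index name is my own):

```sql
-- The primary key already gives an index leading on CustomerId.
-- For "what do we have in table 12?" queries, add the reverse column order:
CREATE INDEX IX_Answers_TableNo ON Answers (TableNo, CustomerId)
```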
It does and I say that only to mean that it's not too late for us to
implement a better design.
That's great to hear!
begin
declare @CustomerId int
declare @RecordId int
select
@RecordId = Table1Id,
@CustomerId = CustomerId
from
inserted
Uh-uh, classic mistake. A trigger fires once per statement, not once
per row. Yes, as long as data comes in through that form, rows will only
arrive one by one, but then suddenly there is a batch process loading lots
of data at the same time. So write your trigger as:

INSERT interview (...)
SELECT ...
FROM inserted
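Spelled out against the Table1/Interview schema posted earlier, that set-based version might look like this (a sketch, not tested against the real database):

```sql
create trigger Table1CreateInterview
on Table1
for insert
as
begin
    -- One INSERT ... SELECT handles every row in the inserted
    -- pseudo-table, whether the statement inserted 1 row or 10,000.
    insert into interview (CustomerId, TableId, RecId)
    select CustomerId, 1, Table1Id
    from   inserted
end
```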
insert into
interview


And INSERT without a column list is very poor practice in production code.
Someone adds a column to the table, and the statement blows up. That's
bad.


--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp

Oct 24 '05 #7

Erland Sommarskog wrote:
An unusually candid confession!


Acceptance is the first step, right?

Anyway, thank you so much for the support! I think the suggestions are
going to help us out a lot. A lightbulb went off over my head when I
read your composite key explanation.

Oct 25 '05 #8

Thank you so much for the support! I think this will work for us. My
only other question would be about managing the surrogate column
(RowNo). During Inserts, is there a way to manage it within SQL Server
or would my client apps need to select max(RowNo) and increase it by 1?

Again, thanks!

Oct 26 '05 #9

On 26 Oct 2005 07:55:17 -0700, heromull wrote:
Thank you so much for the support! I think this will work for us. My
only other question would be about managing the surrogate column
(RowNo). During Inserts, is there a way to manage it within SQL Server
or would my client apps need to select max(RowNo) and increase it by 1?


Hi heromull,

A typical INSERT statement would roughly look like this:

INSERT INTO Answers (CustomerId, TableNo, RowNo, [Value])
SELECT @CustomerId, @TableNo, COALESCE(MAX(RowNo),0) + 1, @Value
FROM Answers
WHERE CustomerId = @CustomerId
AND TableNo = @TableNo
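One caveat worth noting (my addition, not from the thread): two simultaneous inserts for the same customer and table could both read the same MAX(RowNo) and then collide on the primary key. Wrapping the statement in a transaction with locking hints serializes such inserts, at some cost in blocking:

```sql
BEGIN TRANSACTION
-- UPDLOCK + HOLDLOCK keep the range read stable until commit, so two
-- concurrent callers for the same (CustomerId, TableNo) queue up rather
-- than computing the same RowNo.
INSERT INTO Answers (CustomerId, TableNo, RowNo, [Value])
SELECT @CustomerId, @TableNo, COALESCE(MAX(RowNo), 0) + 1, @Value
FROM   Answers WITH (UPDLOCK, HOLDLOCK)
WHERE  CustomerId = @CustomerId
  AND  TableNo = @TableNo
COMMIT TRANSACTION
```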

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Oct 26 '05 #10

Thanks! Works perfectly.

Oct 26 '05 #11

This discussion thread is closed; replies have been disabled.