By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
448,781 Members | 1,656 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 448,781 IT Pros & Developers. It's quick & easy.

Opinion about design needed (splitting string data)

P: n/a
Hi to everyone,

My problem is, that I'm not so quite sure, which way should I go.

The user is inputing by second part application a long string (let's
say 128 characters), which are separated by semiclon.
Example:

A20;BU;AC40;MA50;E;E;IC;GREEN

Now: each from this position, is already defined in any other table, as
a separate record. These are the keys lets say. It means, a have some
properities for A20, BU, aso.

Because this long inputed string, is a property of device (whih also
has a lot of different properities) I could do two different ways of
storing data:

1. By writing, in SP, just encapsulate each of the position separated
by semicolon, and write into a different table with index of device,
and the position in long stirng nearly in this way:

Major device data table
ID AnyData1 AnyData2 ... AnyData3
123 MZD12 XX77 .... any comment text
124 MZD13 XY55 ... any other comment

String data Table
fk_deviceId position value
123 1 A20
123 2 BU
123 3 AC40
.....
123 8 GREEN

The device table, contains also a pointer (position), which might
change, to "hglight" specified position.

Then, I can very easly find all necessary data. The problem is, I need
to move the device record data (from other table) very often into other
history table (by each update). That will mean, that I also need to
move all these records from 1 -8 for example to a separate history
table, holding the index for a history device dataset. This is a little
inconvinience in this, and in my opinion, it will use to much storage
data, and by programming, I need always to shift this properities into
history table, whith indexes to a history table of other properities.

2. Table will be build nearly in this way:

Major device data table
ID AnyData1 AnyData2 ... AnyData3 stringProperty pointer
123 MZD12 XX77 .... any comment text A20;BU;AC40;MA50;E;E;IC;GREEN 3
124 MZD13 XY55 ... any other comment A20;BU;AC40;MA50;E;E;IC;GREEN 2

By writng into device table, there will be just a additional field for
this string, and I will have a function, which according to specified
pointer, will get me the string part on the fly, while I need it.
This will not require the other table, and will reduce the amout of
data, not a lot ... but always.
This solution, has a inconvinance, that it will be not so fast doing a
search over the part of this strings, while there will be no real index
on this.
If I woould like to search all devices, by which the curent pointer
value is equal GREEN, then I need to use function for getting the
value, and this one will be not indexed, means, by a lot amount of
data, might be slow.

I would like to know Your opinion about booth solutions.
Also, if you might point me the other problems with any of this
solution, I might not have noticed.

With Best Regards

Matik

Dec 14 '06 #1
Share this Question
Share on Google+
7 Replies


P: n/a
Matik (ma****@sauron.xo.pl) writes:
1. By writing, in SP, just encapsulate each of the position separated
by semicolon, and write into a different table with index of device,
and the position in long stirng nearly in this way:

Major device data table
ID AnyData1 AnyData2 ... AnyData3
123 MZD12 XX77 .... any comment text
124 MZD13 XY55 ... any other comment

String data Table
fk_deviceId position value
123 1 A20
123 2 BU
123 3 AC40
....
123 8 GREEN

The device table, contains also a pointer (position), which might
change, to "hglight" specified position.
This is the normal design in this situation.
Major device data table
ID AnyData1 AnyData2 ... AnyData3 stringProperty pointer
123 MZD12 XX77 .... any comment text A20;BU;AC40;MA50;E;E;IC;GREEN 3
124 MZD13 XY55 ... any other comment A20;BU;AC40;MA50;E;E;IC;GREEN 2
This design violates a basic principle in relational design: no repeating
groups.

Every rule is made to break, and I have occasionally put repeating groups in
the database I maintain, but this is a clearcut case: don't even think
about it. This sort of data is very difficult to work with in a
relational database, simply because it's not meant that you should
store data in this way.
Then, I can very easly find all necessary data. The problem is, I need
to move the device record data (from other table) very often into other
history table (by each update). That will mean, that I also need to
move all these records from 1 -8 for example to a separate history
table, holding the index for a history device dataset. This is a little
inconvinience in this, and in my opinion, it will use to much storage
data,
With a sub-table you need to repeat the ID. There will also be a cost
of two bytes for the length of each column. There is also the cost for
the field number, but since you don't have any semi-colon, this is a
net cost of one byte. There is also some overhead for each row. But
all and all, I would say that the overhead is about neglible.
and by programming, I need always to shift this properities into
history table, whith indexes to a history table of other properities.
Don't really know what you mean here.

For completeness sake I should say that there is a third alternative,
and that is one table, but eight columns. This could also be considered
a repeating group. Then again, if the different fields represents
different attributes, it isn't really an repetition. This solution
is better my opinion than a seprated list, but the pointer you talk
about may be more difficult to implement.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Dec 14 '06 #2

P: n/a
Erland Sommarskog wrote:
Matik (ma****@sauron.xo.pl) writes:
>and by programming, I need always to shift this properities into
history table, whith indexes to a history table of other properities.

Don't really know what you mean here.
Probably something along the lines of: (oversimplified for brevity)

insert into FooHistory select * from CurrentFoo
delete from CurrentFoo
Dec 15 '06 #3

P: n/a
Then maybe you could create a view for the second table.

Matik wrote:
Hi to everyone,

My problem is, that I'm not so quite sure, which way should I go.

The user is inputing by second part application a long string (let's
say 128 characters), which are separated by semiclon.
Example:

A20;BU;AC40;MA50;E;E;IC;GREEN

Now: each from this position, is already defined in any other table, as
a separate record. These are the keys lets say. It means, a have some
properities for A20, BU, aso.

Because this long inputed string, is a property of device (whih also
has a lot of different properities) I could do two different ways of
storing data:

1. By writing, in SP, just encapsulate each of the position separated
by semicolon, and write into a different table with index of device,
and the position in long stirng nearly in this way:

Major device data table
ID AnyData1 AnyData2 ... AnyData3
123 MZD12 XX77 .... any comment text
124 MZD13 XY55 ... any other comment

String data Table
fk_deviceId position value
123 1 A20
123 2 BU
123 3 AC40
....
123 8 GREEN

The device table, contains also a pointer (position), which might
change, to "hglight" specified position.

Then, I can very easly find all necessary data. The problem is, I need
to move the device record data (from other table) very often into other
history table (by each update). That will mean, that I also need to
move all these records from 1 -8 for example to a separate history
table, holding the index for a history device dataset. This is a little
inconvinience in this, and in my opinion, it will use to much storage
data, and by programming, I need always to shift this properities into
history table, whith indexes to a history table of other properities.

2. Table will be build nearly in this way:

Major device data table
ID AnyData1 AnyData2 ... AnyData3 stringProperty pointer
123 MZD12 XX77 .... any comment text A20;BU;AC40;MA50;E;E;IC;GREEN 3
124 MZD13 XY55 ... any other comment A20;BU;AC40;MA50;E;E;IC;GREEN 2

By writng into device table, there will be just a additional field for
this string, and I will have a function, which according to specified
pointer, will get me the string part on the fly, while I need it.
This will not require the other table, and will reduce the amout of
data, not a lot ... but always.
This solution, has a inconvinance, that it will be not so fast doing a
search over the part of this strings, while there will be no real index
on this.
If I woould like to search all devices, by which the curent pointer
value is equal GREEN, then I need to use function for getting the
value, and this one will be not indexed, means, by a lot amount of
data, might be slow.

I would like to know Your opinion about booth solutions.
Also, if you might point me the other problems with any of this
solution, I might not have noticed.

With Best Regards

Matik
Dec 15 '06 #4

P: n/a
First of all, thank you for your reply!

Now, some additional explenations maybe:

That was just an example, with 8 positions separated by semicolon as a
one property. The problem is, there number of this is various. That's
why, I couldyn't solve issue with fix number of column.

With shifting data into history, I've ment, that by each change of data
in primary table, whole record should be copied to the history table
(nearly same construction as primary table).
This is than an issue with the second table, storing semicolon
separated field in one column (splitted) in different table. This need
to be shifted then also, to a second historical table.

Of course, I could ommit using 'working' table, and have only history,
with inserts, and having a primary table containing a pointer to last -
newest record as my primary table, to get the newest record.
The problem is, I'm afraid a little of performance, sice there is all
other actions done on the primary table (select, searches aso.)
Having a big historical table, I will still need to get countinous
joins, to get the newest record, and even having a good indexing and
relation set up, it might be slow while table can be big.

This semicolon devided string, as example was shown pretty simmilar,
but it can be also various:

A10;B13;c20;bubu;lala;GREEN;RED
A13;BUBU;GREEN;YELLOW;mama
C25;YELLOW
BLUE;pleple;B13
aso.

The pointer I was talking about, is just a index, to which position in
this semicolon devided string, is curently activated.
Best regards

Matik

Dec 17 '06 #5

P: n/a
Matik wrote:
The problem is, I'm afraid a little of performance,
This has "premature optimization" written all over it. Build the
database cleanly first; then, if you /actually/ have performance
issues, then consider how to improve it (but breaking 1NF with "a;b;c"
type columns should still be a last resort).
Dec 17 '06 #6

P: n/a
Matik (ma****@sauron.xo.pl) writes:
With shifting data into history, I've ment, that by each change of data
in primary table, whole record should be copied to the history table
(nearly same construction as primary table).
This is than an issue with the second table, storing semicolon
separated field in one column (splitted) in different table. This need
to be shifted then also, to a second historical table.
I'm not sure that I see the problem. With a regular design, you would
have two tables for current data, and two tables for historical data.
Of course, I could ommit using 'working' table, and have only history,
with inserts, and having a primary table containing a pointer to last -
newest record as my primary table, to get the newest record.
The problem is, I'm afraid a little of performance, sice there is all
other actions done on the primary table (select, searches aso.)
Like Ed said, get the design right first, and do performance tuning
when everything else is working. But some basic ideas for performance
are good when designing for performance. For instance no repeating
groups (i.e. semicolon-separated lists.)

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Dec 17 '06 #7

P: n/a
Matik wrote:
>
Of course, I could ommit using 'working' table, and have only history,
with inserts, and having a primary table containing a pointer to last -
newest record as my primary table, to get the newest record.
The problem is, I'm afraid a little of performance, sice there is all
other actions done on the primary table (select, searches aso.)
Having a big historical table, I will still need to get countinous
joins, to get the newest record, and even having a good indexing and
relation set up, it might be slow while table can be big.
The way to optimise is with good indexes and good query design. You say
"it might be slow" so obviously you haven't reached that stage yet. On
the other hand you know for sure that a redundant copy of the data will
have an additional performance cost, both for updates and queries.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx
--

Dec 17 '06 #8

This discussion thread is closed

Replies have been disabled for this discussion.