Opinion about design needed (splitting string data)

Matik

Hi to everyone,

My problem is, that I'm not so quite sure, which way should I go.

The user is inputing by second part application a long string (let's
say 128 characters), which are separated by semiclon.
Example:

A20;BU;AC40;MA5 0;E;E;IC;GREEN

Now: each from this position, is already defined in any other table, as
a separate record. These are the keys lets say. It means, a have some
properities for A20, BU, aso.

Because this long inputed string, is a property of device (whih also
has a lot of different properities) I could do two different ways of
storing data:

1. By writing, in SP, just encapsulate each of the position separated
by semicolon, and write into a different table with index of device,
and the position in long stirng nearly in this way:

Major device data table
ID AnyData1 AnyData2 ... AnyData3
123 MZD12 XX77 .... any comment text
124 MZD13 XY55 ... any other comment

String data Table
fk_deviceId position value
123 1 A20
123 2 BU
123 3 AC40
.....
123 8 GREEN

The device table, contains also a pointer (position), which might
change, to "hglight" specified position.

Then, I can very easly find all necessary data. The problem is, I need
to move the device record data (from other table) very often into other
history table (by each update). That will mean, that I also need to
move all these records from 1 -8 for example to a separate history
table, holding the index for a history device dataset. This is a little
inconvinience in this, and in my opinion, it will use to much storage
data, and by programming, I need always to shift this properities into
history table, whith indexes to a history table of other properities.

2. Table will be build nearly in this way:

Major device data table
ID AnyData1 AnyData2 ... AnyData3 stringProperty pointer
123 MZD12 XX77 .... any comment text A20;BU;AC40;MA5 0;E;E;IC;GREEN 3
124 MZD13 XY55 ... any other comment A20;BU;AC40;MA5 0;E;E;IC;GREEN 2

By writng into device table, there will be just a additional field for
this string, and I will have a function, which according to specified
pointer, will get me the string part on the fly, while I need it.
This will not require the other table, and will reduce the amout of
data, not a lot ... but always.
This solution, has a inconvinance, that it will be not so fast doing a
search over the part of this strings, while there will be no real index
on this.
If I woould like to search all devices, by which the curent pointer
value is equal GREEN, then I need to use function for getting the
value, and this one will be not indexed, means, by a lot amount of
data, might be slow.

I would like to know Your opinion about booth solutions.
Also, if you might point me the other problems with any of this
solution, I might not have noticed.

With Best Regards

Matik

Dec 14 '06 #1

Subscribe Reply

1740

Erland Sommarskog

Matik (ma****@sauron. xo.pl) writes:

1. By writing, in SP, just encapsulate each of the position separated
by semicolon, and write into a different table with index of device,
and the position in long stirng nearly in this way:

Major device data table
ID AnyData1 AnyData2 ... AnyData3
123 MZD12 XX77 .... any comment text
124 MZD13 XY55 ... any other comment

String data Table
fk_deviceId position value
123 1 A20
123 2 BU
123 3 AC40
....
123 8 GREEN

The device table, contains also a pointer (position), which might
change, to "hglight" specified position.

This is the normal design in this situation.

Major device data table
ID AnyData1 AnyData2 ... AnyData3 stringProperty pointer
123 MZD12 XX77 .... any comment text A20;BU;AC40;MA5 0;E;E;IC;GREEN 3
124 MZD13 XY55 ... any other comment A20;BU;AC40;MA5 0;E;E;IC;GREEN 2

This design violates a basic principle in relational design: no repeating
groups.

Every rule is made to break, and I have occasionally put repeating groups in
the database I maintain, but this is a clearcut case: don't even think
about it. This sort of data is very difficult to work with in a
relational database, simply because it's not meant that you should
store data in this way.

Then, I can very easly find all necessary data. The problem is, I need
to move the device record data (from other table) very often into other
history table (by each update). That will mean, that I also need to
move all these records from 1 -8 for example to a separate history
table, holding the index for a history device dataset. This is a little
inconvinience in this, and in my opinion, it will use to much storage
data,

With a sub-table you need to repeat the ID. There will also be a cost
of two bytes for the length of each column. There is also the cost for
the field number, but since you don't have any semi-colon, this is a
net cost of one byte. There is also some overhead for each row. But
all and all, I would say that the overhead is about neglible.

and by programming, I need always to shift this properities into
history table, whith indexes to a history table of other properities.

Don't really know what you mean here.

For completeness sake I should say that there is a third alternative,
and that is one table, but eight columns. This could also be considered
a repeating group. Then again, if the different fields represents
different attributes, it isn't really an repetition. This solution
is better my opinion than a seprated list, but the pointer you talk
about may be more difficult to implement.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx

Dec 14 '06 #2

Ed Murphy

Erland Sommarskog wrote:

Matik (ma****@sauron. xo.pl) writes:

>and by programming, I need always to shift this properities into
history table, whith indexes to a history table of other properities.

Don't really know what you mean here.

Probably something along the lines of: (oversimplified for brevity)

insert into FooHistory select * from CurrentFoo
delete from CurrentFoo

Dec 15 '06 #3

othellomy

Then maybe you could create a view for the second table.

Matik wrote:

Hi to everyone,

My problem is, that I'm not so quite sure, which way should I go.

The user is inputing by second part application a long string (let's
say 128 characters), which are separated by semiclon.
Example:

A20;BU;AC40;MA5 0;E;E;IC;GREEN

Now: each from this position, is already defined in any other table, as
a separate record. These are the keys lets say. It means, a have some
properities for A20, BU, aso.

Because this long inputed string, is a property of device (whih also
has a lot of different properities) I could do two different ways of
storing data:

1. By writing, in SP, just encapsulate each of the position separated
by semicolon, and write into a different table with index of device,
and the position in long stirng nearly in this way:

Major device data table
ID AnyData1 AnyData2 ... AnyData3
123 MZD12 XX77 .... any comment text
124 MZD13 XY55 ... any other comment

String data Table
fk_deviceId position value
123 1 A20
123 2 BU
123 3 AC40
....
123 8 GREEN

The device table, contains also a pointer (position), which might
change, to "hglight" specified position.

Then, I can very easly find all necessary data. The problem is, I need
to move the device record data (from other table) very often into other
history table (by each update). That will mean, that I also need to
move all these records from 1 -8 for example to a separate history
table, holding the index for a history device dataset. This is a little
inconvinience in this, and in my opinion, it will use to much storage
data, and by programming, I need always to shift this properities into
history table, whith indexes to a history table of other properities.

2. Table will be build nearly in this way:

Major device data table
ID AnyData1 AnyData2 ... AnyData3 stringProperty pointer
123 MZD12 XX77 .... any comment text A20;BU;AC40;MA5 0;E;E;IC;GREEN 3
124 MZD13 XY55 ... any other comment A20;BU;AC40;MA5 0;E;E;IC;GREEN 2

By writng into device table, there will be just a additional field for
this string, and I will have a function, which according to specified
pointer, will get me the string part on the fly, while I need it.
This will not require the other table, and will reduce the amout of
data, not a lot ... but always.
This solution, has a inconvinance, that it will be not so fast doing a
search over the part of this strings, while there will be no real index
on this.
If I woould like to search all devices, by which the curent pointer
value is equal GREEN, then I need to use function for getting the
value, and this one will be not indexed, means, by a lot amount of
data, might be slow.

I would like to know Your opinion about booth solutions.
Also, if you might point me the other problems with any of this
solution, I might not have noticed.

With Best Regards

Matik

Dec 15 '06 #4

Matik

First of all, thank you for your reply!

Now, some additional explenations maybe:

That was just an example, with 8 positions separated by semicolon as a
one property. The problem is, there number of this is various. That's
why, I couldyn't solve issue with fix number of column.

With shifting data into history, I've ment, that by each change of data
in primary table, whole record should be copied to the history table
(nearly same construction as primary table).
This is than an issue with the second table, storing semicolon
separated field in one column (splitted) in different table. This need
to be shifted then also, to a second historical table.

Of course, I could ommit using 'working' table, and have only history,
with inserts, and having a primary table containing a pointer to last -
newest record as my primary table, to get the newest record.
The problem is, I'm afraid a little of performance, sice there is all
other actions done on the primary table (select, searches aso.)
Having a big historical table, I will still need to get countinous
joins, to get the newest record, and even having a good indexing and
relation set up, it might be slow while table can be big.

This semicolon devided string, as example was shown pretty simmilar,
but it can be also various:

A10;B13;c20;bub u;lala;GREEN;RE D
A13;BUBU;GREEN; YELLOW;mama
C25;YELLOW
BLUE;pleple;B13
aso.

The pointer I was talking about, is just a index, to which position in
this semicolon devided string, is curently activated.
Best regards

Matik

Dec 17 '06 #5

Ed Murphy

Matik wrote:

The problem is, I'm afraid a little of performance,

This has "premature optimization" written all over it. Build the
database cleanly first; then, if you /actually/ have performance
issues, then consider how to improve it (but breaking 1NF with "a;b;c"
type columns should still be a last resort).

Dec 17 '06 #6

Erland Sommarskog

Matik (ma****@sauron. xo.pl) writes:

With shifting data into history, I've ment, that by each change of data
in primary table, whole record should be copied to the history table
(nearly same construction as primary table).
This is than an issue with the second table, storing semicolon
separated field in one column (splitted) in different table. This need
to be shifted then also, to a second historical table.

I'm not sure that I see the problem. With a regular design, you would
have two tables for current data, and two tables for historical data.

Of course, I could ommit using 'working' table, and have only history,
with inserts, and having a primary table containing a pointer to last -
newest record as my primary table, to get the newest record.
The problem is, I'm afraid a little of performance, sice there is all
other actions done on the primary table (select, searches aso.)

Like Ed said, get the design right first, and do performance tuning
when everything else is working. But some basic ideas for performance
are good when designing for performance. For instance no repeating
groups (i.e. semicolon-separated lists.)

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx

Dec 17 '06 #7

David Portas

Matik wrote:

>
Of course, I could ommit using 'working' table, and have only history,
with inserts, and having a primary table containing a pointer to last -
newest record as my primary table, to get the newest record.
The problem is, I'm afraid a little of performance, sice there is all
other actions done on the primary table (select, searches aso.)
Having a big historical table, I will still need to get countinous
joins, to get the newest record, and even having a good indexing and
relation set up, it might be slow while table can be big.

The way to optimise is with good indexes and good query design. You say
"it might be slow" so obviously you haven't reached that stage yet. On
the other hand you know for sure that a redundant copy of the data will
have an additional performance cost, both for updates and queries.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/library/m...S,SQL.90).aspx
--

Dec 17 '06 #8

Similar topics

1663

Want expert opinion on a way of storing 'relations'

by: ViperDK \(Daniel K.\) | last post by:

i've a database where relations are hold in a special way which the project leaders think of as "performant and uncomplicated" but which is very questionable to me: ------------------------------------------------ Table Fields and Table Fields object stuff.... and (varchar with 0-20 ids usually)

Microsoft SQL Server

4259

Design Changes To Objects In Shared MultiUser Database

by: Mike Wiseley | last post by:

We recently converted our department wide shared Access97 database to Access2K. We used to be able to open various reports in design mode and make changes to the design (or create new reports) even while other users in this shared database might be using other, unrelated reports. Now, in Access2k, it appears that whenever you do not have exclusive access, you cannot make any design changes to reports. It seems that the presence of any...

Microsoft Access / VBA

2038

Events and Tree like collections :: Would love to hear you opinion!

by: Sasha | last post by:

Hi everyone, Here is my problem: I have the following classes: - DataNode - this class is designed to hold some data and will be contained in a tree like data structure DataTree. When DataNode is changed, it raises "Changed" event. The class has a reference to the DataNode it is being contained in: - DataTree - tree like data structure that contains DataNodes; When

C# / C Sharp

1822

class design quandary

by: Baris | last post by:

Given the C# code below, can anyone think of a better class design? What really gets my goat is that the code within derived classes D1 and D2 is identical and in my mind should be refactored into a base class, but how can I move this particular 'shared functionality' into a base class? If anyone can suggest a better newsgroup for this, such as one focussed primarily on C# class design, please let me know.

C# / C Sharp

1864

Questions on application design

by: Matthew Hood | last post by:

My company has expressed a desire to convert an existing MS Access application to a full VB.NET application. My experience is with VB6 so I want to ask a few questions and get some input on the best way to handle the following design: Situation: We want to allow our customers to install with one of the following options: 1. Use an Access MDB file as the data backend. 2. Use a SQL Server backend. (Either MS SQL, or MySQL) 3. Use an...

Visual Basic .NET

408

Design Question

by: Suki | last post by:

Hi all, I'm facing a rather strange design problem. Let me explain it. I'm writing an application, which deals with strings extensively. So decided to write a new string class (CXString say) for my application. But majority of its member functions are the same as the ones in std::string class. So I've decided to derive CXString from std::string class. No problem till now. But now i'm facing with the problem of invoking CXString

C / C++

2714

Crud Design Question

by: John Kraft | last post by:

Friends, I'm working on some crud stuff, and I was looking for opinions on the subject. Below, I have pasted some VERY simple sample code. Class2 is a "traditional" crud type object. In a real example, these objects would probably implement some kind of ICrud interface with the common methods. I'm finding that many times, though, I want to use these objects as simple data objects, and I don't need all the database functionallity.

C# / C Sharp

1482

Object Oriented Design Contest

by: virtualadepts | last post by:

I have code here that explains my object oriented design model. I've been reading about other design models from what is documented on wikipedia about the key book on the subject: http://en.wikipedia.org/wiki/Design_patterns All of the models look fun and interesting but they are very specialized. I find myself writing bloated C++ code anyway, and not finding a way to use the basic design models. So I've invented my own. I call it...

C / C++

3273

Splitting function

by: shadow_ | last post by:

Hi i m new at C and trying to write a parser and a string class. Basicly program will read data from file and splits it into lines then lines to words. i used strtok function for splitting data to lines it worked quite well but srttok isnot working for multiple blank or commas. Can strtok do this kind of splitting if it cant what should i use . Unal

C / C++

10364

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...

C / C++

10172

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...

Online Marketing

10110

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...

Windows Server

8993

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...

Career Advice

7517

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...

Microsoft Access / VBA

5398

Trying to create a lan-to-lan vpn between two differents networks

by: TSSRALBI | last post by:

Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...

Networking - Hardware / Configuration

5536

Windows Forms - .Net 8.0

by: adsilva | last post by:

A Windows Forms form does not have the event Unload, like VB6. What one acts like?

Visual Basic .NET

3670

How to add payments to a PHP MySQL app.

by: muto222 | last post by:

How can i add a mobile payment intergratation into php mysql website.

PHP

2894

Comprehensive Guide to Website Development in Toronto: Expert Insights from BSMN Consultancy

by: bsmnconsultancy | last post by:

In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

General