Design/implementation question

Yves Dhondt

Hello,

I've got the following UML design :

        C
        |
A ______|______ B

So 2 objects A and B are connected through a relation C. (For example an
employment scheme : person A1 worked for company B1 at time C1, person
A1 worked for company B2 at time C2, person A2 worked for company B1 at
time C3, ...)

My current implementation consists of :

public class objectA { ... }
public class objectB { ... }

public class relationC {
    private objectA A;
    private objectB B;

    // other things about the relation
}

I keep all relationCs in one list. This does not take much memory, but
if I want to check which objectBs are connected to an objectA, or vice
versa, I need to check every relationC.

ArrayList result = new ArrayList();

foreach (relationC c in list)
{
    if (c.getA().Equals(A))
    {
        result.Add(c.getB());
    }
}

Considering the fact I have about 10000 objectA's and 25 objectB's I can
have up to 250000 relationC's to check which makes my implementation
rather slow.

I considered adding direct links from A to B :

public class objectA {
    private ArrayList objectBs;

    // ...
}

and the same the other way around but then the amount of memory used
goes up pretty fast.

I wonder if someone has an idea on how to make this faster without
needing a huge memory chunk.

TIA

Yves

Jul 21 '05 #1


Bob Grommes

Cody,

Your system clock is off by one year.

--Bob

"cody" <pl*************************@gmx.de> wrote in message
news:OJ*************@TK2MSFTNGP12.phx.gbl...

It is generally not useful to load *all* data contained in a database and
then try to make *queries* against it.
First, you have to consider what you want to do with the data.

If you want to display all companies where person A worked, run a query
that returns exactly that and display it. If you want to display all
persons that worked at a certain company, query exactly that from the
database and display it.
Do not worry too much about performance: databases are designed for just
that.

--
cody

Jul 21 '05 #2

David

On 2004-08-14, Yves Dhondt <no@privacy.net> wrote:

Hello,

I've got the following UML design :

Sorry, but you ask design questions, you get long-winded answers...

        C
        |
A ______|______ B

So 2 objects A and B are connected through a relation C. (For example an
employment scheme : person A1 worked for company B1 at time C1, person
A1 worked for company B2 at time C2, person A2 worked for company B1 at
time C3, ...)

From a design standpoint, that just seems wrong to me. In the real
world, companies have employees and people have jobs. It's hard to be
definitive here without knowing what app you're writing, but it doesn't
seem natural to me to think in terms of time spans having
company-employee relationships.

That's how relational databases think, of course, but relational
databases are pretty much antithetical to OOP design, and it's a mistake
to have the middle-tier mirror the relational design IMHO (I realize
this area of OOP design isn't really settled, of course, but that's my
opinion).

My current implementation consists of :

public class objectA { ... }
public class objectB { ... }

public class relationC {
    private objectA A;
    private objectB B;

    // other things about the relation
}

I keep all relationCs in one list. This does not take much memory, but
if I want to check which objectBs are connected to an objectA, or vice
versa, I need to check every relationC.

ArrayList result = new ArrayList();

foreach (relationC c in list)
{
    if (c.getA().Equals(A))
    {
        result.Add(c.getB());
    }
}

Considering the fact I have about 10000 objectA's and 25 objectB's I can
have up to 250000 relationC's to check which makes my implementation
rather slow.

OK, I just got flamed every which way in another thread for suggesting
this, but I'm stubborn. If a relationC object really is a key object in
your design, then groups of them shouldn't be gathering together in dumb
collections. They should belong to typed collection classes that can
answer reasonable domain-specific questions:

ArrayList results = list.GetBs(A);

Now you haven't solved the problem, but at least you've encapsulated the
search algorithm into a relationCCollection class, rather than having
the search logic spread out all over your code. At that point there's a
lot of implementation options that should come quite easily. You could
maintain an index of references to A's relations in a Hashtable, which
would make lookup very fast. If memory's the issue, you could implement
a load-on-demand scheme, where the relationCCollection pulls down the
records from the database when they're requested. You could implement a
lazy-load scheme, where records are pulled down when requested, but are
then kept in the collection.
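
A minimal sketch of the indexed variant, under a few assumptions: ObjectA, ObjectB, RelationC and RelationCCollection are hypothetical stand-ins for the classes in the question, the generic Dictionary and List are used in place of Hashtable and ArrayList, and keys are compared by reference because ObjectA does not override Equals/GetHashCode here.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins for the thread's objectA, objectB and relationC.
public class ObjectA { }
public class ObjectB { }

public class RelationC
{
    public ObjectA A { get; }
    public ObjectB B { get; }

    public RelationC(ObjectA a, ObjectB b)
    {
        A = a;
        B = b;
    }
}

// Typed collection that owns the search logic and keeps an index keyed by A.
public class RelationCCollection
{
    private readonly List<RelationC> all = new List<RelationC>();
    private readonly Dictionary<ObjectA, List<RelationC>> byA =
        new Dictionary<ObjectA, List<RelationC>>();

    public int Count => all.Count;

    public void Add(RelationC relation)
    {
        all.Add(relation);
        if (!byA.TryGetValue(relation.A, out List<RelationC> bucket))
        {
            bucket = new List<RelationC>();
            byA.Add(relation.A, bucket);
        }
        bucket.Add(relation);
    }

    // One hash lookup plus a walk over A's own relations,
    // instead of a scan over every relation in the collection.
    public List<ObjectB> GetBs(ObjectA a)
    {
        var result = new List<ObjectB>();
        if (byA.TryGetValue(a, out List<RelationC> bucket))
        {
            foreach (RelationC relation in bucket)
            {
                result.Add(relation.B);
            }
        }
        return result;
    }
}
```

Callers only ever see GetBs, so the index can later be swapped for a database call or a lazy loader without touching them.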

What I'd probably try is lazy loading the records, then keeping them in
collections of WeakReferences, and then indexing the keys that were
important to me. But it depends on what exactly you're trying to
achieve. There's always a natural trade-off between memory and
performance, so where the balance should be depends on your application.
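
Roughly what I have in mind, as a sketch only. LazyRelationIndex and the load delegate are made-up names; the delegate stands in for whatever actually fetches the B's for one A (a database query, say), and the generic WeakReference<T> is assumed to be available.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the thread's objectA and objectB.
public class ObjectA { }
public class ObjectB { }

// Lazy-loading lookup: the B's for an A are fetched on first request and then
// held only through a WeakReference, so the garbage collector may reclaim them.
public class LazyRelationIndex
{
    // Hypothetical loader supplied by the caller, e.g. a database query.
    private readonly Func<ObjectA, List<ObjectB>> load;

    private readonly Dictionary<ObjectA, WeakReference<List<ObjectB>>> cache =
        new Dictionary<ObjectA, WeakReference<List<ObjectB>>>();

    public LazyRelationIndex(Func<ObjectA, List<ObjectB>> load)
    {
        this.load = load;
    }

    public List<ObjectB> GetBs(ObjectA a)
    {
        if (cache.TryGetValue(a, out WeakReference<List<ObjectB>> weak) &&
            weak.TryGetTarget(out List<ObjectB> cached))
        {
            return cached; // loaded earlier and not yet collected
        }

        // First request, or the earlier result was collected: load again.
        List<ObjectB> loaded = load(a);
        cache[a] = new WeakReference<List<ObjectB>>(loaded);
        return loaded;
    }
}
```

The catch is that a result nobody else references can disappear at any collection, so this only saves work for data that is requested again while something still holds on to it.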

Basically, though, the trade-off comes down to this. If you want fast
lookup based on a key, you want a hash. If you want fast lookup based
on different keys, you need multiple hashes. If you don't have the
memory for multiple hashes, you need to offload the calculation (e.g.,
to the database).
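
The "multiple hashes" case, sketched with a hypothetical generic TwoWayIndex class (not something from the question): the same pairs are indexed once by A and once by B, so both directions are a single lookup, at the cost of storing every pair twice.

```csharp
using System.Collections.Generic;

// Two hashes over the same pairs: one keyed by A, one keyed by B.
public class TwoWayIndex<TA, TB>
{
    private readonly Dictionary<TA, List<TB>> byA = new Dictionary<TA, List<TB>>();
    private readonly Dictionary<TB, List<TA>> byB = new Dictionary<TB, List<TA>>();

    public void Add(TA a, TB b)
    {
        Append(byA, a, b);
        Append(byB, b, a);
    }

    public IReadOnlyList<TB> GetBs(TA a) => Find(byA, a);

    public IReadOnlyList<TA> GetAs(TB b) => Find(byB, b);

    private static void Append<TKey, TValue>(
        Dictionary<TKey, List<TValue>> map, TKey key, TValue value)
    {
        if (!map.TryGetValue(key, out List<TValue> bucket))
        {
            bucket = new List<TValue>();
            map.Add(key, bucket);
        }
        bucket.Add(value);
    }

    private static IReadOnlyList<TValue> Find<TKey, TValue>(
        Dictionary<TKey, List<TValue>> map, TKey key)
    {
        // An unknown key yields an empty list rather than null.
        return map.TryGetValue(key, out List<TValue> bucket) ? bucket : new List<TValue>();
    }
}
```

Memory grows with the number of indexes, since each relation is referenced once per hash; that is the trade-off described above.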

Jul 21 '05 #3

Nick Malik

Hi David,

This time, I understood what you are trying to say. (better coffee, more
sleep... :-)
I agree with what you said. You've clearly been thinking about this kind of
encapsulation for a while.

To Yves: you have a historical concept that is reminiscent of a data
warehouse: on a particular date, these two items were joined. Unfortunately
for you, data warehouses are large animals and require specialized
algorithms for searching and producing roll-up calculations. You don't
make it clear what you are actually doing with this data, or even if the
employee-employer statement is just an example. However, David is right to
suggest that you should not try to mirror the logic of a database engine in
your middle layer. I hate to ask this, and I mean no disrespect, but are
you aware of SQL?

You are describing a simple many-to-many relation in a relational database.

Table A: Employees
    Employee Id, Employee Name, etc

Table B: Companies
    Company Id, Company Name, etc

Table C: EmployeeCompany
    SomeUniqueKey -- GUID
    EmployeeId
    CompanyId
    StartDate
    EndDate

To find which Companies are connected to an Employee: (in Transact SQL)

Select Company.CompanyId, Company.CompanyName, EmployeeCompany.StartDate,
       EmployeeCompany.EndDate
from Company inner join EmployeeCompany on Company.CompanyId =
       EmployeeCompany.CompanyId
where EmployeeCompany.EmployeeId = @paramEmployeeId

Why not just program your "relation" object to perform the above query in
the constructor where you pass in an employee id (which is passed to TSQL in
@paramEmployeeId), place the results into your collection, and work from
there...
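
A rough sketch of running that query from C# through the provider-neutral System.Data interfaces. Several things here are assumptions rather than anything stated above: the Employment and EmploymentQuery names, integer ids, a nullable EndDate for a job that has not ended, and a static method instead of a constructor. The caller is expected to pass in an open connection.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical row type for one Company/EmployeeCompany result row.
public class Employment
{
    public int CompanyId;          // assumed to be an integer key
    public string CompanyName;
    public DateTime StartDate;
    public DateTime? EndDate;      // assumed nullable: job not ended yet
}

public static class EmploymentQuery
{
    public const string Sql =
        "select Company.CompanyId, Company.CompanyName, " +
        "EmployeeCompany.StartDate, EmployeeCompany.EndDate " +
        "from Company inner join EmployeeCompany " +
        "on Company.CompanyId = EmployeeCompany.CompanyId " +
        "where EmployeeCompany.EmployeeId = @paramEmployeeId";

    // Runs the query on an already-open connection supplied by the caller.
    public static List<Employment> ForEmployee(IDbConnection connection, int employeeId)
    {
        using (IDbCommand command = connection.CreateCommand())
        {
            command.CommandText = Sql;

            // Pass the id as a parameter instead of pasting it into the SQL text.
            IDbDataParameter parameter = command.CreateParameter();
            parameter.ParameterName = "@paramEmployeeId";
            parameter.Value = employeeId;
            command.Parameters.Add(parameter);

            using (IDataReader reader = command.ExecuteReader())
            {
                return ReadAll(reader);
            }
        }
    }

    // Copies each row into an object so the reader can be closed right away.
    public static List<Employment> ReadAll(IDataReader reader)
    {
        var rows = new List<Employment>();
        while (reader.Read())
        {
            rows.Add(new Employment
            {
                CompanyId = reader.GetInt32(0),
                CompanyName = reader.GetString(1),
                StartDate = reader.GetDateTime(2),
                EndDate = reader.IsDBNull(3) ? (DateTime?)null : reader.GetDateTime(3)
            });
        }
        return rows;
    }
}
```

Only the rows for one employee ever reach the middle tier, which is the whole point of letting the database do the search.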

The only deviation I'd suggest from David's excellent post would be this:
cache the data if it is not likely to change and you are likely to need it
again. Otherwise, destroy the resultset the moment you've answered the
original question.

I hope this helps,
--- Nick

Jul 21 '05 #4
