Bytes | Software Development & Data Engineering Community

Data objects vs. speed

I have to create a big web application and I was thinking of using a
data layer. For each entity in the database, I'll define a class that
maps the table structure, having sub-objects for each foreign key,
having insert/delete/update methods, the usual deal. Yet I am very
concerned about performance. For example, there are lots of cases when
I may need just the employee name. Yet using this model, I will have to
instantiate an entire Employee class, which may have sub-objects and
use lots of SELECT queries. I think this will be a significant
performance hit. What do you think? Have you ever done something like
this? Is it worth it? Is there a better way? Should I give up on
creating the data layer?

Feb 19 '07 #1
I have just implemented something similar.

You can structure the SELECTs in such a way that they are only run if
the relevant information is requested from the class via a get method.
For instance, if any of the methods requesting user information are
called, a SELECT is run which gets all user information (since all the
info is on the same row, you might as well get it all). Any further
requests are then served from this cached info rather than triggering
another SELECT.

However, I share your concern about having to instantiate a large class
for even simple queries. I am a newbie at PHP (C++ coder by trade) and
am only now reading about the performance overhead of function calls
and the fact that PHP has to parse the entire class code.

Is it worth the trade-off? I don't know. It has certainly made the
coding a whole lot easier and more structured. I am now implementing
caching of certain data in the session, and this is a snap since
everything is accessed via the same class.

I will read this thread with interest.
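The lazy-SELECT-plus-cache idea described above can be sketched roughly like this. This is a minimal illustration, not anyone's production code: the class and fields are hypothetical, and a callable stands in for the single "SELECT * ... WHERE id = ?" a real version would run via PDO.

```php
<?php
// Sketch of the lazy-SELECT pattern: the first get*() call runs one
// query for the whole row; later calls return cached values.
class LazyUser
{
    private $id;
    private $row = null;        // cached row, filled on first access
    private $fetchRow;          // callable simulating the SELECT
    public $selectCount = 0;    // how many times the "query" ran

    public function __construct($id, callable $fetchRow)
    {
        $this->id = $id;
        $this->fetchRow = $fetchRow;
    }

    private function load()
    {
        if ($this->row === null) {   // only the first getter hits the DB
            $this->row = call_user_func($this->fetchRow, $this->id);
            $this->selectCount++;
        }
    }

    public function getName()  { $this->load(); return $this->row['name']; }
    public function getEmail() { $this->load(); return $this->row['email']; }
}

// Two getters are called, but the "SELECT" runs only once.
$u = new LazyUser(7, function ($id) {
    return array('name' => 'Ann', 'email' => 'ann@example.com');
});
echo $u->getName(), ' ', $u->getEmail(), ' queries: ', $u->selectCount, "\n";
```

Any further get*() calls are served from the cached row, which is exactly the "cached info" the post describes.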


Feb 19 '07 #2
rp******@yahoo.com wrote:
I have to create a big web application and I was thinking of using a
data layer. For each entity in the database, I'll define a class that
maps the table structure, having sub-objects for each foreign key,
having insert/delete/update methods, the usual deal. Yet I am very
concerned about performance. For example, there are lots of cases when
I may need just the employee name. Yet using this model, I will have to
instantiate an entire Employee class, which may have sub-objects and
use lots of SELECT queries. I think this will be a significant
performance hit. What do you think? Have you ever done something like
this? Is it worth it? Is there a better way? Should I give up on
creating the data layer?
I do objects in all my more complex sites. But I generally use business
objects - ones related to the job at hand. Any business object may be
related to a few columns in one table, most columns across multiple
tables, or anything in between.

And I may even use the same object, but have different fetch methods
which retrieve only partial data.

For instance, if I have:

table student
id int
name char(30)
other stuff

table class
id int
name char
instructor_id int
other stuff

table class_student
class_id int
student_id int

I might have a class which will retrieve all the student names, the
instructor name, the day(s) and start time(s) for a single class, the
room in which it will be held, etc. Alternatively, it might only
retrieve a count of the number of students in the class.

If the data is getting more complex, I may add a pure data layer under
the business object layer, where I do have one table per object. But
this layer will also have multiple functions to allow me to retrieve
what I need, when I need it.
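Using the student/class schema above, the two levels of fetch could look something like this. This is only a sketch: the class name and methods are hypothetical, and each method just builds and returns its SQL so the difference in cost is visible without a live database.

```php
<?php
// Sketch of a business object with fetch methods of different weight,
// against the student / class / class_student schema in the post.
class ClassRoster
{
    private $classId;

    public function __construct($classId) { $this->classId = $classId; }

    // Heavier fetch: all student names for this class, via the join table.
    public function sqlForStudentNames()
    {
        return 'SELECT s.name FROM student s '
             . 'JOIN class_student cs ON cs.student_id = s.id '
             . 'WHERE cs.class_id = ' . (int)$this->classId;
    }

    // Cheap fetch: only a count, no row data at all.
    public function sqlForStudentCount()
    {
        return 'SELECT COUNT(*) FROM class_student '
             . 'WHERE class_id = ' . (int)$this->classId;
    }
}

$r = new ClassRoster(12);
echo $r->sqlForStudentCount(), "\n";
```

The caller picks the method that matches what it actually needs, so a page that only shows "12 students enrolled" never pays for the full roster.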

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Feb 19 '07 #3
lister wrote:
I have just implemented something similar.

You can structure the SELECTs in such a way that they are only run if
the relevant information is requested from the class via a get method.
For instance, if any of the methods requesting user information are
called, a SELECT is run which gets all user information (since all the
info is on the same row, you might as well get it all). Any further
requests are then served from this cached info rather than triggering
another SELECT.

However, I share your concern about having to instantiate a large class
for even simple queries. I am a newbie at PHP (C++ coder by trade) and
am only now reading about the performance overhead of function calls
and the fact that PHP has to parse the entire class code.

Is it worth the trade-off? I don't know. It has certainly made the
coding a whole lot easier and more structured. I am now implementing
caching of certain data in the session, and this is a snap since
everything is accessed via the same class.

I will read this thread with interest.

In a big class where you need lots of data, this would be a performance
disaster. Some of my tables have 40+ columns - and making 40+ SELECT
statements, each fetching one item, is a tremendous waste of time.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Feb 19 '07 #4
In a big class where you need lots of data, this would be a performance
disaster. Some of my tables have 40+ columns - and making 40+ SELECT
statements, each fetching one item, is a tremendous waste of time.
I think you misunderstand me. There is only one select statement per
table. Yes, I could have 40 columns, and 40 different get*() methods,
but only the first call to one of those methods would result in a
SELECT *. Then all 40 pieces of information would be populated in the
class, and any subsequent get*() calls would simply return the value.

Feb 19 '07 #5
lister wrote:
>In a big class where you need lots of data, this would be a performance
disaster. Some of my tables have 40+ columns - and making 40+ SELECT
statements, each fetching one item, is a tremendous waste of time.

I think you misunderstand me. There is only one select statement per
table. Yes, I could have 40 columns, and 40 different get*() methods,
but only the first call to one of those methods would result in a
SELECT *. Then all 40 pieces of information would be populated in the
class, and any subsequent get*() calls would simply return the value.
OK, that's better. But it's also a lot of unnecessary overhead to fetch
40 columns if you only need 3.

Additionally, you should never use SELECT * - put in the names of the
columns instead. For instance, what happens if someone adds a 10 MB
BLOB column to your table and you SELECT *?

As I indicated above, a better compromise between getting everything and
getting just one field is to have more than one fetch method to allow
fetching of multiple columns.

For instance, for one customer I have one table (companies they sell
to) with 42 columns (and yes, it is normalized). One fetch might get
the company name and id. Another might get this plus address and phone
info. A third might get all of this plus current financial info.

And since this is a business object, I could have a method which fetches
outstanding transactions. Another can fetch all transactions. These
and others like them link to other tables - but that is transparent to
the program itself.
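The tiered-fetch idea above can be sketched as one method with an explicit column list per level of detail, never SELECT *. The class, tier names, and columns here are illustrative, not the poster's real schema; the method returns the SQL so the idea is visible without a database.

```php
<?php
// Sketch of tiered fetches: each tier names exactly the columns it
// needs, so nothing fetches all 42 columns by accident.
class CompanyGateway
{
    private static $tiers = array(
        'basic'   => array('id', 'name'),
        'contact' => array('id', 'name', 'address', 'phone'),
        'full'    => array('id', 'name', 'address', 'phone', 'balance'),
    );

    public function sqlFor($tier, $id)
    {
        $cols = implode(', ', self::$tiers[$tier]);
        return "SELECT $cols FROM companies WHERE id = " . (int)$id;
    }
}

$g = new CompanyGateway();
echo $g->sqlFor('basic', 42), "\n";
```

A caller that only needs a name asks for the 'basic' tier; a billing page asks for 'full'. Adding a BLOB column later costs nothing unless a tier explicitly lists it.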

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Feb 19 '07 #6
rpsetzer wrote:
Yet, I am very concerned about performance. For example, there are lots
of cases when I may just be needing the employee name.
The approach I'm using in a current project is along these lines:

public function load_from_database($keys) {...}
public function load_from_array($vals) {...}

Such that the class can be populated from either the database, or from an
array. It can be partially populated, so that, for example, you can load
the Employee ID and Employee Name from an array like so:

$e = new Employee();
$e->load_from_array(array('id' => 1, 'name' => 'Joe'));

where the other fields like address, telephone number, etc would remain
blank within the object, but if they were asked for:

print $e->get('phone');

then the Employee object is smart enough to grab all the missing data from
the database and return the appropriate value.

So for example:

$db = new PDO(...);
foreach ($db->query('SELECT id, name FROM employees') as $row)
{
    $e = new Employee();
    $e->load_from_array($row);
    print $e->get('name');
    if ($e->get('name') == 'Joe')
        print ' ' . $e->get('phone');
    print "\n";
}

will work as expected, printing a list of employees, including phone
numbers for employees called Joe. Yes, that's right -- even when the
initial query didn't select phone numbers!

When you perform the get('phone') call, $e realises that 'phone' is NULL,
so performs its own call to load_from_database, which runs 'SELECT * FROM
employees WHERE id={$this->id}' and then feeds the results into
load_from_array.
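One possible shape for that get() fallback is sketched below. This is a guess at the mechanism described, not Toby's actual class: a callable stands in for the real "SELECT * FROM employees WHERE id = ..." query.

```php
<?php
// Sketch of a partially-populated object that completes itself on
// demand: get() falls back to a full load keyed on id when the
// requested field was never populated.
class Employee
{
    private $fields = array();
    private $loadRow;   // callable simulating SELECT * ... WHERE id = ?

    public function __construct(callable $loadRow) { $this->loadRow = $loadRow; }

    public function load_from_array(array $vals)
    {
        foreach ($vals as $k => $v) { $this->fields[$k] = $v; }
    }

    public function get($field)
    {
        if (!array_key_exists($field, $this->fields)) {
            // Missing field: complete the object from the "database".
            $this->load_from_array(
                call_user_func($this->loadRow, $this->fields['id']));
        }
        return $this->fields[$field];
    }
}

$e = new Employee(function ($id) {
    return array('id' => $id, 'name' => 'Joe', 'phone' => '555-0100');
});
$e->load_from_array(array('id' => 1, 'name' => 'Joe'));
echo $e->get('name'), ' ', $e->get('phone'), "\n";   // phone triggers the lazy load
```

The cheap list query stays cheap, and the expensive full load only happens for the rows where extra fields are actually read.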

--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Geek of ~ HTML/SQL/Perl/PHP/Python*/Apache/Linux

* = I'm getting there!
Feb 19 '07 #7
Hi,

Of course it depends on what you need to do and how much processing
power you have at hand. For some purposes using a relational database
may itself be a bad idea. But for most apps data objects can work fine,
if you implement caching. When I developed the first version of
phpPeanuts I wanted it to be the simplest thing that could possibly
work, so I did not include caching. But when I tested my first serious
app with it, caching was the first thing to be added. Without it the
app simply did not perform. With caching the app only fired 1/4th to
1/10th of the queries, which solved the performance problem. I
implemented caching in the framework and I have not yet had to develop
an app for which it was not sufficient.

To implement caching, each object type needs to have a key. For each
object that is retrieved from the database a reference* is added to the
cache, using its key as the key in an associative array. Associative
arrays are great for caching because they use hashed lookup to quickly
retrieve objects. With hashed lookup there is very little searching,
and the searching does not increase substantially with large arrays.

When navigating over a 1-n relationship in the '1' direction you have
the (now foreign) key, so you first look in the cache to see if the
object has already been loaded. If it has not, you load it, cache it
and return a reference. This mechanism is so fast that there is no need
to store the reference to a related object in a member variable of the
relating object. This saves you a big headache with circular references
that tend to make PHP unstable.

When navigating over a 1-n relationship in the 'n' direction the cache
is of little help**. Therefore it is generally a good idea to store each
array of objects you retrieve in a member variable of the relating
object.
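The keyed cache described above is essentially what is now called an identity map. A minimal sketch (class and names hypothetical, with a callable standing in for the database load):

```php
<?php
// Sketch of an identity-map cache: objects are kept in an associative
// array keyed by type and id, so foreign-key navigation checks the
// cache before hitting the database.
class ObjectCache
{
    private $cache = array();   // type => array(id => object)
    public $loads = 0;          // counts real database loads

    public function get($type, $id, callable $loader)
    {
        if (!isset($this->cache[$type][$id])) {
            // Cache miss: this is the only path that reaches the database.
            $this->cache[$type][$id] = call_user_func($loader, $id);
            $this->loads++;
        }
        return $this->cache[$type][$id];
    }
}

$cache  = new ObjectCache();
$loader = function ($id) { return (object) array('id' => $id); };

$a = $cache->get('employee', 3, $loader);
$b = $cache->get('employee', 3, $loader);   // cache hit: same object back
```

Because both lookups return the same object instance, every part of the app navigating to employee 3 sees the same state, and the query count drops the way the post describes.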

BTW, you do not have to implement insert/delete/update methods for each
object type; with metadata you can implement generic insert, delete and
update methods, so you only need three methods however many types you
make. Or you could download a framework that does all this ;-).
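The metadata-driven generic methods could look like the following sketch for INSERT. The helper is hypothetical, and a real version would use prepared statements rather than quoting values by hand; the point is only that one builder serves every table.

```php
<?php
// Sketch of a generic, metadata-driven insert: one function works for
// every table, driven only by the table name and a column => value map.
function buildInsert($table, array $row)
{
    $cols = implode(', ', array_keys($row));
    $vals = implode(', ', array_map(function ($v) {
        // Naive quoting for the sketch; use PDO::prepare in real code.
        return is_int($v) ? $v : "'" . addslashes($v) . "'";
    }, array_values($row)));
    return "INSERT INTO $table ($cols) VALUES ($vals)";
}

echo buildInsert('employee', array('id' => 1, 'name' => 'Joe')), "\n";
```

Generic UPDATE and DELETE builders follow the same pattern, which is how three methods can cover any number of object types.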

Greetings,

Henk Verhoeven,
www.phpPeanuts.org.

BTW, phpPeanuts does not work if you need arbitrary keys. It always
needs 'id' to be the key. Simplest thing...

* In PHP 5 it is not necessary to think about references, but in PHP 4
you may get a substantial performance gain by using variable references
here.
** However, if an object is stored in multiple tables but you do not
know in advance in which ones (as with polymorphism) the cache may save
you an extra query to retrieve additional data.
Feb 20 '07 #8

<rp******@yahoo.com> wrote in message
news:11**********************@v33g2000cwv.googlegroups.com...
>I have to create a big web application and I was thinking of using a
data layer. For each entity in the database, I'll define a class that
maps the table structure, having sub-objects for each foreign key,
having insert/delete/update methods, the usual deal. Yet I am very
concerned about performance. For example, there are lots of cases when
I may need just the employee name. Yet using this model, I will have to
instantiate an entire Employee class, which may have sub-objects and
use lots of SELECT queries. I think this will be a significant
performance hit. What do you think? Have you ever done something like
this? Is it worth it? Is there a better way? Should I give up on
creating the data layer?
Having a single class for each database table is a good idea - I have been
using it for years. The notion of having a subclass for each foreign key is
a definite no-no. It is totally unnecessary and a waste of time. Nor do you
need a separate class method for each possible SELECT ... WHERE ... as it is
possible to have a generic getData($where) method where the $where argument
is a string which is provided at runtime. This can cover all eventualities.

It does not matter that you have to instantiate the EMPLOYEE class
containing 40 columns if you only want a single column. In my methodology
the default is SELECT *, but a specific list of column names can be provided
for individual queries if required. This removes the need for a different
class method for each combination of SELECT ... and WHERE ...

One thing that you should notice is that 95% of the code in a database
table class is common to all database tables, therefore this common
code can be put into a superclass and inherited by individual table
classes. This means that each individual table class need contain no
more than the following:
(1) database engine name
(2) database name
(3) table name
(4) database structure (column names, primary key names, candidate key
names, relationships with other tables)
(5) custom business rules

If you are really clever you can put your database APIs in a separate class
so that you can switch database engines (MySQL, PostgreSQL or Oracle) by
changing a single line of code.
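A minimal sketch of that superclass split follows. The names are hypothetical, buildSelect() returns the SQL so the idea is visible without a database, and a real version would sanitize or parameterize $where instead of concatenating it.

```php
<?php
// Sketch of the table-superclass idea: the generic SELECT logic lives
// once in the parent; each table class declares only its metadata.
abstract class Table
{
    protected $tableName;
    protected $columns = array();

    // Generic getData-style builder: optional WHERE string, optional
    // column list (defaults to all declared columns, never SELECT *).
    public function buildSelect($where = '', array $cols = array())
    {
        $list = $cols ? implode(', ', $cols) : implode(', ', $this->columns);
        $sql  = "SELECT $list FROM {$this->tableName}";
        if ($where !== '') {
            $sql .= " WHERE $where";
        }
        return $sql;
    }
}

// An individual table class is nothing but metadata.
class EmployeeTable extends Table
{
    protected $tableName = 'employee';
    protected $columns   = array('id', 'name', 'phone', 'dept_id');
}

$t = new EmployeeTable();
echo $t->buildSelect('dept_id = 5', array('id', 'name')), "\n";
```

Swapping database engines then only means giving the superclass a different API object, which is the single-line change the post mentions.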

The bottom line is that if you use OO classes intelligently you can
save a lot of repetitive coding, but don't waste time trying to create
a complex class hierarchy - it just ain't worth it.

--
Tony Marston
http://www.tonymarston.net
http://www.radicore.org
Feb 20 '07 #9

This thread has been closed and replies have been disabled. Please start a new discussion.
