By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
440,569 Members | 1,350 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 440,569 IT Pros & Developers. It's quick & easy.

PhP database design question

P: n/a
I have built a web portal and was considering adding a 'user mail'
feature for users to message each other. I'm a bit uncertain on how to
design the DB, so was wondering if there is a standard to do this. I
was figuring perhaps a table storing all user messages (but this might
become very large very fast?) or would a seperate db storing
sent/saved/recieved etc messages be better?

Any advice would be appreciated,
Chris
Ps I'm unsure if this is the right group, but couldn;t find anything
concerning php and Dbs which seemed more appropriate.

Mar 29 '06 #1
Share this Question
Share on Google+
10 Replies


P: n/a

<ch***********@gmail.com> wrote in message
news:11**********************@t31g2000cwb.googlegr oups.com...
I have built a web portal and was considering adding a 'user mail'
feature for users to message each other. I'm a bit uncertain on how to
design the DB, so was wondering if there is a standard to do this. I
was figuring perhaps a table storing all user messages (but this might
become very large very fast?) or would a seperate db storing
sent/saved/recieved etc messages be better?


Yes, a separate Table for messages sent,saved,received could give better
results
than a whole table with all the messages, expecially with tons of users'
messages.

If users messages are really more than expected think about upgrading
hardware and doing load balancing.
Regards
--
Leonardo Armando Iarrusso - J2Be
www: http://www.J2be.com - e-mail: info[at]J2Be.com
Mar 29 '06 #2

P: n/a
ch***********@gmail.com wrote:
I have built a web portal and was considering adding a 'user mail'
feature for users to message each other. I'm a bit uncertain on how to
design the DB, so was wondering if there is a standard to do this. I
was figuring perhaps a table storing all user messages (but this might
become very large very fast?) or would a seperate db storing
sent/saved/recieved etc messages be better?

Any advice would be appreciated,
Chris
Ps I'm unsure if this is the right group, but couldn;t find anything
concerning php and Dbs which seemed more appropriate.


Hi, Chris,

First of all, no, this isn't the "wrong group" for the question. I don't know
what a "right group" would be :-).

There's no real standard for designing the DB for something like this. It's up
to you how to do it.

If I were doing it, I would keep the body of the message in a separate table,
something like:

First table:
msgid
from
to
date
subject
status (sent/received/saved)

Second table:
msgid
msgtext

The reason for keeping the message itself separate is speed. As you note, the
file could become very big. You don't want the messages every time you scan -
for instance, when you display the list of messages available to the user.
Keeping the text in a separate table will speed up these requests, at least with
MySQL. Of course, displaying the actual message will be slightly slower, but
you won't be doing that as much as just displaying to/from/subject.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Mar 29 '06 #3

P: n/a
Jerry Stuckle wrote:
ch***********@gmail.com wrote:
I have built a web portal and was considering adding a 'user mail'
feature for users to message each other. I'm a bit uncertain on how to
design the DB, so was wondering if there is a standard to do this. I
was figuring perhaps a table storing all user messages (but this might
become very large very fast?) or would a seperate db storing


[snip]

If I were doing it, I would keep the body of the message in a separate
table, something like:

First table:
msgid
from
to
date
subject
status (sent/received/saved)

Second table:
msgid
msgtext

The reason for keeping the message itself separate is speed. As you
note, the file could become very big. You don't want the messages every
time you scan - for instance, when you display the list of messages
available to the user. Keeping the text in a separate table will speed
up these requests, at least with MySQL. Of course, displaying the
actual message will be slightly slower, but you won't be doing that as
much as just displaying to/from/subject.


Jerry, can you elaborate on the field msgtext slowing down MySQL ?

Say for inbox retrieval for user "johndoe" we use:

SELECT msgid,from,to,date,subject,status
FROM messages
WHERE to='johndoe'
instead of

SELECT *
FROM messages
WHERE to='johndoe'

(It will be more appropriate to use user's id for "from" and "to"
instead of name. Also one would want to use slightly different field
names because some of the suggested ones are SQL keywords).

I am assuming indexed "from" and "to" queries, so db engine will most of
the time work on these indexes instead of crunching through the table.
Indexes will slow down the INSERT but that will be used less often than
SELECT.

In my opinion the field "msgtext" should not load the database if it is
not required in the query, but I am by no means a database expert. Maybe
an idea for quick experiment :)

Roman
Mar 29 '06 #4

P: n/a
Roman Ziak wrote:
Jerry Stuckle wrote:
ch***********@gmail.com wrote:
I have built a web portal and was considering adding a 'user mail'
feature for users to message each other. I'm a bit uncertain on how to
design the DB, so was wondering if there is a standard to do this. I
was figuring perhaps a table storing all user messages (but this might
become very large very fast?) or would a seperate db storing


[snip]

If I were doing it, I would keep the body of the message in a separate
table, something like:

First table:
msgid
from
to
date
subject
status (sent/received/saved)

Second table:
msgid
msgtext

The reason for keeping the message itself separate is speed. As you
note, the file could become very big. You don't want the messages every
time you scan - for instance, when you display the list of messages
available to the user. Keeping the text in a separate table will speed
up these requests, at least with MySQL. Of course, displaying the
actual message will be slightly slower, but you won't be doing that as
much as just displaying to/from/subject.

Jerry, can you elaborate on the field msgtext slowing down MySQL ?

Say for inbox retrieval for user "johndoe" we use:

SELECT msgid,from,to,date,subject,status
FROM messages
WHERE to='johndoe'
instead of

SELECT *
FROM messages
WHERE to='johndoe'

(It will be more appropriate to use user's id for "from" and "to"
instead of name. Also one would want to use slightly different field
names because some of the suggested ones are SQL keywords).

I am assuming indexed "from" and "to" queries, so db engine will most of
the time work on these indexes instead of crunching through the table.
Indexes will slow down the INSERT but that will be used less often than
SELECT.

In my opinion the field "msgtext" should not load the database if it is
not required in the query, but I am by no means a database expert. Maybe
an idea for quick experiment :)

Roman


Roman,

But it WILL slow down the database.

Everything on one table is typically kept in a single file. So, let's say
you're going to list the received messages for 'johndoe'. Are you going to
display all the message text with each one? Typically they display a header -
only to, from, subject and date sent (and maybe a message number). A query
would look like:

SELECT from, subject, sentdate
FROM messages
WHERE to = 'johndoe';

Now - it can use an index to locate the records where to = 'johndoe'. No
problem there. However, it must still read the table to get subject and sent date.

If the message text is in the same table, it still has to read the message text
(or at least part of it), even though it wasn't requested. And there is more
data to buffer, making it less likely that the next message is in the same
physical block of data.

OTOH, if the message text is in a different table, it doesn't have to be read
with the rest of the header information. There is a greater likelihood that the
next message is in the same buffer. And more message headers can be buffered in
the same amount of memory.

Of course, there will be slightly more overhead when you fetch the text of the
message because you now have to join two tables. But in a typical system that's
not done nearly as often as displaying a list of messages.

Just because you're not returning all the data doesn't mean there isn't
additional overhead.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Mar 29 '06 #5

P: n/a
Jerry Stuckle wrote:
That should be:

SELECT sentfrom, subject, sentdate
FROM messages
WHERE sentto = 'johndoe';

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Mar 29 '06 #6

P: n/a
NC
J2be wrote:
<ch***********@gmail.com> wrote in message
news:11**********************@t31g2000cwb.googlegr oups.com...
I have built a web portal and was considering adding a 'user mail'
feature for users to message each other. I'm a bit uncertain on how to
design the DB, so was wondering if there is a standard to do this. I
was figuring perhaps a table storing all user messages (but this might
become very large very fast?) or would a seperate db storing
sent/saved/recieved etc messages be better?


Yes, a separate Table for messages sent,saved,received could give
better results than a whole table with all the messages,


I respectfully disagree. First, this would imply that a message sent
by user A to user B will have to exist in two instances (in the `sent`
table for user A and in the `received` table for user B). If a user
broadcasts a message to 100 other users, you'll have to store 101
copies of that message... Second, changing message status, say, from
received to saved would require an INSERT INTO `saved` and a DELETE
FROM `received` (probably inside a transaction, just to make sure we
don't end up with twins), as opposed to something much more mundane,
such as UPDATE `messages` SET status='saved'.

Personally, I like Jerry Stuckle's suggestion:

First table:
msgid
from
to
date
subject
status (sent/received/saved)

Second table:
msgid
msgtext

This is a good architecture, which may require a little tweaking if
users are allowed to delete their messages and/or send messages to
multiple users.

Cheers,
NC

Mar 29 '06 #7

P: n/a
On Wed, 29 Mar 2006 09:53:54 -0800, NC wrote:
Yes, a separate Table for messages sent,saved,received could give better
results than a whole table with all the messages,


I respectfully disagree. First, this would imply that a message sent by
user A to user B will have to exist in two instances (in the `sent` table
for user A and in the `received` table for user B). If a user broadcasts
a message to 100 other users, you'll have to store 101 copies of that
message...


OK, if you only have one table, how do you cope with the situation above
when 50 people delete it from their inbox, but the others want to keep the
message.

In a logical sense it absolutely does exist in each person's inbox/sent
items as each person could apply their own choice of actions to it
(deleting it is one, it could also be marked that you've replied and when
or flagged for follow-up or marked as spam) - these aren't global for
every copy of the message.

Cheers,
Andy

--
Andy Jeffries MBCS CITP ZCE | gPHPEdit Lead Developer
http://www.gphpedit.org | PHP editor for Gnome 2
http://www.andyjeffries.co.uk | Personal site and photos

Mar 29 '06 #8

P: n/a
NC
Andy Jeffries wrote:
On Wed, 29 Mar 2006 09:53:54 -0800, NC wrote:
Yes, a separate Table for messages sent,saved,received could give better
results than a whole table with all the messages,


I respectfully disagree. First, this would imply that a message sent by
user A to user B will have to exist in two instances (in the `sent` table
for user A and in the `received` table for user B). If a user broadcasts
a message to 100 other users, you'll have to store 101 copies of that
message...


OK, if you only have one table, how do you cope with the situation above
when 50 people delete it from their inbox, but the others want to keep the
message.


Simple, really; per Jerry Stuckle's suggestion, let's adopt the
following data model with a few tweaks:

Table `messages`:
msgid (primary key)
msgtext

Table `headers`:
id (primary key)
msgid (index, non-unique; links to `messsages`.`msgid`)
owner (index, non-unique; links to user ID)
from
to
date
subject
status ('draft'/'sent'/'received'/'saved'/'deleted')

Now, let's say user A sends a message to B, C and D. Now we have
something like this (partial rendering of the `headers` table):

id | msgid | owner | from | to | status
11 | 8 | A | A |B,C,D| sent
12 | 8 | B | A |B,C,D| received
13 | 8 | C | A |B,C,D| received
14 | 8 | D | A |B,C,D| received

Viewing A's outbox would look like this:

SELECT [fields] FROM headers
WHERE owner='A' AND status='sent';

Viewing B's inbox would look like this:

SELECT [fields] FROM headers
WHERE owner='B' AND status='received';

Deleting a message by C would look like this:

UPDATE headers SET status='deleted'
WHERE msgid=8 AND owner='C';

Pretty straightforward, I think...

Cheers,
NC

Mar 29 '06 #9

P: n/a
On Wed, 29 Mar 2006 12:07:54 -0800, NC wrote:
> I respectfully disagree. First, this would imply that a message sent
> by user A to user B will have to exist in two instances (in the `sent`
> table for user A and in the `received` table for user B). If a user
> broadcasts a message to 100 other users, you'll have to store 101
> copies of that message...


OK, if you only have one table, how do you cope with the situation above
when 50 people delete it from their inbox, but the others want to keep
the message.


Simple, really


The worst thing was as soon as you wrote "Simple, really" it twigged in my
mind. Ignore me, obviously having a dopey day earlier...

(On the upside, I was just going to slink away in embarrassment - at least
I've been man enough to post on here and admit I was being a twat!)

Cheers,
Andy

--
Andy Jeffries MBCS CITP ZCE | gPHPEdit Lead Developer
http://www.gphpedit.org | PHP editor for Gnome 2
http://www.andyjeffries.co.uk | Personal site and photos

Mar 29 '06 #10

P: n/a
Couple things:

- Since this is an internal system, delivery is gaurented and status
should would be better as read/unread, and perhaps 'deleted' if you
want to keep old messages around.
- To nit pick a bit: normalized (read: more or less 'proper') database
design doesn't allow for multiple values to be within a single field
(1st form violation? Can't recall exactly). This also eliminates the
need for the "owner" field, allows better indexing of "to", selection
with to, aggregate functions for things involving the "to" field.
- There doesn't need to be a record with status sent, that is implied
by someone receiving the message.
- If each row has a primary key id, then it would be better to perform
operations on the rows using the primary key, rather than a combination
of other keys that is unique (or just call that combination the primary
key, i personally feel its easier pass around a single id rather than a
combination of multiple fields)

So the table headers would become, in the instance A sends a message to
B, C, and D:

id | msgid | from | to | status
12 | 8 | A | B | unread
13 | 8 | A | C | read
14 | 8 | A | D | read

You're now storing 3 rows instead 4 and have eliminated a whole column.

Mar 30 '06 #11

This discussion thread is closed

Replies have been disabled for this discussion.