
Locking threads

Jim
Hi,

I'm developing a CMS and I'd like to be able to cache the site "tree" in
a multi-dimensional array in shared memory (far too many SQL calls
otherwise). When someone adds an item to the tree I need to be able to
read in the array from shared memory, add the new item, then write it
back to shared memory... all in one atomic action.

I've done plenty of research and, short of using something like
eAccelerator or MMCache, I'm stuck with PHP semaphores, which even then
appear to be only process-safe, not thread-safe (correct me if I'm
wrong) - and then I'm restricted to *nix systems.

Is there any way to do the above that will work on both *nix and
Windows, whether PHP is multi-threaded or single-threaded?
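For illustration, here's roughly the *nix-only approach I've found so
far (an untested sketch using PHP's sysvsem/sysvshm extensions; the
ftok() keys, the variable key 1, and the sample item are arbitrary):

<?php
// Serialise the whole read-modify-write under one semaphore so the
// update behaves atomically across processes.
$sem = sem_get(ftok(__FILE__, 'a'));                 // one lock for the tree
$shm = shm_attach(ftok(__FILE__, 'b'), 1024 * 1024); // 1 MB segment

sem_acquire($sem);                    // blocks other processes
$tree = @shm_get_var($shm, 1);        // false if nothing stored yet
if ($tree === false) {
    $tree = array();
}
$tree[] = array('id' => 42, 'parent' => 7, 'name' => 'New page');
shm_put_var($shm, 1, $tree);          // write the updated array back
sem_release($sem);

shm_detach($shm);
?>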

Thanks,

Jim.

Jul 15 '07 #1
Jim wrote:
Hi,

I'm developing a CMS and I'd like to be able to cache the site "tree" in
a multi-dimensional array in shared memory (far too many SQL calls
otherwise). When someone adds an item to the tree I need to be able to
read in the array from shared memory, add the new item, then write it
back to shared memory... all in one atomic action.

I've done plenty of research and, short of using something like
eAccelerator or MMCache, I'm stuck with PHP semaphores, which even then
appear to be only process-safe, not thread-safe (correct me if I'm
wrong) - and then I'm restricted to *nix systems.

Is there any way to do the above that will work on both *nix and
Windows, whether PHP is multi-threaded or single-threaded?

Thanks,

Jim.
Jim,

Use a database. There are dozens of CMSes around that use databases;
implemented properly, they can be quite efficient.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Jul 16 '07 #2
Jim
I'm developing a CMS and I'd like to be able to cache the site "tree" in
a multi-dimensional array in shared memory (far too many SQL calls
otherwise). When someone adds an item to the tree I need to be able to
read in the array from shared memory, add the new item, then write it
back to shared memory... all in one atomic action.
I've done plenty of research and, short of using something like
eAccelerator or MMCache, I'm stuck with PHP semaphores, which even then
appear to be only process-safe, not thread-safe (correct me if I'm
wrong) - and then I'm restricted to *nix systems.
Is there any way to do the above that will work on both *nix and
Windows, whether PHP is multi-threaded or single-threaded?
Jim,

Use a database. There are dozens of CMSes around that use databases;
implemented properly, they can be quite efficient.
Hi Jerry,

If I could think of a way of doing it efficiently then I'd stick with
the db only, but I can't see how. For example, I have a table which
represents the structure of the site; to put it simply, each record
has an id and a parent id. To build, say, a left-hand nav I may need to
issue 3 or 4 SQL statements to get all the data I need, which I'd like
to avoid if possible.
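Roughly what the nav build looks like now (a simplified sketch - the
table and column names here are made up, and it assumes an open mysql
connection):

<?php
// One query per tree level: a three-level nav means at least three
// round trips to the database.
function children($parentId) {
    $res = mysql_query('SELECT id, name FROM site_tree WHERE parent_id = '
        . (int) $parentId);
    $nodes = array();
    while ($row = mysql_fetch_assoc($res)) {
        $row['children'] = children($row['id']); // recurse: another query
        $nodes[] = $row;
    }
    return $nodes;
}
$nav = children(0); // 0 = root
?>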

Thanks,

Jim.

Jul 16 '07 #3
rf

"Jim" <ji***@yahoo.comwrote in message
news:11*********************@m3g2000hsh.googlegrou ps.com...
>Use a database. There are dozens around which use databases; if
implemented properly they can be quite efficient.

Hi Jerry,

If I could think of a way of doing it efficiently then I'd stick with
the db only, but I can't see how. For example, I have a table which
represents the structure of the site; to put it simply, each record
has an id and a parent id. To build, say, a left-hand nav I may need to
issue 3 or 4 SQL statements to get all the data I need, which I'd like
to avoid if possible.
Premature optimization?

Three or four SQL calls are sub-millisecond (once the operating system
has *cached* your database working set for you). Compare this to the
tens of milliseconds for a TCP/IP packet exchange - or, from over here
in .au, hundreds of milliseconds.

Have you done any benchmarking to prove the SQL calls are really a problem?
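Something like this would tell you (a rough sketch - substitute the
queries and table your nav actually uses, and assume an open mysql
connection):

<?php
// Time 100 iterations of the query and report the average, so you
// are comparing numbers instead of gut feelings.
$start = microtime(true);
for ($i = 0; $i < 100; $i++) {
    $res = mysql_query('SELECT id, parent_id, name FROM site_tree');
    while (mysql_fetch_assoc($res)) {
        // consume the rows, as the real page would
    }
}
$elapsed = microtime(true) - $start;
printf("%.3f ms per iteration\n", $elapsed * 1000 / 100);
?>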

--
Richard.
Jul 16 '07 #4
Jim
If I could think of a way of doing it efficiently then I'd stick with
the db only, but I can't see how. For example, I have a table which
represents the structure of the site; to put it simply, each record
has an id and a parent id. To build, say, a left-hand nav I may need to
issue 3 or 4 SQL statements to get all the data I need, which I'd like
to avoid if possible.

Premature optimization?
Perhaps, but I'm still interested in the shared memory caching options
as there's more than one instance where I believe caching of data
would come in handy. At the moment I'm not seeing issues with
performance; however, on accessing a page I may have 5 or 6 db
statements accessing data which rarely changes (i.e. site structure,
data dictionary), and it feels like I'm making the db work
unnecessarily when the data could be cached. I need the CMS to be
able to scale well, and I feel that ignoring caching would be a
mistake.

Jim.

Jul 16 '07 #5
rf

"Jim" <ji***@yahoo.comwrote in message
news:11**********************@k79g2000hse.googlegr oups.com...
If I could think of a way of doing it efficiently then I'd stick with
the db only, but I can't see how. For example, I have a table which
represents the structure of the site; to put it simply, each record
has an id and a parent id. To build, say, a left-hand nav I may need to
issue 3 or 4 SQL statements to get all the data I need, which I'd like
to avoid if possible.

Premature optimization?

Perhaps, but I'm still interested in the shared memory caching options
You already have one: your operating system. Modern OSes (and many older
ones) are very, very good at keeping a process's working set in memory.
The authors have spent many years fine-tuning the caching algorithms. A
cache miss is very expensive (read: many milliseconds); a cache hit is
negligible (microseconds).

<checks OS> Yep. Of the two gigabytes of physical memory present on this
computer, the OS is currently allocating almost one gig to System Cache.
In there would be the working set of each program I have open (not
windows - programs), the working set of each process I have open (yes,
Windows), and most likely the contents of the most recent files each
process has opened. In the case of my database server I would expect
that most of the indexes and much of the data I have recently hit would
be in the OS cache. All this is over and above the virtual memory living
behind my real memory, which is much faster than any other file access.

Your server would be running far fewer processes than I am at the moment.
as there's more than one instance where I believe caching of data
would come in handy. At the moment I'm not seeing issues with
performance; however, on accessing a page I may have 5 or 6 db
statements accessing data which rarely changes (i.e. site structure,
data dictionary)
So, those things will most likely be in the OS cache, if they come from a
database (after the first hit on your page, that is). If *you* re-cache them
in memory then you are defeating the OS or the database cache, since your
memory cache will use up memory that they may have been able to use. And,
I'll bet, they are better at cache algorithms than you, or I, are :-)
and it feels like I'm making the db work unnecessarily when the data
could be cached.
Cross purposes. The db is working from cached data. Why layer another
caching system over the top of that?
I need the CMS to be
able to scale well
Think about all the other CMSes around. They all use a database. Then
think about the obviously database-driven web sites out there: CNN,
eBay and, most to the point, Google. Do you really think they have put
lots of effort into building a turnkey memory cache? Nope. They rely on
the technologies that lie underneath the SQL call. They rely on the
database manager to perform properly (i.e. cache where it can, and
should), which relies on the operating system to perform properly (i.e.
cache where it can), which relies on the hardware to perform properly
(i.e. cache where it can - and yes, modern disk drives do cache; they
even do read-ahead, anticipating that if you have just read this bit
you will probably read what follows pretty soon).
and I feel that ignoring caching would be a
mistake.
Developing a cache to lie above all the other technology that is already
there would IMHO be the mistake. Better to simply add one more index in
the right place in your database, so your database manager can use the
(cached) index to better access your data. Or compress that 200K image
you have down to the 20K it should be.

Finally, what about all the other things that happen during a page
access? How many PHP files make up the page? (You do use include, don't
you?) How many (correctly compressed) images are there? The CSS files?
JavaScript? And where do these files live after the first hit on your
site? In the OS cache :-)

Premature optimization.

Phew, it's now time for a quiet beer ;-)
--
Richard.

Jul 16 '07 #6
Jim wrote:
>>I'm developing a CMS and I'd like to be able to cache the site "tree" in
a multi-dimensional array in shared memory (far too many SQL calls
otherwise). When someone adds an item to the tree I need to be able to
read in the array from shared memory, add the new item, then write it
back to shared memory... all in one atomic action.
I've done plenty of research and, short of using something like
eAccelerator or MMCache, I'm stuck with PHP semaphores, which even then
appear to be only process-safe, not thread-safe (correct me if I'm
wrong) - and then I'm restricted to *nix systems.
Is there any way to do the above that will work on both *nix and
Windows, whether PHP is multi-threaded or single-threaded?
>Jim,

Use a database. There are dozens of CMSes around that use databases;
implemented properly, they can be quite efficient.

Hi Jerry,

If I could think of a way of doing it efficiently then I'd stick with
the db only, but I can't see how. For example, I have a table which
represents the structure of the site; to put it simply, each record
has an id and a parent id. To build, say, a left-hand nav I may need to
issue 3 or 4 SQL statements to get all the data I need, which I'd like
to avoid if possible.

Thanks,

Jim.
It is efficient - and probably a lot more so than what you're trying to
do. Not only do you need to cache the data in memory, but you also need
ways to identify the cached data, determine whether it is in memory,
load it into memory if it isn't, and a whole bunch of other things. All
of this can easily take more time than a simple database access (or
three or four).

There's a good reason why every CMS today uses databases - it works, and
it works well.

And don't prematurely optimize your code.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Jul 16 '07 #7
On Jul 16, 11:30 am, "rf" <r...@invalid.com> wrote:
"Jim" <ji...@yahoo.comwrote in message

news:11**********************@k79g2000hse.googlegr oups.com...
If I could think of a way of doing it efficiently then I'd stick with
db only, but I can't see how. For example, I have a table which
represents the structure of the site, so to put it simply each record
has an id and a parent id. To build say a left hand nav I may need to
call 3 or 4 sql statements to get all the data I need which I'd like
to avoid doing if possible.
Premature optimization?
Perhaps, but I'm still interested in the shared memory caching options

You already have one. Your operating system. Modern OSs (and many older
ones) are very very good at keeping a processes working set in memory. The
authors have spent many years fine tuning the caching algorithms. A cache
miss is very expensive (read many milliseconds). A cache hit is to be
ignored (microseconds).

<checks OSYep. Of the two gigabytes of physical memory present on this
computer the OS is currently allocating almost one gig to System Cache. In
there would be the working set of each program I have open (not windows,
programs), the working set of each process (yes, windows) I have open and
most likely the contents of the most recent files each process has opened.
In the case of my database server I would expect that most of the indexes
and much of the data that I have recently hit would be in the OS cache. All
this is over and above the virtual memory living behind my real memory,
which is much faster than any file other access.

Your server would be running far less processes than I am at the moment.
as there's more than once instance where I believe caching of data
would come in handy. At the moment I'm not seeing issues with with
performance, however on accessing a page I may have 5 or 6 db
statements accessing data which rarely changes (i.e. site structure,
data dictionary)

So, those things will most likely be in the OS cache, if they come from a
database (after the first hit on your page, that is). If *you* re-cache them
in memory then you are defeating the OS or the database cache, since your
memory cache will use up memory that they may have been able to use. And,
I'll bet, they are better at cache algorithms than you, or I, are :-)
and it feels like I'm making the db work un-
necessarily when the data could be cached.

Cross purposes. The db is working from cached data. Why layer another
caching system over the top of that?
I need for the cms to be
able to scale well

Think about all the other CMS's around. They all use a database. Then think
about the obviously database driven web sites out there. CNN, ebay and, most
to the point, Google. Do you really think they have put lots of effort into
building a turnkey memory cache? Nope. They rely on the technologies that
lie underneath the sql call. They rely on the database manager to perform
properly (ie, cache where it can, and should), which relies on the operating
system to perform properly (ie, cache where it can), which relies on the
hardware to perform properly (ie, to cache where it can and yes, modern disk
drives do cache, they even do read ahead, anticipating that if you have just
read this bit you will probably read what follows pretty soon).
and I feel that ignoring caching would be a
mistake.

Developing a cache to lie above all the other technology that is already
there would IMHO be the mistake. Better to simply add one more index in the
right place to your database, so your database manager can use the (cached)
index to better access your data. Or compress that 200K image you have down
to the 20K it should be.

Finally, what about all the other things that happen during a page access.
How many PHP files make up the page? (you do use include don't you?) How
many (correctly compressed) images are there? The CSS files? Javascript? And
where do these files live, after the first hit on your site? In the OS cache
:-)

Premature optmization.

Phew, it's now time for a quiet beer ;-)
--
Richard.
I am developing a system which relies heavily on multiple include()
calls and has lots of SQL calls/data, so I saved my SQL results to a
session array (so it only fetches the results once and updates them
only when needed). I thought all the SQL would slow the page down a
lot (especially as I'm on a shared hosting server), but to my surprise
PHP is really, really quick - I'm not even up to half a second yet :)
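In case it helps, the pattern is roughly this (simplified - the cache
key and table name are just for the example, and it assumes an open
mysql connection):

<?php
// Query once per session, then reuse the cached rows until something
// explicitly invalidates them.
session_start();
if (!isset($_SESSION['nav_cache'])) {
    $res = mysql_query('SELECT id, parent_id, name FROM site_tree');
    $rows = array();
    while ($row = mysql_fetch_assoc($res)) {
        $rows[] = $row;
    }
    $_SESSION['nav_cache'] = $rows;
}
$nav = $_SESSION['nav_cache']; // no SQL on repeat hits
// unset($_SESSION['nav_cache']) whenever the tree changes
?>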

Jul 16 '07 #8
Jim
as there's more than one instance where I believe caching of data
would come in handy. At the moment I'm not seeing issues with
performance; however, on accessing a page I may have 5 or 6 db
statements accessing data which rarely changes (i.e. site structure,
data dictionary)

So, those things will most likely be in the OS cache, if they come from a
database (after the first hit on your page, that is). If *you* re-cache them
in memory then you are defeating the OS or the database cache, since your
memory cache will use up memory that they may have been able to use. And,
I'll bet, they are better at cache algorithms than you, or I, are :-)
I understand what you're saying, but when you consider that each time I
display a nav I need to execute a recursive function which may result
in many calls to the database, I struggle to believe that it'll be
anywhere near as quick as retrieving an array from shared memory, even
if the entire database is cached... I'll have to perform some tests.
and it feels like I'm making the db work unnecessarily when the data
could be cached.

Cross purposes. The db is working from cached data. Why layer another
caching system over the top of that?
Because there's a fair amount of PHP code executed to build the
multi-dimensional arrays that represent the site structure and data
dictionary. Caching would save executing that code every time.

I think I need to run some tests and see what happens. I'll give
eAccelerator a go for now.
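If eAccelerator's shared-memory API behaves as documented, I'm hoping
for something along these lines (untested sketch; build_site_tree() is
a hypothetical stand-in for the expensive PHP + SQL work):

<?php
// Keep the built tree in shared memory; rebuild only on a cache miss.
// The lock serialises concurrent rebuilds/updates.
eaccelerator_lock('site_tree');
$tree = eaccelerator_get('site_tree');   // null on a miss
if ($tree === null) {
    $tree = build_site_tree();               // hypothetical builder
    eaccelerator_put('site_tree', $tree, 0); // ttl 0 = no expiry
}
eaccelerator_unlock('site_tree');
?>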

Cheers,

Jim.

Jul 16 '07 #9
Jim
I am developing a system which relies heavily on multiple include()
calls and has lots of SQL calls/data, so I saved my SQL results to a
session array (so it only fetches the results once and updates them
only when needed).
I'm effectively doing that at the moment, but I'd like to take it one
step further with shared memory. Did you notice that your system was
slower before caching the MySQL results?

Thanks,

Jim.

Jul 16 '07 #10
Rik
On Mon, 16 Jul 2007 13:21:39 +0200, Jim <ji***@yahoo.com> wrote:
I understand what you're saying, but when you consider that each time I
display a nav I need to execute a recursive function which may result
in many calls to the database, I struggle to believe that it'll be
anywhere near as quick as retrieving an array from shared memory, even
if the entire database is cached... I'll have to perform some tests.
It should not have to be like that. If you have an adjacency model, what
about this (bogus code, unchecked):

$navs = mysql_query('SELECT id, name, parent FROM `table`');
$pages = array();
while ($page = mysql_fetch_assoc($navs)) {
    $pages[$page['id']] = $page;
}
foreach ($pages as $id => $page) {
    // Are 'root' nodes marked with parent = 0 or NULL? Both are handled:
    $parent = ($page['parent'] > 0) ? $page['parent'] : 0;
    // May be unnecessary, but I like to be explicit:
    if (!isset($pages[$parent]['childs'])) $pages[$parent]['childs'] = array();
    // Reference the page in its parent:
    $pages[$parent]['childs'][] =& $pages[$id];
}
print_r($pages[0]['childs']);

One query, some fiddling with references, and you're done. Be very wary
of cycles in your tree, though.

Of course, you could always try a nested set; it might be more
appropriate for navigation:
http://dev.mysql.com/tech-resources/...ical-data.html
and it feels like I'm making the db work unnecessarily when the data
could be cached.

Cross purposes. The db is working from cached data. Why layer another
caching system over the top of that?

Because there's a fair amount of PHP code executed to build the
multi-dimensional arrays that represent the site structure and data
dictionary. Caching would save executing that code every time.
Possibly go for the simple solution of saving the tree once, on changes
in the backend, with var_export(), and just loading that from somewhere
(file, db, etc.)?
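Something like this (again bogus code, unchecked; the path is
arbitrary):

<?php
// Write the tree out once, when it changes in the backend...
file_put_contents('/tmp/site_tree.php',
    '<?php return ' . var_export($tree, true) . ';');

// ...and on every page view just include it. After the first hit the
// file sits in the OS cache, so this is about as cheap as it gets.
$tree = include '/tmp/site_tree.php';
?>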

However, as others have said, only employ this kind of thing when you
think your server isn't coping right now or takes too long to complete
a page. If it's all right without caching, don't bother with it.
--
Rik Wasmus
Jul 16 '07 #11
