
is file_exists expensive in performance terms?


I've written some template code and one thing I'm trying to protect
against is references to images that don't exist. Because users have
the ability to muck around with the templates after everything's been
set up, there is a chance they'll delete an image or ruin some tag
after the web designer has set up everything perfectly. I want the
software to catch mistakes like that and, at the very least, not show
broken links. On the control panel sometimes as many as 100 thumbnails
are run for each page, I'm wondering if running file_exists on all of
those slows things down at all?

Jul 17 '05 #1
On 2 Feb 2005 15:48:41 -0800, lk******@geocities.com wrote:
I've written some template code and one thing I'm trying to protect
against is references to images that don't exist. Because users have
the ability to muck around with the templates after everything's been
set up, there is a chance they'll delete an image or ruin some tag
after the web designer has set up everything perfectly. I want the
software to catch mistakes like that and, at the very least, not show
broken links. On the control panel sometimes as many as 100 thumbnails
are run for each page, I'm wondering if running file_exists on all of
those slows things down at all?


Clearly, yes - running file_exists() is slower than not running file_exists().
But that's not a useful thing to say. You've given good reasons why you should
be checking the existence of the files, so removing the call doesn't seem to be
an option. file_exists() is the PHP method for checking whether a file exists,
after all.

Along with the question about comments, you seem to be trying to optimise at
random; as another poster pointed out, you need to profile your code to measure
whether your file_exists() calls, or any other particular part of the script,
take up a significant proportion of your code's runtime before you start
thinking about optimising or removing them - there's not much point getting a
10% reduction in elapsed time on a part that takes 0.01% of the total runtime.
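
For example, a rough sketch of measuring just the file_exists() part inside the
real page, rather than guessing ($thumbnails below is a placeholder for however
your template collects the ~100 image paths, and microtime(true) needs PHP 5 -
on PHP 4 build the float from explode(' ', microtime())):

$total = 0.0;
foreach ($thumbnails as $path) {
    $t0 = microtime(true);
    $ok = file_exists($path);                  // the call being measured
    $total += microtime(true) - $t0;
    // ... render the <img> tag or a placeholder depending on $ok ...
}
echo "file_exists() took " . sprintf('%.4f', $total) . " seconds for " . count($thumbnails) . " files";

If that total is a few milliseconds, there is nothing worth optimising.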

The cost of file_exists() depends on several factors: operating system, disk
speed, whether the directory inode is in cache, filesystem type, number of
files in the directory, and so on.

--
Andy Hassall / <an**@andyh.co.uk> / <http://www.andyh.co.uk>
<http://www.andyhsoftware.co.uk/space> Space: disk usage analysis tool
Jul 17 '05 #2
"lkrubner" wrote:
I've written some template code and one thing I'm trying to protect
against is references to images that don't exist. Because users have
the ability to muck around with the templates after everything's been
set up, there is a chance they'll delete an image or ruin some tag
after the web designer has set up everything perfectly. I want the
software to catch mistakes like that and, at the very least, not show
broken links. On the control panel sometimes as many as 100 thumbnails
are run for each page, I'm wondering if running file_exists on all of
those slows things down at all?


Put a loop of 10000 iterations around file_exists, and see the timing.
Can’t get more accurate than that.
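
Something along these lines, for example (the path is only a placeholder,
microtime(true) needs PHP 5, and note the caveat in the next reply about the
stat cache when you hammer the same file):

$start = microtime(true);
for ($i = 0; $i < 10000; $i++) {
    file_exists('/path/to/some/image.jpg');   // placeholder path
}
echo sprintf("%.4f", microtime(true) - $start) . " seconds for 10000 calls\n";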

If you're too worried about it, perhaps you could keep a reference to those
files in a MySQL table instead?
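
A hypothetical sketch of that idea: a table of known-good thumbnail paths
loaded once per request and checked in place of the disk (the table and column
names here are made up for illustration):

$known = array();
$res = mysql_query("SELECT path FROM thumbnails WHERE page_id = " . (int)$page_id);
while ($row = mysql_fetch_assoc($res)) {
    $known[$row['path']] = true;
}

// When rendering each thumbnail, check the array instead of calling file_exists():
if (isset($known[$img_path])) {
    echo '<img src="' . htmlspecialchars($img_path) . '" />';
} else {
    echo '<img src="/images/missing.png" />';  // placeholder instead of a broken link
}

The trade-off is that the table has to be kept in sync whenever images are
added or removed, which may be as error-prone as the broken links you are
trying to avoid.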

--
Posted using the http://www.dbforumz.com interface, at author's request
Articles individually checked for conformance to usenet standards
Topic URL: http://www.dbforumz.com/PHP-file_exi...ict194240.html
Visit Topic URL to contact author (reg. req'd). Report abuse: http://www.dbforumz.com/eform.php?p=657451
Jul 17 '05 #3
No. That won't be very accurate because, as the documentation says, the
results of this function are cached, so you have to make sure each call is
made on a different file. Moreover, the cost of the function may differ
depending on whether the file exists or not, and on whether the directory
contains 2 or 2,000 files.
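
For what it's worth, clearstatcache() resets that cache, so a benchmark that
hits the same file repeatedly can be made to touch the filesystem each time
(the filename below is just a placeholder):

file_exists('/tmp/example.jpg');   // first call stats the file
file_exists('/tmp/example.jpg');   // likely answered from PHP's stat cache
clearstatcache();                  // throw the cached result away
file_exists('/tmp/example.jpg');   // stats the file again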

Jul 17 '05 #4
lk******@geocities.com wrote in news:1107388121.254255.123800@z14g2000cwz.googlegroups.com:

I've written some template code and one thing I'm trying to protect
against is references to images that don't exist. Because users have
the ability to muck around with the templates after everything's been
set up, there is a chance they'll delete an image or ruin some tag
after the web designer has set up everything perfectly. I want the
software to catch mistakes like that and, at the very least, not show
broken links. On the control panel sometimes as many as 100 thumbnails
are run for each page, I'm wondering if running file_exists on all of
those slows things down at all?


After some local testing, I believe the answer is no. I built the following
script:

<?php

function getmicrotime(){
    list($usec, $sec) = explode(' ', microtime());
    return ((float)$usec + (float)$sec);
}

#Find out how long it takes to do nothing 1000 times
$start = getmicrotime();
for($i=0; $i<1000; $i++){
    ; // Control - do nothing
}
$finish = getmicrotime();
$latency = sprintf("%.2f", ($finish - $start));
echo "It took $latency seconds to do nothing 1000 times.\n";

#Find out how long it takes to sleep 10 seconds
$start = getmicrotime();
sleep(10);
$finish = getmicrotime();
$latency = sprintf("%.2f", ($finish - $start));
echo "It took $latency seconds to sleep 10 seconds.\n";

#Find out how long it takes to generate 1000 random filenames
$start = getmicrotime();
for($i=0; $i<1000; $i++){
    $foo = uniqid('');
}
$finish = getmicrotime();
$latency = sprintf("%.2f", ($finish - $start));
echo "It took $latency seconds to call uniqid() 1000 times.\n";

#Find out how long it takes to look up 1000 random filenames
$start = getmicrotime();
for($i=0; $i<1000; $i++){
    $foo = file_exists('/home/foo/' . uniqid(''));
}
$finish = getmicrotime();
$latency = sprintf("%.2f", ($finish - $start));
echo "It took $latency seconds to look up 1000 files.\n";

?>

The purpose of this script is to test how long it takes to do various
things 1,000 times. First, it tests doing nothing. Then, it tests sleeping
for 10 seconds as another control. Next, it generates 1,000 random
filenames using uniqid(). Finally, it runs file_exists() against 1,000
filenames randomly generated with uniqid().

Here is the output from a few runs on my local dev machine:

[root@winfosec phptest]# php test.php
It took 0.00 seconds to do nothing 1000 times.
It took 10.00 seconds to sleep 10 seconds.
It took 20.00 seconds to call uniqid() 1000 times.
It took 20.05 seconds to look up 1000 files.

[root@winfosec phptest]# php test.php
It took 0.00 seconds to do nothing 1000 times.
It took 10.01 seconds to sleep 10 seconds.
It took 20.04 seconds to call uniqid() 1000 times.
It took 20.06 seconds to look up 1000 files.

[root@winfosec phptest]# php test.php
It took 0.00 seconds to do nothing 1000 times.
It took 10.01 seconds to sleep 10 seconds.
It took 20.02 seconds to call uniqid() 1000 times.
It took 20.04 seconds to look up 1000 files.

What I'm really looking at here are the last two values. Test #3 measures the
generation of random strings via uniqid(''). Test #4 measures the time it
takes to run file_exists() on random filenames generated via uniqid(''). Test
#4 takes approximately the same time as test #3, which tells me the
file_exists() calls add very little time on top of the uniqid('') calls.
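
A possible refinement of the script above (a sketch only, not re-measured
here): generate the random names first, then time only the file_exists()
calls, so uniqid()'s cost doesn't dominate the fourth number:

$names = array();
for ($i = 0; $i < 1000; $i++) {
    $names[] = '/home/foo/' . uniqid('');
}

$start = getmicrotime();
for ($i = 0; $i < 1000; $i++) {
    file_exists($names[$i]);
}
$finish = getmicrotime();
echo "It took " . sprintf("%.2f", $finish - $start) . " seconds to look up 1000 pre-generated names.\n";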

file_exists() appears to be a pretty low-resource function, at least from
my results. This test was performed using PHP 4.3.10 on a FreeBSD 4.10
system, P3 600 MHz, 40 MB of RAM. Better systems will no doubt give better
results. If your server is a bit more modern than my test machine, I would
suggest that you shouldn't have any problem calling file_exists() hundreds
or even thousands of times per execution.

hth
--

Bulworth : PHP/MySQL/Unix | Email : str_rot13('f@fung.arg');
--------------------------|---------------------------------
<http://www.phplabs.com/> | PHP scripts, webmaster resources
Jul 17 '05 #5

<lk******@geocities.com> wrote in message
news:11**********************@z14g2000cwz.googlegroups.com...

I've written some template code and one thing I'm trying to protect
against is references to images that don't exist. Because users have
the ability to muck around with the templates after everything's been
set up, there is a chance they'll delete an image or ruin some tag
after the web designer has set up everything perfectly. I want the
software to catch mistakes like that and, at the very least, not show
broken links. On the control panel sometimes as many as 100 thumbnails
are run for each page, I'm wondering if running file_exists on all of
those slows things down at all?


The answer is yes and no. Yes, file_exists() is fairly expensive. Although
PHP does cache directory info, I don't believe the cache lives beyond the
lifetime of the request. At best it's a per-thread cache, meaning requests
handled by other server threads would still end up hitting the OS.

The answer is also no, however, because whatever overhead there is will be
small compared to the cost of loading 100 thumbnails.
Jul 17 '05 #6

This thread has been closed and replies have been disabled. Please start a new discussion.
