Is there some way of using ereg to detect when certain filename extensions
are supplied and to return false if so, WITHOUT using the ! operator
before ereg () ?
I have an API that allows as an input a regular expression, enabling the
administrator to ensure a file upload matches a certain pattern. For
instance, supplying the string
'.exe$|.com$|.bat$|.zip$|.doc$'
means that the file must end with any of these five extensions.
Is there a way that the regexp could be rewritten to say that the file
must NOT end with any of these, without changing the ereg to !ereg - I
can't do the latter because it's within the class.
Any ideas?
Martin
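A side note on the example pattern: its dots are unescaped, so each one matches any character, not just a literal full stop. The sketch below illustrates this. It assumes preg_match() with '/' delimiters as a stand-in for ereg() (which is no longer available from PHP 7 onwards), and the variable names are hypothetical.

```php
<?php
// The pattern from the question, wrapped in '/' delimiters for preg_match().
$original = '/.exe$|.com$|.bat$|.zip$|.doc$/';

// Hypothetical tightened version: literal dot, grouped alternation.
$escaped = '/\.(exe|com|bat|zip|doc)$/';

foreach (array('setup.exe', 'fooexe', 'notes.txt') as $name) {
    printf("%-10s original=%d escaped=%d\n",
           $name,
           preg_match($original, $name),
           preg_match($escaped, $name));
}
```

With the unescaped form, a name such as 'fooexe' passes the check even though it has no extension at all.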
On Mon, 26 Jan 2004 18:12:59 +0000, Martin Lucas-Smith <mv***@cam.ac.uk> wrote:
> Is there a way that the regexp could be rewritten to say that the file must NOT end with any of these, without changing the ereg to !ereg - I can't do the latter because it's within the class.
Not neatly; that'd require a negative lookahead assertion, which is only
supported in Perl-compatible regexes. Or just using ! ... ;-p
I suppose you could take the perverse approach of enumerating all other
three-letter extensions, except those. So, have a series of three character
classes containing all but the 1st, 2nd then 3rd character of each extension.
But you could only check one extension at a time; if you had an alternation,
it'd always match (if it doesn't match the complement of one extension's three
characters, then it must match one of the other patterns).
e.g. for matching extensions except .exe, letting 0,1,2 and 4+ letter
extensions through:
\.[^eE][^xX][^eE]$|\..{0,2}$|\..{4,}$
(yuk!)
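Two caveats with the pattern above, as far as I can tell: requiring all three characters to differ also turns away harmless extensions such as .txe, and because . matches a dot, the 4+ branch lets foo.bar.exe through (it sees ".bar.exe"). A possible tightening, still in plain POSIX ERE syntax, is sketched below. The pattern and the name $notExe are my own illustration, and preg_match() is only used as a stand-in for ereg(), which is gone from PHP 7 onwards.

```php
<?php
// Hypothetical pattern: matches any final extension except "exe"
// (either case). [^.] keeps every branch inside the last extension.
$notExe = '\.([^eE.][^.][^.]|[^.][^xX.][^.]|[^.][^.][^eE.])$' // 3 chars, differing from "exe" somewhere
        . '|\.[^.]{0,2}$'                                      // 0 to 2 chars
        . '|\.[^.]{4,}$';                                      // 4 or more chars

$names = array('foo.exe', 'foo.EXE', 'foo.bar.exe', 'foo.txe', 'foo.c', 'foo.jpeg');
foreach ($names as $name) {
    // The pattern contains no '/', so it can be wrapped in '/' delimiters as-is.
    printf("%-12s %s\n",
           $name,
           preg_match('/' . $notExe . '/', $name) ? 'match' : 'no match');
}
```

This still handles only one forbidden extension, for the reason given above, and it still rejects names with no dot at all.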
--
Andy Hassall <an**@andyh.co.uk> / Space: disk usage analysis tool
<http://www.andyh.co.uk> / <http://www.andyhsoftware.co.uk/space>
Andy Hassall wrote:
> Not neatly; that'd require a negative lookahead assertion, which is only supported in Perl-compatible regexes.
But wouldn't it require more than one assertion? You can't merely
apply one negative lookahead assertion to the characters following a
FULL STOP, because if the filename contains more than one FULL STOP,
and the characters after the last FULL STOP constitute a forbidden
extension, the pattern would match. For example, imagining "exe" is
the only forbidden extension, then
$string = 'foo.bar.exe';
if (preg_match('`\.(?!exe$)`i',$string))
would return true, since there is present a FULL STOP that isn't
immediately followed by the anchored character sequence "exe".
What you'd need to do if you want to check filename extensions would
be to apply two assertions: one positive lookahead assertion, making
sure the characters following the FULL STOP are at the end of the
string, ensuring that you're dealing with the filename extension and
not another part of the filename; and one negative lookahead
assertion, making sure those characters don't constitute a forbidden
extension. Now, for one- to four-letter extensions,
$string = 'foo.bar.exe';
if (preg_match('`\.(?=[a-z]{1,4}$)(?!exe)`i',$string))
, where the character class denotes possible characters in filename
extensions, will return false.
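To see the difference side by side, here is a small sketch that runs both patterns over a few names. The patterns are the two quoted above; the variable names and the sample filenames are only for illustration.

```php
<?php
$single = '`\.(?!exe$)`i';               // one negative lookahead
$double = '`\.(?=[a-z]{1,4}$)(?!exe)`i'; // positive plus negative lookahead

$names = array('foo.exe', 'foo.bar.exe', 'foo.exe.txt', 'foo.txt', 'foo.exec');
foreach ($names as $name) {
    printf("%-12s single=%d double=%d\n",
           $name,
           preg_match($single, $name),
           preg_match($double, $name));
}
```

Only the two-assertion version turns away 'foo.bar.exe'. Because (?!exe) is not anchored with $, it also turns away a name such as 'foo.exec', which may or may not be wanted.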
That's all hypothetical of course, because we're saved by the NOT
operator. Please castigate me for any errors.
--
Jock
On Mon, 26 Jan 2004 22:14:57 -0000, John Dunlop <jo*********@johndunlop.info> wrote:
> Now, for one- to four-letter extensions,
> $string = 'foo.bar.exe'; if (preg_match('`\.(?=[a-z]{1,4}$)(?!exe)`i',$string))
> , where the character class denotes possible characters in filename extensions, will return false.
Indeed :-) Perhaps even, removing the 1-4 char restriction:
/\.(?=[^.]+$)(?!bad$|worse$|evil$)/i
i.e. a '.' followed by a sequence of one or more non-dots up to the end of the
string, where that sequence is not any of 'bad', 'evil' or 'worse', each
followed by end of string.
So putting it all together:
<pre>
<?php
$goodExts = array('c', 'h', 'jpeg', 'png', 'torrent', 'xyz', 'z');
$badExts  = array('exe', 'com', 'bat', 'doc', 'vbscript', 'x', 'zyx');

// Build /\.(?=[^.]+$)(?!exe$|com$|...)/i from the forbidden list.
// A closure is used here because create_function() no longer exists in PHP 8.
$re = '/\.(?=[^.]+$)(?!' .
      join('|',
           array_map(function ($a) { return preg_quote($a, '/') . '$'; },
                     $badExts)) .
      ')/i';

print("regex = $re\n\n");

$allExts   = array_merge($goodExts, $badExts);
$fileNames = array('thingy', 'foo', 'weasel', 'earwig');

// Try 42 random names, each with one to three random extensions.
for ($i = 0; $i < 42; $i++) {
    $str = $fileNames[array_rand($fileNames)];
    $numExts = mt_rand(1, 3); // draw the count once, not on every loop test
    for ($j = 0; $j < $numExts; $j++)
        $str .= '.' . $allExts[array_rand($allExts)];

    $matched = preg_match($re, $str);

    printf("%-64s %s\n",
           $str,
           $matched ? 'match' : '<b>no match</b>');
}
?>
</pre>
It rejects files without an extension, though.
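One way around that, sketched here as an assumption and not something tried in this thread, is to anchor the negative lookahead at the start of the string and not at the dot, so that a name with no dot at all still matches. The extension list is the one from the original question; $re and $quoted are hypothetical names, and this needs preg_match(), not ereg().

```php
<?php
$badExts = array('exe', 'com', 'bat', 'zip', 'doc');

// Escape each extension so it is taken literally inside the pattern.
$quoted = array_map(function ($ext) { return preg_quote($ext, '/'); }, $badExts);

// Matches at the start of the string unless the name ends in a dot
// followed by one of the forbidden extensions.
$re = '/^(?!.*\.(?:' . join('|', $quoted) . ')$)/i';

foreach (array('readme', 'foo.exe', 'foo.bar.EXE', 'foo.exe.txt') as $name) {
    printf("%-12s %s\n", $name, preg_match($re, $name) ? 'match' : 'no match');
}
```

The pattern matches when the upload is acceptable, so it can be passed to code that expects a "must match" regex without negating the result.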
> That's all hypothetical of course, because we're saved by the NOT operator. Please castigate me for any errors.
A single ! character vs. the insanity above... hmm.
--
Andy Hassall <an**@andyh.co.uk> / Space: disk usage analysis tool
<http://www.andyh.co.uk> / <http://www.andyhsoftware.co.uk/space>