Image and text: howto see the difference

Hi,
Depending on whether I get an image or text from a certain URL, I want to do
something different. I don't know in advance whether I'll get an image
or text.

This is a URL that returns an image:
http://indicator.amessage.info/indic...mp;param4=.png

This is one that returns text:
http://indicator.amessage.info/indic...mp;param4=.png
How could I see the difference between the 2 with PHP code?
Hoping that somebody can get me out of this,
greetings,
Mattias

Jul 17 '05 #1
Mattias Campe wrote:
> [...]
> How could I see the difference between the 2 with PHP code?

Use fsockopen and a regex to search for Content-Type in the header.

Regards,
Shawn
--
Shawn Wilson
sh***@glassgiant.com
http://www.glassgiant.com
Jul 17 '05 #2
The links look the same; the only difference is that there are two "o"s in
"coobnet" in the second one. If that is what decides it, you can do
if ($_GET["param1"] == "coobnet%40jabber.org") {...} else {}

If that was just a typo, then use the Content-Type in the header.

Savut

"Mattias Campe" <Ma******************************@UGent.be> wrote in message
news:bq**********@gaudi2.UGent.be...
> [...]

Jul 17 '05 #3
Shawn Wilson wrote:
> Use fsockopen and a regex to search for Content-Type in the header.

thx a lot! After figuring out how fsockopen worked, I managed to make it
work like this:

$fp = fsockopen("indicator.amessage.info", 80, $errno, $errstr);
if (!$fp) {
    echo "$errstr ($errno)<br>\n";
} else {
    // Send a HEAD request so that only the headers come back.
    fputs($fp, "HEAD /indicator.php?param1=cobnet%40jabber.org"
        . "&amp;param2=bounce"
        . "&amp;param3=http%3A%2F%2Fstudent.ugent.be%2Fastrid%2Fpics%2Fjabber%2F"
        . "&amp;param4=.png HTTP/1.0\r\n"
        . "Host: indicator.amessage.info\r\n\r\n");
    $string = "";
    while (!feof($fp)) {
        $string = $string . fgets($fp, 128);
    }
    echo $string;
    // strpos() returns false when nothing is found, so compare with !== false.
    if (strpos($string, "Content-Type: image/png") !== false)
        echo "We have an image";
    else
        echo "We don't have an image";
    fclose($fp);
}

Do you think it looks good? I don't want to do "an attack" on that
server by a stupid mistake :-)

Greetings,
Mattias Campe

Jul 17 '05 #4
Savut wrote:
> The links look the same; the only difference is that there are two "o"s in
> "coobnet" in the second one. If that is what decides it, you can do
> if ($_GET["param1"] == "coobnet%40jabber.org") {...} else {}

Well, it could be that coobnet%40jabber.org is also correct :-). Maybe I
wasn't too clear, but I can't tell by looking at param1 whether
there will be an image or not.

Still, thx for the remark!

[...]

Jul 17 '05 #5
Mattias Campe wrote:
> [...]
> Do you think it looks good? I don't want to do "an attack" on that
> server by a stupid mistake :-)

Just a few observations:

There is nothing in that code that would constitute an attack in itself. If,
however, you put that code in a poorly thought-out loop, you could inadvertently
"attack" a site. But I don't know what you're doing with the code, so this may
be moot.

With the code shown, any page with the text "Content-Type: image/png" in it will
claim it's an image. This could be a problem if you're building a bot to crawl
the web. Websites with code examples like php.net, Google groups, devshed, etc.
would occasionally have that string in the text. Again, I don't know the
intended use, but if it's similar to that just described, you might want to
adjust your code to something like the following (The regular expression may or
may not work, I haven't tested it). It should display only the first instance
of the Content-Type text. It also has the advantage that it doesn't go through
the entire file. Just enough to determine the type.

while (!feof($fp)) {
    $string .= fgets ($fp,128);
    if (preg_match("/^Content-Type: ([^\r\n]+)[\r|\n]/", $string, $matches)) {
        echo "File is type: " . $matches[1];
        fclose ($fp);
        exit();
    }
}
fclose ($fp);

I don't know if the headers are case- and whitespace- sensitive or not. In
other words, if it's possible to have

"content-type: image/png"

you'll have to adjust the regex.
And it's a little thing, but you can use "$variable .= $somestring" instead of
"$variable = $variable . $somestring"

Regards,
Shawn

--
Shawn Wilson
sh***@glassgiant.com
http://www.glassgiant.com
Jul 17 '05 #6
Shawn Wilson wrote:
> [...]
> There is nothing in that code that would constitute an attack in itself. If,
> however, you put that code in a poorly thought-out loop, you could inadvertently
> "attack" a site. But I don't know what you're doing with the code, so this may
> be moot.

It will come in a loop, but I'll take care that it's not an infinite one
;) ...

> With the code shown, any page with the text "Content-Type: image/png" in it will
> claim it's an image.

That should be "any *header* of a page" ;p, because I'm not asking for
a GET but for HEAD...

Still I like the principle of your code more :D, only one problem: the
regexp doesn't work (it never matches) and I can't figure out why :-s.

Another question: is the exit() required?

> [...]
> I don't know if the headers are case- and whitespace- sensitive or not. In
> other words, if it's possible to have
>
> "content-type: image/png"
>
> you'll have to adjust the regex.

Normally it's like "Content-Type: text/html"

> And it's a little thing, but you can use "$variable .= $somestring" instead of
> "$variable = $variable . $somestring"

thx for the hint!


Greetings,
Mattias

Jul 17 '05 #7
> [...]
> > With the code shown, any page with the text "Content-Type: image/png" in it will
> > claim it's an image.
>
> That should be "any *header* of a page" ;p, because I'm not asking for
> a GET but for HEAD...

Whoops, didn't notice that. My mistake.

> Still I like the principle of your code more :D, only one problem: the
> regexp doesn't work (it never matches) and I can't figure out why :-s.

Yeah, I wasn't really confident in it. Looking at it again, I put the "^" in
there, which means it would only match if the first header was Content-Type.
Take that out and it may work (too busy to test).

> Another question: is the exit() required?

You can take that or leave it. I had it in there so the entire page wouldn't be
opened, while I was thinking you were using GET. You might want to break the
while loop, though, to save a bit of time.
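
A minimal sketch of why the leading "^" gets in the way, and one way to keep it. Without the "m" modifier, "^" only matches at the very start of the string, which for an HTTP response is the status line; with "m" it matches at the start of every line. The function name findContentType and the sample response are made up for illustration, not taken from the real server.

```php
<?php
// Hypothetical helper: returns the raw Content-Type value, or null if absent.
function findContentType($response)
{
    // "m" lets ^ match at the start of each line; "i" ignores header-name case.
    if (preg_match('/^Content-Type:[ \t]*([^\r\n]+)/mi', $response, $matches)) {
        return $matches[1];
    }
    return null;
}

// Made-up sample response (assumption, not captured from the real server).
$response = "HTTP/1.0 200 OK\r\nServer: example\r\nContent-Type: image/png\r\n\r\n";

// The pattern from the thread: ^ is pinned to the start of the string,
// where the status line sits, so preg_match() reports no match.
var_dump(preg_match("/^Content-Type: ([^\r\n]+)[\r|\n]/", $response));

echo findContentType($response), "\n";
```

Dropping the "^" altogether, as suggested above, also works for a HEAD response; keeping it with "m" just avoids matching the text in the middle of some other header line.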


Regards,
Shawn
--
Shawn Wilson
sh***@glassgiant.com
http://www.glassgiant.com
Jul 17 '05 #8
Shawn Wilson wrote:
> [...]
> Whoops, didn't notice that. My mistake.

np ;-)

> Yeah, I wasn't really confident in it. Looking at it again, I put the "^" in
> there, which means it would only match if the first header was Content-Type.
> Take that out and it may work (too busy to test).

Indeed, it works, great!!! But it seems that the Content-Type can
*sometimes* be "text/plain;charset=UTF-8", but then I only need
"text/plain", I tried "/Content-Type: ([^\r\n]+)[\r|\n];/", but
apparently I don't understand much of those regexps :-s.

If you don't have the time to look for the ";", no problem, I'm already
very(, very) glad that you could help this far!

> You can take that or leave it. [...] You might want to break the
> while loop, though, to save a bit of time.

Okay, ic, you're right. But I changed it to "break;" because there is still
some code after the while. It still saves a bit of time :).

Greetings,
Mattias

Jul 17 '05 #9
Just grab the file and see what you have. Image files will inevitably have
chr(0) while text files (the exception being UTF-16 encoded ones) will never
have it. Hence the following:

$url1 = "http://indicator.amessage.info/indicator.php?param1=cobnet%40jabber.org"
      . "&amp;param2=bounce"
      . "&amp;param3=http%3A%2F%2Fstudent.ugent.be%2Fastrid%2Fpics%2Fjabber%2F"
      . "&amp;param4=.png";
$url2 = "http://indicator.amessage.info/indicator.php?param1=coobnet%40jabber.org"
      . "&amp;param2=bounce"
      . "&amp;param3=http%3A%2F%2Fstudent.ugent.be%2Fastrid%2Fpics%2Fjabber%2F"
      . "&amp;param4=.png";

function TasteTest($url) {
    // Fetch the whole resource (file_get_contents needs PHP >= 4.3.0).
    $data = file_get_contents($url);
    // A null byte anywhere in the data suggests binary (image) content.
    return strpos($data, "\x00") !== false ? "Image" : "Text";
}

echo TasteTest($url1); echo "<br>";
echo TasteTest($url2); echo "<br>";

"Mattias Campe" <Ma******************************@UGent.be>
wrote in message news:bq**********@gaudi2.UGent.be...
> [...]

Jul 17 '05 #10
Chung Leong wrote:
> Just grab the file and see what you have. Image files will inevitably have
> chr(0) while text files (the exception being UTF-16 encoded ones) will never
> have it. Hence the following: [...] $data = file_get_contents($url);

Damn: although this is the shortest solution, it appears that I need
PHP 4 >= 4.3.0 for file_get_contents and I have 4.1.2 :-s (which I can't
change, because I don't have the rights). Thx anyway!
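
For what it's worth, the same null-byte check can probably be done without file_get_contents by reading through fopen()/fread() in a loop. This is only a sketch: readAll and tasteTestCompat are made-up names, and it assumes that fopen() is allowed to open URLs on the server in question (allow_url_fopen), which I can't verify for that setup.

```php
<?php
// Hypothetical stand-in for file_get_contents() on PHP older than 4.3.0.
// Returns the contents as a string, or false if the resource can't be opened.
function readAll($location)
{
    $handle = @fopen($location, "rb");
    if (!$handle) {
        return false;
    }
    $data = "";
    while (!feof($handle)) {
        $data .= fread($handle, 8192);
    }
    fclose($handle);
    return $data;
}

// Same null-byte heuristic as TasteTest(), with an extra "Unknown" result
// for the case where nothing could be read at all.
function tasteTestCompat($location)
{
    $data = readAll($location);
    if ($data === false) {
        return "Unknown";
    }
    return strpos($data, "\x00") !== false ? "Image" : "Text";
}
```

Calling tasteTestCompat($url1) would then take the place of TasteTest($url1). Note that this still downloads the whole body, unlike the HEAD request approach.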

Greetings,
Mattias

Jul 17 '05 #11
> [...]
> Indeed, it works, great!!! But it seems that the Content-Type can
> *sometimes* be "text/plain;charset=UTF-8", but then I only need
> "text/plain", I tried "/Content-Type: ([^\r\n]+)[\r|\n];/", but
> apparently I don't understand much of those regexps :-s.

Try "/Content-Type: ([^\r\n;]+)(;[^\r\n]*)?[\r|\n];/"
or "/Content-Type: ([^\r\n\;]+)(;[^\r\n]*)?[\r|\n];/"
I think one of those should work, though they're untested. I can't remember if
you have to escape the ";" or not...
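
One likely reason these patterns don't match: they still end in ";" after the line-break class, so they demand a literal semicolon right after the line break. The ";" itself has no special meaning in a regex and needs no escaping. A sketch that simply stops the capture at the first ";" instead (mediaType is a made-up name, and the header lines are made-up samples):

```php
<?php
// Hypothetical helper: returns only the media type, without parameters
// such as ";charset=UTF-8", or null if there is no Content-Type header.
function mediaType($headers)
{
    // Capture everything after the colon up to the first ";" or line break.
    if (preg_match('/^Content-Type:[ \t]*([^;\r\n]+)/mi', $headers, $matches)) {
        return strtolower(trim($matches[1]));
    }
    return null;
}

echo mediaType("Content-Type: text/plain;charset=UTF-8\r\n"), "\n"; // text/plain
echo mediaType("Content-Type: image/png\r\n"), "\n";                // image/png
```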

Regards,
Shawn

--
Shawn Wilson
sh***@glassgiant.com
http://www.glassgiant.com
Jul 17 '05 #12
Shawn Wilson wrote:
> [...]
> Try "/Content-Type: ([^\r\n;]+)(;[^\r\n]*)?[\r|\n];/"
> or "/Content-Type: ([^\r\n\;]+)(;[^\r\n]*)?[\r|\n];/"
> I think one of those should work, though they're untested. I can't remember if
> you have to escape the ";" or not...

It doesn't work, but that's okay, it will only be "less good code" ;),
but it will work. Thanks a lot for the help you offered!!!

Greetings,
Mattias

Jul 17 '05 #13
Shawn Wilson wrote:
> I don't know if the headers are case- and whitespace- sensitive or not.


Parts of them might be. In the case of the Content-Type header, only
parameter attribute values may be case sensitive; all other
components, such as MIME media types and subtypes, and parameter
attribute names are case insensitive. Header field names are always
case insensitive.

But I'm afraid it's even more complicated than that. Whitespace, or
rather LWS (Linear White Space: an optional CRLF followed by one or
more spaces or tabs), can appear in certain places and must
not appear in others. The ABNF of RFC2616, your handy authoritative
source, explains it all. It's spread out over numerous sections
though, so there'll be a lot of flicking back and forth.

Here's what I came up with to match Content-Type headers:

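// Building blocks following the RFC 2616 grammar. Single quotes keep the
// backslashes intact for PCRE; the pieces are joined by concatenation.
$LWS = '(?:(?:\r\n)?[\x20\x09]+)';
$CHAR = '[\x00-\x7f]';
$TEXT = '(?:[^\x00-\x1f\x7f]|' . $LWS . ')';
$TOKEN = '[^\x00-\x1f\x7f()<>@,;:\\\\/"\[\]?={}\x20\x09]+';
$QDTEXT = '(?:[^\x00-\x1f\x7f"]|' . $LWS . ')';
$QUOTEDPAIR = '\\\\' . $CHAR;
$QUOTEDSTRING = '(?:"(?:' . $QDTEXT . '|' . $QUOTEDPAIR . ')*")';

// m: ^ matches at each line start; i: case insensitive;
// x: whitespace in the pattern itself is ignored.
preg_match_all(
    "`^content-type: $LWS* $TOKEN/$TOKEN $LWS*
    (?:;$LWS*$TOKEN=(?:$TOKEN|$QUOTEDSTRING)$LWS*)*`mix",
    $string,
    $matches);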

This does strip some whitespace in some circumstances. Although the
captured string may not be identical to the actual header in terms
of whitespace, the semantics will be the same.

There are likely to be ghastly solecisms in the above as I haven't
thoroughly read through it, tested it, or taken any great time to
throw it together. But the reader should get the general idea. ;-)
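
If only the media type is needed, a much cruder approach than the full grammar may be enough. The sketch below is not an RFC 2616 parser: it just unfolds continuation lines, compares the field name case insensitively, and cuts the value at the first ";". The name contentTypeOf and the sample input are made up for illustration.

```php
<?php
// Hypothetical helper: extracts the lowercased media type from raw headers,
// or returns null when there is no Content-Type header.
function contentTypeOf($rawHeaders)
{
    // Unfold continuation lines: a line break followed by spaces or tabs
    // continues the previous header, so replace it with a single space.
    $unfolded = preg_replace('/\r?\n[ \t]+/', ' ', $rawHeaders);

    foreach (preg_split('/\r?\n/', $unfolded) as $line) {
        $parts = explode(':', $line, 2);
        // Field names are case insensitive, so compare in lower case.
        if (count($parts) == 2 && strtolower(trim($parts[0])) == 'content-type') {
            // Drop parameters such as charset; type and subtype are
            // case insensitive as well, so lowercase the result.
            $value = explode(';', $parts[1]);
            return strtolower(trim($value[0]));
        }
    }
    return null;
}

// Made-up sample with odd casing and a folded parameter line.
echo contentTypeOf("HTTP/1.0 200 OK\r\ncontent-TYPE: Text/Plain;\r\n charset=UTF-8\r\n\r\n"), "\n";
```

This prints text/plain for the sample. It deliberately ignores quoted strings and parameter values, which is where the case-sensitive parts live.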

--
Jock
Jul 17 '05 #14
