Image and text: howto see the difference

Hi,
Depending on whether a certain URL returns an image or text, I want to do
something different. I don't know in advance which of the two I'll get.

This is a URL that returns an image:
http://indicator.amessage.info/indic...mp;param4=.png

This is one that returns text:
http://indicator.amessage.info/indic...mp;param4=.png
How could I see the difference between the 2 with PHP code?
Hoping that sb. can get me out of this,
greetings,
Mattias

Jul 17 '05 #1
Mattias Campe wrote:

[...]
How could I see the difference between the 2 with PHP code?


Use fsockopen and a regex to search for Content-Type in the header.

Regards,
Shawn
--
Shawn Wilson
sh***@glassgiant.com
http://www.glassgiant.com
Jul 17 '05 #2
The links look the same; the only difference is that there are two "o"s in "coobnet" in the
second one. If that is intentional, you can do if ($_GET["param1"] ==
"coobnet%40jabber.org") {...} else {}

If it was just a typo, then use the Content-Type in the header.

Savut

Jul 17 '05 #3
Shawn Wilson wrote:
[...]
Use fsockopen and a regex to search for Content-Type in the header.


thx a lot! After figuring out how fsockopen worked, I managed to make it
work like this:

$fp = fsockopen("indicator.amessage.info", 80, $errno, $errstr);
if (!$fp) {
    echo "$errstr ($errno)<br>\n";
} else {
    // The request line has to be one single line: HEAD <path> HTTP/1.0
    fputs($fp, "HEAD /indicator.php?param1=cobnet%40jabber.org&amp;param2=bounce"
        . "&amp;param3=http%3A%2F%2Fstudent.ugent.be%2Fastrid%2Fpics%2Fjabber%2F&amp;param4=.png"
        . " HTTP/1.0\r\nHost: indicator.amessage.info\r\n\r\n");
    $string = "";
    while (!feof($fp)) {
        $string = $string . fgets($fp, 128);
    }
    echo $string;
    // strpos() returns false when the text is absent, so compare with !== false
    if (strpos($string, "Content-Type: image/png") !== false)
        echo "We have an image";
    else
        echo "We don't have an image";
    fclose($fp);
}
Do you think it looks good? I don't want to do "an attack" on that
server by a stupid mistake :-)

Greetings,
Mattias Campe

Jul 17 '05 #4
Savut wrote:
The links look the same; the only difference is that there are two "o"s in "coobnet" in the
second one. If that is intentional, you can do if ($_GET["param1"] ==
"coobnet%40jabber.org") {...} else {}


Well, it could be that coobnet%40jabber.org is also correct :-). Maybe I
wasn't too clear, but I can't tell from param1 alone whether
there will be an image or not.

Still, thx for the remark!

[...]

Jul 17 '05 #5
Mattias Campe wrote:

[...]
Do you think it looks good? I don't want to do "an attack" on that
server by a stupid mistake :-)


Just a few observations:

There is nothing in that code that would constitute an attack in itself. If,
however, you put that code in a poorly thought-out loop, you could inadvertently
"attack" a site. But I don't know what you're doing with the code, so this may
be moot.
With the code shown, any page with the text "Content-Type: image/png" in it will
claim it's an image. This could be a problem if you're building a bot to crawl
the web. Websites with code examples like php.net, Google groups, devshed, etc.
would occasionally have that string in the text. Again, I don't know the
intended use, but if it's similar to that just described, you might want to
adjust your code to something like the following (The regular expression may or
may not work, I haven't tested it). It should display only the first instance
of the Content-Type text. It also has the advantage that it doesn't go through
the entire file. Just enough to determine the type.

while (!feof($fp)) {
    $string .= fgets ($fp,128);
    if (preg_match("/^Content-Type: ([^\r\n]+)[\r|\n]/", $string, $matches)) {
        echo "File is type: " . $matches[1];
        fclose ($fp);
        exit();
    }
}
fclose ($fp);

I don't know if the headers are case- and whitespace- sensitive or not. In
other words, if it's possible to have

"content-type: image/png"

you'll have to adjust the regex.
And it's a little thing, but you can use "$variable .= $somestring" instead of
"$variable = $variable . $somestring"
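One way to rule out the false-positive case described above (body text that happens to contain "Content-Type: image/png") is to look only at the header block, which in an HTTP response ends at the first empty line. A minimal sketch, where header_block() is a hypothetical helper name and the sample response is made up:

```php
<?php
// Hypothetical helper: return only the header part of a raw HTTP response.
// Headers and body are separated by the first empty line (CRLF CRLF).
function header_block($response) {
    $end = strpos($response, "\r\n\r\n");
    if ($end === false) {
        // No blank line seen yet: treat everything read so far as headers.
        return $response;
    }
    return substr($response, 0, $end);
}

// Made-up response whose *body* mentions an image type, but whose header says HTML.
$response = "HTTP/1.0 200 OK\r\n"
    . "Content-Type: text/html\r\n"
    . "\r\n"
    . "<p>Example: Content-Type: image/png</p>";

$headers = header_block($response);
echo strpos($headers, "Content-Type: image/png") !== false ? "image" : "not an image";
```

With a HEAD request the server should send no body at all, so this mainly matters if the request is ever switched to GET.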

Regards,
Shawn

--
Shawn Wilson
sh***@glassgiant.com
http://www.glassgiant.com
Jul 17 '05 #6
Shawn Wilson wrote:
> [...]
> Just a few observations:
>
> There is nothing in that code that would constitute an attack in itself. If,
> however, you put that code in a poorly thought-out loop, you could inadvertently
> "attack" a site. But I don't know what you're doing with the code, so this may
> be moot.

It will come in a loop, but I'll take care that it's not an infinite one
;) ...

> With the code shown, any page with the text "Content-Type: image/png" in it will
> claim it's an image.

That should be "any *header* of a page" ;p, because I'm not asking for
a GET but for HEAD...

Still I like the principle of your code more :D, only one problem: the
regexp doesn't work (it never matches) and I can't figure out why :-s.

Another question: is the exit() required?

> [...]
> while (!feof($fp)) {
>     $string .= fgets ($fp,128);
>     if (preg_match("/^Content-Type: ([^\r\n]+)[\r|\n]/", $string, $matches)) {
>         echo "File is type: " . $matches[1];
>         fclose ($fp);
>         exit();
>     }
> }
> fclose ($fp);
>
> I don't know if the headers are case- and whitespace- sensitive or not. In
> other words, if it's possible to have
>
> "content-type: image/png"
>
> you'll have to adjust the regex.

Normally it's like "Content-Type: text/html"

> And it's a little thing, but you can use "$variable .= $somestring" instead of
> "$variable = $variable . $somestring"

thx for the hint!

> Regards,
> Shawn


Greetings,
Mattias

Jul 17 '05 #7
>> [...]
>> With the code shown, any page with the text "Content-Type: image/png" in it will
>> claim it's an image.
>
> That should be "any *header* of a page" ;p, because I'm not asking for
> a GET but for HEAD...

Whoops, didn't notice that. My mistake.

> Still I like the principle of your code more :D, only one problem: the
> regexp doesn't work (it never matches) and I can't figure out why :-s.

Yeah, I wasn't really confident in it. Looking at it again, I put the "^" in
there, which means it would only match if the first header was Content-Type.
Take that out and it may work (too busy to test).

> Another question: is the exit() required?

You can take that or leave it. I had it in there so the entire page wouldn't be
opened, while I was thinking you were using GET. You might want to break the
while loop, though, to save a bit of time.

>> [...]
>> while (!feof($fp)) {
>>     $string .= fgets ($fp,128);
>>     if (preg_match("/^Content-Type: ([^\r\n]+)[\r|\n]/", $string, $matches)) {
>>         echo "File is type: " . $matches[1];
>>         fclose ($fp);
>>         exit();
>>     }
>> }
>> fclose ($fp);
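
A note on why the "^" version never matches: without the m (multiline) modifier, ^ anchors only at the very start of the string, and a response always starts with the status line. Dropping the ^ is one way out; keeping it and adding m, so that ^ matches at the start of every line, is another. A small sketch on a made-up sample header:

```php
<?php
// Made-up sample of what a HEAD request might return.
$string = "HTTP/1.1 200 OK\r\nServer: Apache\r\nContent-Type: image/png\r\n\r\n";

// Without "m", ^ matches only at the start of the whole string ("HTTP/1.1 ...").
$plain = preg_match("/^Content-Type: ([^\r\n]+)/", $string, $m1);

// With "m", ^ also matches right after each line break.
$multi = preg_match("/^Content-Type: ([^\r\n]+)/m", $string, $m2);

echo "without m: $plain\n";
echo "with m: $multi, type = {$m2[1]}\n";
```

Keeping the anchor has the side benefit that a header such as "X-Original-Content-Type:" would not be picked up by accident.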


Regards,
Shawn
--
Shawn Wilson
sh***@glassgiant.com
http://www.glassgiant.com
Jul 17 '05 #8
Shawn Wilson wrote:
>>> [...]
>>> With the code shown, any page with the text "Content-Type: image/png" in it will
>>> claim it's an image.
>>
>> That should be "any *header* of a page" ;p, because I'm not asking for
>> a GET but for HEAD...
>
> Whoops, didn't notice that. My mistake.

np ;-)

>> Still I like the principle of your code more :D, only one problem: the
>> regexp doesn't work (it never matches) and I can't figure out why :-s.
>
> Yeah, I wasn't really confident in it. Looking at it again, I put the "^" in
> there, which means it would only match if the first header was Content-Type.
> Take that out and it may work (too busy to test).

Indeed, it works, great!!! But it seems that the Content-Type can
*sometimes* be "text/plain;charset=UTF-8", but then I only need
"text/plain". I tried "/Content-Type: ([^\r\n]+)[\r|\n];/", but
apparently I don't understand much of those regexps :-s.

If you don't have the time to look for the ";", no problem, I'm already
very(, very) glad that you could help this far!

>> Another question: is the exit() required?
>
> You can take that or leave it. I had it in there so the entire page wouldn't be
> opened, while I was thinking you were using GET. You might want to break the
> while loop, though, to save a bit of time.

Okay, ic, you're right. But I changed it to "break;" because there is still
some code after the while. It still saves a bit of time :).

Greetings,
Mattias

Jul 17 '05 #9
Just grab the file and see what you have. Image files will inevitably have
chr(0) while text files (the exception being UTF-16-encoded ones) will never
have it. Hence the following:

$url1 = "http://indicator.amessage.info/indicator.php?param1=cobnet%40jabber.org"
    . "&amp;param2=bounce&amp;param3=http%3A%2F%2Fstudent.ugent.be%2Fastrid%2Fpics%2Fjabber%2F"
    . "&amp;param4=.png";
$url2 = "http://indicator.amessage.info/indicator.php?param1=coobnet%40jabber.org"
    . "&amp;param2=bounce&amp;param3=http%3A%2F%2Fstudent.ugent.be%2Fastrid%2Fpics%2Fjabber%2F"
    . "&amp;param4=.png";

function TasteTest($url) {
    $data = file_get_contents($url);
    // A NUL byte anywhere in the data suggests binary (image) content
    return strpos($data, "\x00") !== false ? "Image" : "Text";
}

echo TasteTest($url1); echo "<br>";
echo TasteTest($url2); echo "<br>";

Jul 17 '05 #10
Chung Leong wrote:
> Just grab the file and see what you have. Image files will inevitably have
> chr(0) while text files (the exception being UTF-16-encoded ones) will never
> have it. Hence the following: [...] $data = file_get_contents($url);


Damn: although this is the shortest solution, it appears that I need
PHP 4 >= 4.3.0 for file_get_contents and I have 4.1.2 :-s (which I can't
change, because I don't have the rights). Thx anyway!
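
On versions without file_get_contents(), the same idea can be sketched with fopen() and fread(), which are older. This assumes that allow_url_fopen is enabled so that fopen() accepts URLs; read_all() and taste_test() are hypothetical names, and the demo lines at the bottom use local temporary files (and newer functions, for brevity) instead of the real URLs:

```php
<?php
// Hypothetical stand-in for file_get_contents(): read a file or URL in chunks.
function read_all($location) {
    $fp = fopen($location, "rb");
    if (!$fp) {
        return false;
    }
    $data = "";
    while (!feof($fp)) {
        $data .= fread($fp, 4096);
    }
    fclose($fp);
    return $data;
}

// Same heuristic as in the previous post: a NUL byte suggests binary (image) data.
function taste_test($location) {
    $data = read_all($location);
    if ($data === false) {
        return "Unreadable";
    }
    return strpos($data, "\x00") !== false ? "Image" : "Text";
}

// Demo only: one file starting like a PNG (signature plus IHDR length), one plain text file.
$png = tempnam(sys_get_temp_dir(), "img");
$txt = tempnam(sys_get_temp_dir(), "txt");
file_put_contents($png, "\x89PNG\r\n\x1a\n\x00\x00\x00\x0dIHDR");
file_put_contents($txt, "user is offline\n");

echo taste_test($png), "\n";
echo taste_test($txt), "\n";

unlink($png);
unlink($txt);
```

Unlike the HEAD approach, this downloads the whole response body, so it costs more bandwidth per check.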

Greetings,
Mattias

Jul 17 '05 #11
>>> Still I like the principle of your code more :D, only one problem: the
>>> regexp doesn't work (it never matches) and I can't figure out why :-s.
>>
>> Yeah, I wasn't really confident in it. Looking at it again, I put the "^" in
>> there, which means it would only match if the first header was Content-Type.
>> Take that out and it may work (too busy to test).
>
> Indeed, it works, great!!! But it seems that the Content-Type can
> *sometimes* be "text/plain;charset=UTF-8", but then I only need
> "text/plain". I tried "/Content-Type: ([^\r\n]+)[\r|\n];/", but
> apparently I don't understand much of those regexps :-s.
>
> If you don't have the time to look for the ";", no problem, I'm already
> very(, very) glad that you could help this far!


Try "/Content-Type: ([^\r\n;]+)(;[^\r\n]*)?[\r|\n];/"
or "/Content-Type: ([^\r\n\;]+)(;[^\r\n]*)?[\r|\n];/"
I think one of those should work, though they're untested. I can't remember if
you have to escape the ";" or not...

Regards,
Shawn

--
Shawn Wilson
sh***@glassgiant.com
http://www.glassgiant.com
Jul 17 '05 #12
Shawn Wilson wrote:
>> [...]
>> Indeed, it works, great!!! But it seems that the Content-Type can
>> *sometimes* be "text/plain;charset=UTF-8", but then I only need
>> "text/plain". I tried "/Content-Type: ([^\r\n]+)[\r|\n];/", but
>> apparently I don't understand much of those regexps :-s.
>>
>> If you don't have the time to look for the ";", no problem, I'm already
>> very(, very) glad that you could help this far!
>
> Try "/Content-Type: ([^\r\n;]+)(;[^\r\n]*)?[\r|\n];/"
> or "/Content-Type: ([^\r\n\;]+)(;[^\r\n]*)?[\r|\n];/"
> I think one of those should work, though they're untested. I can't remember if
> you have to escape the ";" or not...


It doesn't work, but that's okay, it will only be "less good code" ;),
but it will work. Thanks a lot for the help you offered!!!
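
For reference, the likely reason neither pattern matches is the trailing ";" after "[\r|\n]": it demands a literal semicolon right after the line break, which a header line never has. The semicolon itself needs no escaping in a PCRE pattern. A sketch that drops the trailing ";" and captures only the part before any parameters (media_type() is a hypothetical helper name and the sample headers are made up):

```php
<?php
// Hypothetical helper: return the media type from raw response headers,
// without parameters such as ";charset=UTF-8", or null if none is found.
function media_type($headers) {
    // m: ^ matches at the start of each line; i: ignore case.
    // [^\r\n;]+ stops at the first ";" or at the end of the line.
    if (preg_match('/^Content-Type:[ \t]*([^\r\n;]+)/mi', $headers, $matches)) {
        return trim($matches[1]);
    }
    return null;
}

echo media_type("HTTP/1.1 200 OK\r\nContent-Type: text/plain;charset=UTF-8\r\n\r\n"), "\n";
echo media_type("HTTP/1.1 200 OK\r\nContent-Type: image/png\r\n\r\n"), "\n";
```

The returned value can then be compared directly, for example against "image/png" or "text/plain".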

Greetings,
Mattias

Jul 17 '05 #13
Shawn Wilson wrote:
> I don't know if the headers are case- and whitespace- sensitive or not.


Parts of them might be. In the case of the Content-Type header, only
parameter attribute values may be case sensitive; all other
components, such as MIME media types and subtypes, and parameter
attribute names are case insensitive. Header field names are always
case insensitive.
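
In practice this means a comparison should ignore case for both the field name and the media type. A small sketch (is_png() is a hypothetical helper and the sample header lines are made up):

```php
<?php
// Hypothetical helper: does a single header line declare a PNG image?
// The field name and the type/subtype are compared case-insensitively.
function is_png($headerLine) {
    $parts = explode(":", $headerLine, 2);
    if (count($parts) < 2 || strtolower(trim($parts[0])) !== "content-type") {
        return false;
    }
    // Keep only the type/subtype, dropping any ";parameter=value" part.
    $pieces = explode(";", $parts[1]);
    return strtolower(trim($pieces[0])) === "image/png";
}

var_dump(is_png("Content-Type: image/png"));
var_dump(is_png("content-type: IMAGE/PNG"));
var_dump(is_png("Content-Type: text/plain;charset=UTF-8"));
```

Parameter values are deliberately left untouched here, since those are the one part that may be case sensitive.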

But I'm afraid it's even more complicated than that. Whitespace, or
rather LWS (Linear White Space -- a CRLF sequence followed by either
one or more spaces or tabs), can appear in certain places and must
not appear in others. The ABNF of RFC2616, your handy authoritative
source, explains it all. It's spread out over numerous sections
though, so there'll be a lot of flicking back and forth.
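
One way to sidestep most of the LWS cases is to "unfold" the headers before matching, i.e. replace each CRLF that is followed by spaces or tabs with a single space, which RFC 2616 allows a recipient to do without changing the meaning. A sketch with a hypothetical unfold_headers() helper and a made-up folded header:

```php
<?php
// Hypothetical helper: join folded (continued) header lines into single lines.
function unfold_headers($headers) {
    // A line break followed by at least one space or tab continues the previous line.
    return preg_replace('/\r\n[ \t]+/', ' ', $headers);
}

// Made-up example: the charset parameter sits on a continuation line.
$folded = "Content-Type: text/plain;\r\n\tcharset=UTF-8\r\nServer: Apache\r\n";
echo unfold_headers($folded);
```

After unfolding, each header occupies exactly one line, so a simpler line-based pattern is enough for the common cases.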

Here's what I came up with to match Content-Type headers:

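// Building blocks from the RFC 2616 grammar, written as PCRE fragments.
$LWS = '(?:(?:\r\n)?(?:\x20|\x09)+)';
$CHAR = '[\x00-\x7f]';
$TEXT = "(?:[^\\x00-\\x1f\\x7f]|$LWS)";
$TOKEN = '[^\x00-\x1f\x7f()<>@,;:\\\\/"\[\]?={}\x20\x09]+';
// Double quotes from here on, so that $LWS and $CHAR are actually interpolated.
$QDTEXT = "(?:[^\\x00-\\x1f\\x7f\"]|$LWS)";
$QUOTEDPAIR = "\\\\$CHAR";
// Try the quoted pair first so that an escaped quote does not end the string early.
$QUOTEDSTRING = "(?:\"(?:$QUOTEDPAIR|$QDTEXT)*\")";

// Modifiers: m = ^ matches at each line start, i = ignore case,
// x = whitespace inside the pattern is ignored.
preg_match_all(
    "`^content-type: $LWS* $TOKEN/$TOKEN $LWS*
    (?:;$LWS*$TOKEN=(?:$TOKEN|$QUOTEDSTRING)$LWS*)*`mix",
    $string,
    $matches);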

This does strip some whitespace in some circumstances. Although the
captured string may not be identical to the actual header in terms
of whitespace, the semantics will be the same.

There are likely to be ghastly solecisms in the above as I haven't
thoroughly read through it, tested it, or taken any great time to
throw it together. But the reader should get the general idea. ;-)

--
Jock
Jul 17 '05 #14
