Extract data from XHTML tag

Hello,
I am trying to get the content that follows the tag name.

www@mike:/home/www > /usr/local/bin/php -a
Interactive mode enabled

<?php
if (preg_match('<(?:body|BODY)("[^"]*"|\'[^\']*\'|[^">\'])*>',
'<body bgcolor="#ffffff">',
$Match))
{
print($Match[1]);
}
else
{
print('No match!');
}
?>
X-Powered-By: PHP/4.2.3
Content-type: text/html

<br />
<b>Warning</b>: Unknown modifier ''' in <b>-</b> on line <b>11</b><br />
No match!
Why does this pattern not work?
The same pattern gives me a result in Tcl:

% info tclversion
8.3
% if [regexp {<(?:body|BODY)("[^"]*"|'[^']*'|[^">'])*>} {<body bgcolor="#ffffff">} X Match] \
{
puts "Match=$Match"
} \
else \
{
puts {No match!}
}
Match="#ffffff"
Have you got an idea?

Best regards,
Markus Elfring
Jul 17 '05 #1
Ma************@web.de (Markus Elfring) wrote in message news:<40**************************@posting.google. com>...
> Hello,
> I am trying to get the content that follows the tag name.


<snip>

XHTML can be treated as XML, so you can use the XML parser
functions <http://in.php.net/xml>
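
A minimal sketch of that suggestion, assuming a current PHP (7 or later) with the XML extension loaded. The sample document and the variable names $xhtml and $attributes are made up for illustration; the input must be well-formed XML or xml_parse() will report an error.

```php
<?php
// Sketch: read the attributes of <body> with PHP's expat-based XML parser.
// $xhtml and $attributes are illustrative names, not from the original post.
$xhtml = '<html><body bgcolor="#ffffff"><p>Hi</p></body></html>';

$attributes = [];
$parser = xml_parser_create();
// Case folding is on by default and would upper-case all names; turn it off.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, string $name, array $attrs) use (&$attributes): void {
        // Called for every opening tag; remember the attributes of <body>.
        if ($name === 'body') {
            $attributes = $attrs;
        }
    },
    function ($parser, string $name): void {
        // Nothing to do on closing tags.
    }
);

if (xml_parse($parser, $xhtml, true) !== 1) {
    echo 'Parse error: ', xml_error_string(xml_get_error_code($parser)), "\n";
} else {
    echo $attributes['bgcolor'] ?? 'No bgcolor attribute', "\n";
}
```

Unlike the regular expression, this gives the attribute value without the surrounding quotes.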

--
| Just another PHP saint |
Email: rrjanbiah-at-Y!com
Jul 17 '05 #2
> if (preg_match('<(?:body|BODY)("[^"]*"|\'[^\']*\'|[^">\'])*>',
> '<body bgcolor="#ffffff">',
> $Match))
> [...]
> <b>Warning</b>: Unknown modifier ''' in <b>-</b> on line <b>11</b>


I forgot to add the Perl-style pattern delimiters because I had tried
the function "eregi" first, which does not use them.
However, "eregi" does not support non-capturing parentheses.
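
For reference, a sketch of the call with delimiters added. The choice of "/" is an assumption; any delimiter that does not occur unescaped in the pattern would do.

```php
<?php
// Same pattern as in the question, now wrapped in "/" delimiters.
$pattern = '/<(?:body|BODY)("[^"]*"|\'[^\']*\'|[^">\'])*>/';
$subject = '<body bgcolor="#ffffff">';

if (preg_match($pattern, $subject, $Match)) {
    // Group 1 is repeated, so it holds the text of its last iteration,
    // which here is the quoted attribute value.
    print($Match[1]);   // prints "#ffffff" including the double quotes
} else {
    print('No match!');
}
```

This is the same result the Tcl version reports. The alternation (?:body|BODY) could also be replaced by a plain "body" plus the "i" modifier after the closing delimiter.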
Jul 17 '05 #3
> <b>Warning</b>: Unknown modifier ''' in <b>-</b> on line <b>11</b>

Is there a chance that this error message can be more precise?

http://perldoc.com/perl5.8.4/pod/perlrequick.html

I did not see the relation between the terms "modifier" and "delimiter".
What do you think about the message "Pattern delimiters are missing" instead?
Jul 17 '05 #4
Markus Elfring wrote:
> <b>Warning</b>: Unknown modifier ''' in <b>-</b> on line <b>11</b>
> Is there a chance that this error message can be more precise?


Yes, it can, but it probably won't.

Your expression was

<(?:body|BODY)("[^"]*"|\'[^\']*\'|[^">\'])*>

The '<' is treated as the opening delimiter and the first
'>' as the closing delimiter. The character after the
expression proper is a single quote (the \' in a single-quoted
PHP string yields a literal '), and that is not a known
pattern modifier.
> http://perldoc.com/perl5.8.4/pod/perlrequick.html

http://www.php.net/manual/en/ref.pcre.php

> I did not see the relation between the terms "modifier" and "delimiter".

Pattern modifiers affect matching. Delimiters enclose the
expression.

> What do you think about the message "Pattern delimiters are missing" instead?


True, if both delimiters are actually missing. They're not
though: '<' and '>' are the delimiters.
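
A small sketch of that behaviour, assuming a current PHP. The @ operator is only there to keep the warning out of the output; the short patterns in steps 2 and 3 are made up for illustration.

```php
<?php
// 1. The pattern from the question: '<' and the first '>' are taken as the
//    delimiters, and the character after that '>' is an unknown modifier.
$original = '<(?:body|BODY)("[^"]*"|\'[^\']*\'|[^">\'])*>';
$result = @preg_match($original, '<body bgcolor="#ffffff">');
var_dump($result);            // bool(false): the pattern was rejected

// 2. With <> as delimiters the expression proper is just "body",
//    so the angle brackets of the subject are not part of the match.
preg_match('<body>', '<body>', $m);
var_dump($m[0]);              // string(4) "body"

// 3. Anything after the closing delimiter is read as modifiers.
$withModifier = preg_match('<BODY>i', '<body>');
var_dump($withModifier);      // int(1): "i" makes the match case-insensitive
```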

--
Jock
Jul 17 '05 #5
> XHTML can be treated as XML, so you can use the XML parser
> functions <http://in.php.net/xml>


Another alternative:
http://de.php.net/manual/en/ref.dom.php

I prefer to build specific regular expressions for my needs instead of
using the other programming interfaces.
I would use the XML/DOM APIs if I needed more data or more sophisticated
processing.
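
For comparison, a rough sketch of the DOM route, assuming PHP 5 or later with the DOM extension and well-formed input. The sample document and the variable names are illustrative only.

```php
<?php
// Sketch: fetch the bgcolor attribute of <body> through the DOM extension.
$xhtml = '<html><body bgcolor="#ffffff"><p>Hi</p></body></html>';

$bgcolor = '';
$doc = new DOMDocument();
if ($doc->loadXML($xhtml)) {
    // item(0) is null when the document has no <body> element.
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        $bgcolor = $body->getAttribute('bgcolor');
    }
}
echo $bgcolor, "\n";
```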
Jul 17 '05 #6
> True, if both delimiters are actually missing. They're not
> though: '<' and '>' are the delimiters.


I have not found any documentation that specifies angle brackets as
valid (default) pattern delimiters.
http://perldoc.com/perl5.8.4/pod/perlretut.html
Jul 17 '05 #7
.oO(Markus Elfring)
>> True, if both delimiters are actually missing. They're not
>> though: '<' and '>' are the delimiters.
>
> I have not found any documentation that specifies angle brackets as
> valid (default) pattern delimiters.
> http://perldoc.com/perl5.8.4/pod/perlretut.html


XCVII. Regular Expression Functions (Perl-Compatible)
http://www.php.net/manual/en/ref.pcre.php

"... Since PHP 4.0.4, you can also use Perl-style (), {}, [], and <>
matching delimiters."
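
A quick way to see the four bracket styles from that quote in action, sketched for a current PHP. The test strings and the $results array are made up for illustration.

```php
<?php
// Each string uses a different bracket pair as delimiters around "abc".
$patterns = ['(abc)', '{abc}', '[abc]', '<abc>'];
$results = [];

foreach ($patterns as $pattern) {
    // "xabcx" contains the sequence "abc"; "cab" has the letters only.
    $hit  = preg_match($pattern, 'xabcx', $m);
    $miss = preg_match($pattern, 'cab');
    $groups = count($m) - 1;   // capture groups besides the full match
    $results[$pattern] = [$hit, $miss, $groups];
    printf("%s  hit=%d  miss=%d  groups=%d\n", $pattern, $hit, $miss, $groups);
}
```

Every line should report hit=1, miss=0 and groups=0: as delimiters, the parentheses do not capture and the square brackets do not form a character class.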

Micha
Jul 17 '05 #8
