Analyze and read in html file

Hi,

What I want is something similar to the SimpleXML extension of PHP, but for
HTML.

I have to analyze and read certain tags from an HTML file in a convenient
manner.
Is there a PHP extension/library which makes this possible?

Thx

Axel

Jun 6 '06 #1
I think xml extensions support parsing for html
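
For what it's worth, here is a minimal sketch of that idea. It assumes PHP 5.1 or later with the DOM and SimpleXML extensions enabled, and the sample markup in $html is made up for illustration. DOMDocument::loadHTML() parses the HTML, and simplexml_import_dom() then gives SimpleXML-style access to the result.

```php
<?php
// Made-up sample input; in practice this could come from file_get_contents().
$html = '<html><head><title>example page</title></head>'
      . '<body><p>First<p>Second <a href="a.html">link</a></body></html>';

// Collect libxml warnings about sloppy markup instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);          // HTML parser, tolerant of unclosed tags
libxml_clear_errors();

// Wrap the DOM tree so it can be used like a SimpleXML document.
$xml = simplexml_import_dom($doc);

// Read one element by path and collect an attribute from every <a> tag.
$title = (string) $xml->head->title;
$links = array();
foreach ($xml->xpath('//a') as $a) {
    $links[] = (string) $a['href'];
}

echo $title, "\n";
echo implode(', ', $links), "\n";
```

Whether this is comfortable enough depends on how messy the input is; the parser repairs the markup as best it can, so the resulting tree may not match the source exactly.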
--
http://www.mastervb.net
http://www.padbuilder.com

Jun 6 '06 #2
I thought that only XHTML is XML compatible, but not HTML.
So it would not be possible to read it via XML extension.

Can someone please comment on that thought?

Thx

Axel


"lorento" <la**********@yahoo.com> wrote in message
news:11*********************@g10g2000cwb.googlegroups.com...
I think xml extensions support parsing for html
--
http://www.mastervb.net
http://www.padbuilder.com

Jun 6 '06 #3
Radium wrote:
I thought that only XHTML is XML compatible, but not HTML.
So it would not be possible to read it via XML extension.

Can someone please comment on that thought?


These are rather sweeping descriptions - not actual language descriptions.
The short answer is that if your code isn't xml you need to get it fixed
soon.

C.
Jun 7 '06 #4
Radium:
what i want is something similar to the simple-xml extension of php, but for
html.


Be warned that there are two kinds of HTML: SGML-HTML, as specified
in HTML specs, and tag-soup-HTML, as digested by browsers.
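
A small sketch of why the distinction matters in practice, assuming PHP 5.1+ with the SimpleXML and DOM extensions. The tag-soup string is invented for illustration. A strict XML parser refuses it, while the HTML parser behind DOMDocument::loadHTML() recovers a usable tree.

```php
<?php
// Invented tag-soup input: unquoted attribute, unclosed <p> and <br>.
$soup = '<html><body><p class=intro>one<br>two<p>three</body></html>';

// Keep parser complaints out of the output.
libxml_use_internal_errors(true);

// Strict XML parsing: the input is not well-formed, so this returns false.
$asXml = simplexml_load_string($soup);
$xmlOk = ($asXml !== false);

// HTML parsing: the parser closes the open elements itself.
$doc = new DOMDocument();
$htmlOk = $doc->loadHTML($soup);
$paragraphs = $doc->getElementsByTagName('p')->length;

libxml_clear_errors();

var_dump($xmlOk, $htmlOk, $paragraphs);
```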

--
Jock

Jun 9 '06 #5
Radium (uh**@rz.uni-karlsruhe.de) wrote:
: Hi,

: what i want is something similar to the simple-xml extension of php, but for
: html.

: I have to analyze and read in certain tags from a html file in a comfortable
: manner.
: Is there a php extension/library which makes this possible?

In PHP, not that I know of, though I would like to be wrong.

If you know any Perl then use the excellent HTML::Parser. It handles just
about anything that a web site might throw at it. You could use the Perl
script to build a PHP script.

Assume text input something like

<html><head><title>example page</title> (etc)
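
So write a Perl script with handlers something like this (a rough sketch
using the version 3 HTML::Parser API; tmp_script.php is just a placeholder
file name):

use strict;
use warnings;
use HTML::Parser;

# Temporary PHP script that the main PHP script runs afterwards.
open(TMP_PHP_SCRIPT, '>', 'tmp_script.php') or die "cannot write tmp_script.php: $!";
print TMP_PHP_SCRIPT "<?php\n";

sub do_start_tag
{
my ($tag_name) = @_; # passed in by the parser ('tagname' argspec)
print TMP_PHP_SCRIPT "handle_tag('$tag_name');\n";
}

sub do_text
{
my ($raw_text) = @_; # passed in by the parser ('dtext' argspec)
# Escape only \ and ', the two characters special in a single-quoted PHP string.
(my $safe_text = $raw_text) =~ s/([\\'])/\\$1/g;
print TMP_PHP_SCRIPT "handle_text('$safe_text');\n";
}

sub do_end_tag
{
my ($tag_name) = @_; # passed in by the parser ('tagname' argspec)
print TMP_PHP_SCRIPT "handle_end_tag('$tag_name');\n";
}

# Register the handlers and parse the HTML file named on the command line.
my $parser = HTML::Parser->new(api_version => 3,
start_h => [\&do_start_tag, 'tagname'],
text_h => [\&do_text, 'dtext'],
end_h => [\&do_end_tag, 'tagname']);
$parser->parse_file($ARGV[0]) or die "cannot parse $ARGV[0]: $!";
close(TMP_PHP_SCRIPT);
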
From that you would get a temporary file with lines like

handle_tag('html');
handle_tag('head');
handle_tag('title');
handle_text('example page');
handle_end_tag('title');
handle_end_tag('head');
Your main php script would run the perl script, and then run the temporary
php script (example shown just above), and your php functions like
handle_tag etc would be called just as if you had been able to parse the
data directly from within php.
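
On the PHP side, the handlers could look roughly like the sketch below. Everything here is an assumption for illustration: the TitleCollector class, the choice to collect only text inside title elements, and the file names in the comments. The handler function names are the ones used in the generated lines above. To keep the example self-contained, the generated calls are written inline instead of being included from the temporary file.

```php
<?php
// Hypothetical container for parser state shared by the handlers.
class TitleCollector
{
    public static $stack  = array(); // tags that are currently open
    public static $titles = array(); // text seen directly inside <title>
}

function handle_tag($name)
{
    TitleCollector::$stack[] = $name;
}

function handle_text($text)
{
    // Keep the text only when the innermost open tag is <title>.
    if (end(TitleCollector::$stack) === 'title') {
        TitleCollector::$titles[] = $text;
    }
}

function handle_end_tag($name)
{
    // Pop only when the closing tag matches the innermost open tag.
    if (end(TitleCollector::$stack) === $name) {
        array_pop(TitleCollector::$stack);
    }
}

// In the real setup the main script would first run the Perl script and then
// include its output, for example (hypothetical file names):
//   exec('perl html2php.pl input.html');
//   include 'tmp_script.php';
// Here the generated calls are inlined instead.
handle_tag('html');
handle_tag('head');
handle_tag('title');
handle_text('example page');
handle_end_tag('title');
handle_end_tag('head');

print_r(TitleCollector::$titles);
```

Note that including a generated script means executing whatever the Perl side wrote, so the escaping of text in the generator has to be right before this is pointed at untrusted pages.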

$0.10

Jun 9 '06 #6
