Extracting Semantic Structure of HTML Document- Feature based

Hi,

I've read somewhere that feature-based analysis can be used to extract
the semantic structure of HTML documents. By semantic structure, they
mean the model of the rendered view a reader sees. Now, my question is,
what should such feature-based analysis involve? What exactly is a
feature-based analysis?

Please help.

Cheers,
Michael

Jul 23 '05 #1
3 Replies


da*****@hotmail.com wrote:
I've read somewhere that feature-based analysis can be used to extract
the semantic structure of HTML documents. By semantic structure, they
mean the model of the rendered view a reader sees. Now, my question is,
what should such feature-based analysis involve? What exactly is a
feature-based analysis?


You asked the same question five days ago and at least twice in
December. You didn't bother to respond to any of the replies you got
then. So why should anyone bother replying to you now? Please join the
discussion and explain in more detail what you are looking for.

Steve

--
"My theories appal you, my heresies outrage you,
I never answer letters and you don't like my tie." - The Doctor

Steve Pugh <st***@pugh.net> <http://steve.pugh.net/>
Jul 23 '05 #2

Hi,

Sorry for the late response. Well, I'm trying to extract a structure of
the view of what a reader sees, e.g. extract all headings and link them
to the corresponding paragraphs etc. In the end, the output graph should
be a hierarchy of sections, sub-sections etc. I know this can be quite
complex, because the HTML used today can be very messy, esp. with
tables. It was suggested that I use a "feature-based analysis" to extract
such information, but I'm not sure what exactly that means. What
should a feature-based analysis be, even in other contexts? Is it
really feasible to extract "features" of HTML documents?
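
To make the task concrete, here is a minimal sketch of the kind of output
described above: headings collected into a hierarchy, each with the
paragraphs that follow it. It uses only Python's standard html.parser. The
names OutlineParser and build_outline are hypothetical, and the rule that
a paragraph belongs to the most recent heading is an assumption; it is not
the "feature-based analysis" itself, and it ignores tables and other
layout markup entirely.

```python
from html.parser import HTMLParser

# Heading tags mapped to their nesting level.
HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


def _node(level, title):
    return {"level": level, "title": title, "paragraphs": [], "children": []}


class OutlineParser(HTMLParser):
    """Collects headings and paragraphs into a nested section tree."""

    def __init__(self):
        super().__init__()
        # The root sits at level 0, so every heading nests beneath it.
        self.root = _node(0, None)
        self.stack = [self.root]
        self.current_tag = None
        self.buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS or tag == "p":
            # Tolerate unclosed <p> elements: a new block ends the old one.
            self.close_block()
            self.current_tag = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current_tag is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.current_tag:
            self.close_block()

    def close_block(self):
        """Store the text gathered for the currently open heading or paragraph."""
        if self.current_tag is None:
            return
        tag, self.current_tag = self.current_tag, None
        text = " ".join("".join(self.buffer).split())
        if tag == "p":
            if text:
                # Assumption: a paragraph belongs to the most recent heading.
                self.stack[-1]["paragraphs"].append(text)
            return
        level = HEADINGS[tag]
        # Pop back to the nearest section that is shallower than this heading.
        while self.stack[-1]["level"] >= level:
            self.stack.pop()
        section = _node(level, text)
        self.stack[-1]["children"].append(section)
        self.stack.append(section)


def build_outline(html):
    """Return the root of the section tree for an HTML string."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    parser.close_block()
    return parser.root
```

Calling build_outline on a page's source returns nested dictionaries with
"title", "paragraphs" and "children" keys. Pages that fake their headings
with bold text or table cells would need extra cues beyond tag names.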

Any help will be much appreciated.

Cheers,
Michael

Steve Pugh wrote:
da*****@hotmail.com wrote:
I've read somewhere that feature-based analysis can be used to extract
the semantic structure of HTML documents. By semantic structure, they
mean the model of the rendered view a reader sees. Now, my question is,
what should such feature-based analysis involve? What exactly is a
feature-based analysis?

You asked the same question five days ago and at least twice in
December. You didn't bother to respond to any of the replies you got
then. So why should anyone bother replying to you now? Please join the
discussion and explain in more detail what you are looking for.

Steve

--
"My theories appal you, my heresies outrage you,
I never answer letters and you don't like my tie." - The Doctor

Steve Pugh <st***@pugh.net> <http://steve.pugh.net/>


Jul 23 '05 #3

da*****@hotmail.com wrote:
Hi,

I've read somewhere that feature-based analysis can be used to extract
the semantic structure of HTML documents. By semantic structure, they
mean the model of the rendered view a reader sees. Now, my question is,
what should such feature-based analysis involve? What exactly is a
feature-based analysis?

Please help.

Cheers,
Michael

If the author of the article you read, or the person who suggested
you use this method for analysis of mark-up text, knows what *he* means
by it, maybe he is the best source for a clearer explanation. Frankly,
it sounds idiosyncratic--maybe his own invention. Better to go to the
source, and find out.
Or...have you considered looking at any of the several document object
models (DOM)? A DOM is not HTML, but who knows. Maybe someone was a
little confused...
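
For what it's worth, one possible reading of "feature-based" is that each
element in the DOM tree gets described by a handful of measurable
properties, which a later step could use to guess its role. The sketch
below is only an illustration of that guess, not a definition of the
term. It uses Python's standard xml.dom.minidom, which needs well-formed
markup (XHTML-style), so messy real-world HTML would have to go through a
more tolerant parser first. The function names and the choice of features
(tag, depth, text length, number of child elements) are hypothetical.

```python
from xml.dom import minidom


def text_of(node):
    """Concatenate all text found beneath a DOM node."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(text_of(child) for child in node.childNodes)


def element_features(node, depth=0):
    """Yield one feature dictionary per element, in document order."""
    if node.nodeType != node.ELEMENT_NODE:
        return
    yield {
        "tag": node.tagName,
        "depth": depth,
        "text_length": len(text_of(node).strip()),
        "child_elements": sum(
            1 for child in node.childNodes
            if child.nodeType == child.ELEMENT_NODE
        ),
    }
    for child in node.childNodes:
        yield from element_features(child, depth + 1)


def features_from_xhtml(markup):
    """Parse well-formed markup and list the features of every element."""
    document = minidom.parseString(markup)
    return list(element_features(document.documentElement))
```

Each dictionary in the returned list describes one element. Whether these
are the features the original suggestion had in mind is something only
its author can confirm.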
- Jake Lloyd
Jul 23 '05 #4

This discussion thread is closed