HTML Import and Parse

I admit that I am a novice when it comes to the HTML DOM and
JavaScript. Basically, here is what I want to do:

1. Import some HTML from a remote web site
2. Parse the HTML to locate a specific tag based on font weight and
font size
3. Check the text in that tag to see if it matches a specified string

Does anyone have any ideas?
Jul 23 '05 #1
> I admit that I am a novice when it comes to the HTML DOM and
> JavaScript. Basically, here is what I want to do:
>
> 1. Import some HTML from a remote web site
> 2. Parse the HTML to locate a specific tag based on font weight and
> font size
> 3. Check the text in that tag to see if it matches a specified string
>
> Does anyone have any ideas?


You can't do that in JavaScript because of security restrictions.
You'll need to use some server-side technology for that.

- Use PHP to fetch and parse the page
- Use the proxy module in Apache to 'proxy' that page to your webserver
so you can read it
- Java/JSP/Servlets. Java has an API to parse HTML pages
- ASP ... I'm sure it can be done there too
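Whichever of these you pick, steps 2 and 3 of the question come down to the same thing once the server has the page as a string. Here is a minimal sketch of that part, written in JavaScript (run server-side) only to stay in the thread's language. The names findStyledText, parseStyle and hasMatchingText are made up for illustration, and the sketch assumes the target tag is styled through an inline style="..." attribute and contains no nested tags. It is a regex scan, not a real HTML parser, so treat it as a starting point rather than a robust solution. Fetching the remote page (step 1) is left to whatever HTTP client your server platform provides.

```javascript
// Hypothetical helper: parse "name: value; name: value" into a lookup object.
function parseStyle(styleText) {
  var declarations = {};
  styleText.split(";").forEach(function (part) {
    var colon = part.indexOf(":");
    if (colon === -1) return; // skip empty or malformed declarations
    var name = part.slice(0, colon).trim().toLowerCase();
    var value = part.slice(colon + 1).trim().toLowerCase();
    declarations[name] = value;
  });
  return declarations;
}

// Hypothetical helper: collect the text of elements whose inline style
// declares exactly the given font-weight and font-size.
// Assumption: styling is inline and the element has no nested tags.
function findStyledText(html, weight, size) {
  var results = [];
  // <tag ... style="...">text</tag>, where text contains no "<".
  var elementPattern =
    /<(\w+)\b[^>]*\bstyle\s*=\s*"([^"]*)"[^>]*>([^<]*)<\/\1\s*>/gi;
  var match;
  while ((match = elementPattern.exec(html)) !== null) {
    var style = parseStyle(match[2]);
    if (style["font-weight"] === weight && style["font-size"] === size) {
      results.push(match[3].trim());
    }
  }
  return results;
}

// Step 3: does any matching element contain exactly the expected string?
function hasMatchingText(html, weight, size, expected) {
  return findStyledText(html, weight, size).indexOf(expected) !== -1;
}
```

For a page that marks its headline as a bold 14px span, hasMatchingText(html, "bold", "14px", "Breaking News") answers the question in one call. Pages that style text through <font> and <b> tags or an external stylesheet would need a different matching rule.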

Good luck,
Vincent

Jul 23 '05 #2
Vincent van Beveren wrote:
>> I admit that I am a novice when it comes to the HTML DOM and
>> JavaScript. Basically, here is what I want to do:
>>
>> 1. Import some HTML from a remote web site
>> 2. Parse the HTML to locate a specific tag based on font weight and
>> font size
>> 3. Check the text in that tag to see if it matches a specified string
>>
>> Does anyone have any ideas?
>
> You can't do that in JavaScript because of security restrictions.


In IE and Mozilla, sure I can.

http://jibbering.com/2002/4/httprequest.html
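
For readers who have not used the request object that link refers to, here is a minimal sketch of a GET request with XMLHttpRequest. The helper name getText and its XhrCtor parameter are made up for illustration; the parameter exists only so the function can be exercised outside a browser. It assumes a browser that exposes XMLHttpRequest directly (IE of this era exposed the object through ActiveX instead) and a URL the browser's security settings allow the page to read.

```javascript
// Hypothetical helper: request a URL and hand the response text to onDone.
// XhrCtor is an assumption added for testability; it defaults to the
// browser's own XMLHttpRequest constructor.
function getText(url, onDone, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest;
  var xhr = new Ctor();
  xhr.open("GET", url, true); // true = asynchronous
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) return; // 4 = request finished
    if (xhr.status === 200) {
      onDone(null, xhr.responseText);
    } else {
      onDone(new Error("HTTP status " + xhr.status), null);
    }
  };
  xhr.send(null);
}
```

The callback receives the raw HTML as a string, so locating a tag in it still requires a separate parsing step.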
> You'll need to use some server-side technology for that.


Depends on the browser actually. And whether it has a JVM installed or
not. And, whether it has javascript enabled or not.
--
Randy
Chance Favors The Prepared Mind
comp.lang.javascript FAQ - http://jibbering.com/faq/
Jul 23 '05 #3
Randy Webb wrote:
> Vincent van Beveren wrote:
>>> I admit that I am a novice when it comes to the HTML DOM and
>>> JavaScript. Basically, here is what I want to do:
>>>
>>> 1. Import some HTML from a remote web site
>>> 2. Parse the HTML to locate a specific tag based on font weight and
>>> font size
>>> 3. Check the text in that tag to see if it matches a specified string
>>>
>>> Does anyone have any ideas?
>>
>> You can't do that in JavaScript because of security restrictions.
>
> In IE and Mozilla, sure I can.
>
> http://jibbering.com/2002/4/httprequest.html


HTTPRequest won't read from a site other than the one that served the page it
is included on, at least in the default security environment.

In other words, your site can't make a request to www.yahoo.com with the
HTTPRequest object in the default security environment.

>> You'll need to use some server-side technology for that.


> Depends on the browser actually. And whether it has a JVM installed or
> not. And, whether it has javascript enabled or not.

Applets running in the sandbox in the default security environment can't access
domains other than the one they are downloaded from. JavaScript running in the
default security environment can't make HTTPRequests to domains other than
the one it was downloaded from.

If you have a test site that demonstrates otherwise I'd be interested to see
how you are getting around these security limitations.
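
To make the restriction concrete, here is a minimal sketch of the comparison involved. The helper name isSameOrigin is made up for illustration, and it relies on the URL class found in current browsers and Node.js, which the browsers discussed in this thread did not have. It treats two addresses as the same origin only when scheme, host and port all match, which is how the same-origin rule is usually stated; the example.com addresses are placeholders.

```javascript
// Hypothetical helper: true when both URLs share scheme, host, and port.
// Uses the standard URL class, an assumption about the runtime.
function isSameOrigin(pageUrl, targetUrl) {
  return new URL(pageUrl).origin === new URL(targetUrl).origin;
}

// A page may read another document from the site it came from...
var sameSite = isSameOrigin(
  "http://example.com/index.html",
  "http://example.com/data/page.html"
);

// ...but not one served by a different host such as www.yahoo.com.
var otherSite = isSameOrigin(
  "http://example.com/index.html",
  "http://www.yahoo.com/"
);
```

Here sameSite is true and otherSite is false, which matches the www.yahoo.com example above.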

--
| Grant Wagner <gw*****@agricoreunited.com>

* Client-side Javascript and Netscape 4 DOM Reference available at:
*
http://devedge.netscape.com/library/...ce/frames.html

* Internet Explorer DOM Reference available at:
*
http://msdn.microsoft.com/workshop/a...ence_entry.asp

* Netscape 6/7 DOM Reference available at:
* http://www.mozilla.org/docs/dom/domref/
* Tips for upgrading JavaScript for Netscape 7 / Mozilla
* http://www.mozilla.org/docs/web-deve...upgrade_2.html
Jul 23 '05 #4
