Reading an HTML document & extracting content

Hi gang,

I'm an ASP developer by trade, but I've had to create client-side
scripts with JavaScript many times in the past. Simple things, like
validating form elements and such.

Now I've been assigned the task of extracting content from a given HTML
page. If anyone's familiar with the Yahoo! Store order confirmation
screen, I need to be able to grab the total amount from the table to
the right-hand side. (Sample File:
http://www.2beyourself.com/t/sample.html)

If you view the source, the total sits in a table, surrounded by ugly HTML.
The value I want to retrieve is wrapped in b tags. Originally I was
thinking of using innerHTML or innerText to extract the value, but
we have no control over this part of the Yahoo! Store, so we can't add
the script to that page to make it work!

So after talking with peers, we thought of reading in the entire HTML
page and using regular expressions to try and extract the value.
Something along the lines of: '<b>[0-9]+\.[0-9]{2}<\/b>'

I'm not sure how to accomplish this. Could someone please point me in
the right direction, if this solution is even a good one? If you have
something better, I'm all ears! (eyes) If the regular expression
is a good solution, I need to find out how to read in the entire
HTML doc and then parse out that piece.

Any tips and suggestions will be greatly appreciated!!

And I hope your week is starting off right. ^^

Jul 23 '05 #1
"Cognizance" <co**********@gmail.com> wrote in message
news:11*********************@f14g2000cwb.googlegroups.com...
[snip]

RegEx would be better but this works:

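<html>
<head>
<title>Total.htm</title>
<script type="text/javascript">
function total() {
  var sURL = "http://www.2beyourself.com/t/sample.html";
  try {
    // Use the native object where available; fall back to ActiveX on older IE
    var oXML = window.XMLHttpRequest
      ? new XMLHttpRequest()
      : new ActiveXObject("Microsoft.XMLHTTP");
    oXML.open("GET", sURL, false); // false = synchronous request
    oXML.send(null);
    var sXML = oXML.responseText;
    // Find Total's label
    var iTAG = sXML.indexOf("<b>Total:</b>");
    if (iTAG < 0) throw new Error("Total label not found");
    var sVAL = sXML.substr(iTAG);
    // Find Total's decimal point and keep the two digits after it
    var iDOT = sVAL.indexOf(".");
    if (iDOT < 0) throw new Error("Total value not found");
    sVAL = sVAL.substr(0, iDOT + 3);
    // Find Total's start: just after the last ">" before the number
    iTAG = sVAL.lastIndexOf(">");
    sVAL = sVAL.substr(iTAG + 1);
    // Show Total's value
    alert(sVAL);
  } catch (e) {
    // Covers a failed request as well as a page without the expected markup
    alert("Could not read the total from " + sURL + ": " + e.message);
  }
}
</script>
</head>
<body onload="total()">
</body>
</html>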
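
For comparison, here is a rough sketch of the regular expression route, applied to the page text once it has been read in. The function name extractTotal and the sample markup are made up for illustration, and it assumes the amount is the first number with two decimal places after the "<b>Total:</b>" label. The real page may be laid out differently, so treat this as a starting point rather than a finished solution.

```javascript
// Hypothetical helper: pull the amount that follows a "<b>Total:</b>" label.
// Assumption: the amount is the first number with two decimals after the label.
function extractTotal(html) {
  var label = "<b>Total:</b>";
  var labelAt = html.indexOf(label);
  if (labelAt < 0) {
    return null; // label not present in this document
  }
  // Search only the text after the label so earlier amounts are ignored.
  var rest = html.substr(labelAt + label.length);
  // Digits (with optional thousands separators), a dot, then two digits.
  var match = /([0-9][0-9,]*\.[0-9]{2})/.exec(rest);
  return match ? match[1] : null;
}

// Made-up markup for illustration only; the real order page may differ.
var sample = '<tr><td><b>Subtotal:</b></td><td>10.00</td></tr>' +
             '<tr><td><b>Total:</b></td><td><b>$1,234.50</b></td></tr>';
var found = extractTotal(sample);
```

Anchoring on the label first avoids matching other bold amounts on the page, which a bare pattern for a number inside b tags could pick up by mistake.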

Jul 23 '05 #2
