Reading an HTML document & extracting content

Hi gang,

I'm an ASP developer by trade, but I've had to create client side
scripts with JavaScript many times in the past. Simple things, like
validating form elements and such.

Now I've been assigned the task of extracting content from a given HTML
page. If anyone's familiar with the Yahoo! Store order confirmation
screen, I need to be able to grab the total amount from the table to
the right-hand side. (Sample File:
http://www.2beyourself.com/t/sample.html)

If you view the source, this is in a table and enclosed in ugly HTML.
The value I want to retrieve is wrapped in <b> tags. Originally I was
thinking of using innerHTML or innerText for extracting the value. But
I find that we cannot gain control of this piece of the Yahoo! Store to
make it work!

So after talking with peers, we thought of reading in the entire HTML
page and using regular expressions to try and extract the value.
Something along the lines of: '<b>[0-9]+\.[0-9]{2}<\/b>'

I'm not sure how to accomplish this. Could someone please point me in
the right direction, if this solution is even a good one? If you have
something better, I'm all ears! (eyes) If using the regular expression
would be a good solution, I need to find out how to read in the entire
HTML doc, and then parse out that piece.

Any tips and suggestions will be greatly appreciated!!

And I hope your week is starting off right. ^^

Jul 23 '05 #1
"Cognizance" <co**********@gmail.com> wrote in message
news:11*********************@f14g2000cwb.googlegroups.com...
[snip]

RegEx would be better but this works:

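<html>
<head>
<title>Total.htm</title>
<script type="text/javascript">
// Fetch the sample page and show the order total that follows the
// "Total:" label. Browsers normally block cross-origin requests, so this
// only works when this page is served from the same origin as sURL or
// the server explicitly permits it.
function total() {
  var sURL = "http://www.2beyourself.com/t/sample.html";
  // Prefer the standard XMLHttpRequest; fall back to ActiveX on old IE.
  var oXML = window.XMLHttpRequest
    ? new XMLHttpRequest()
    : new ActiveXObject("Microsoft.XMLHTTP");
  oXML.onreadystatechange = function () {
    if (oXML.readyState !== 4) return; // wait for the complete response
    if (oXML.status !== 200) {
      alert(sURL + " not found!");
      return;
    }
    var sXML = oXML.responseText;
    // Find Total's label
    var iTAG = sXML.indexOf("<b>Total:</b>");
    var sVAL = iTAG === -1 ? "" : sXML.substr(iTAG);
    // Find Total's decimal point
    var iDOT = sVAL.indexOf(".");
    if (iDOT === -1) {
      alert("Total not found in " + sURL);
      return;
    }
    // Keep everything up to two digits after the decimal point
    sVAL = sVAL.substr(0, iDOT + 3);
    // Find Total's start (just after the last ">" before the number)
    iTAG = sVAL.lastIndexOf(">");
    sVAL = sVAL.substr(iTAG + 1);
    // Show Total's value
    alert(sVAL);
  };
  try {
    oXML.open("GET", sURL, true);
    oXML.send();
  } catch (e) {
    alert(sURL + " not found!");
  }
}
</script>
</head>
<body onload="total()">
</body>
</html>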
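
For comparison, the regular-expression route could look roughly like the sketch below. The function name extractTotal and the sample markup are made up for illustration. It assumes the amount is the first number with two decimals that follows the <b>Total:</b> label, which is the same assumption the substring code above makes. It has not been checked against the real Yahoo! Store page.

```javascript
// Hypothetical helper: pull the order total out of a page's HTML source.
// Assumption: the amount is the first "digits.dd" value that appears
// right after some ">" following the "<b>Total:</b>" label.
function extractTotal(html) {
  // [\s\S]*? lazily skips the markup between the label and the amount;
  // \$? tolerates an optional currency sign; commas allow "1,234.50".
  var re = /<b>Total:<\/b>[\s\S]*?>\s*\$?([0-9][0-9,]*\.[0-9]{2})/i;
  var match = re.exec(html);
  return match ? match[1] : null; // null when no total is found
}

// Made-up sample markup, only to show the call.
var sample =
  '<tr><td><b>Total:</b></td>' +
  '<td align="right"><b>12.34</b></td></tr>';
console.log(extractTotal(sample)); // 12.34
```

In the page above, the string passed in would be the responseText of the request. Anchoring the pattern on the label keeps it from picking up other bold prices on the page.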

Jul 23 '05 #2
