process XMLHTTP response returning poorly formed html

Using xmlhttp I am accessing a document from the web that is not xml
and is in fact not even proper html even though it is supposed to be
(unbalanced tags). Here is the type of code I am using:

var url = "http://www.domain.com/page.html";
var xmlhttp = new ActiveXObject("Msxml2.XMLHTTP");
xmlhttp.open("GET", url, false);
xmlhttp.send();
var xmlResp = xmlhttp.responseXML;

I want to create an array that holds the contents of every paragraph
<P> tag. The paragraphs are well formed with both opening and closing
tags <P> and </P> but the document as a whole is not valid xml or html.
How can I process the so-called xml response to extract the contents of
each paragraph into a unique element of the array? I would like to use
DOM methods to extract the paragraphs rather than parse up the text
with xmlhttp.responseText. Any help would be appreciated.

Jul 23 '05 #1
It will be easy to parse by regular expression.

Jul 23 '05 #2

strout wrote:
> It will be easy to parse by regular expression.


How? The only thing I know about the document is that the information I
need is in between successive <P> and </P> tags. I was reluctant to
use a regexp because I don't have any more structure than what is
described and I don't control the format of the page source.

Jul 23 '05 #3
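
For reference, here is a minimal sketch of the regular-expression approach, working on xmlhttp.responseText. It assumes only what the question states: every paragraph has an explicit opening and closing tag and paragraphs are not nested. The function name extractParagraphs is made up for illustration, and the pattern will misfire if an attribute value inside a <P> tag contains a ">" character.

```javascript
// Collect the inner content of every <P>...</P> pair in a raw HTML string.
// Assumption: paragraphs are not nested and always have a closing tag.
function extractParagraphs(html) {
  // <p\b avoids matching <pre> or <param>; [\s\S]*? is a lazy match
  // that also spans line breaks; the i flag accepts <P> and <p>.
  var re = /<p\b[^>]*>([\s\S]*?)<\/p\s*>/gi;
  var paragraphs = [];
  var match;
  while ((match = re.exec(html)) !== null) {
    paragraphs.push(match[1]); // captured text between the tags
  }
  return paragraphs;
}

// Example: the markup around the paragraphs is deliberately unbalanced.
var sample = '<div><P>first</P><b><P class="x">second <i>one</i></P></td>';
var paragraphs = extractParagraphs(sample);
```

Because the pattern never looks at anything outside the <P> pairs, the unbalanced tags elsewhere in the page do not matter. Any markup inside a paragraph is returned as-is and would need separate stripping.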
Why not automate IE to load the page and then grab your content once
IE has done its job? Presumably that will "fix" any irregularities
in the source.

It depends on where you want to do this, but it is an option. Or just
use responseText and MSHTML?

Tim.

Jul 23 '05 #4
On 25 Feb 2005 11:08:55 -0800, da********@yahoo.com wrote:
> Using xmlhttp I am accessing a document from the web that is not xml
> and is in fact not even proper html even though it is supposed to be
> (unbalanced tags).

There's nothing inherent in unbalanced tags that makes a document
invalid HTML: HTML allows many closing tags to be omitted.

> How can I process the so-called xml response to extract the contents of
> each paragraph into a unique element of the array? I would like to use
> DOM methods to extract the paragraphs rather than parse up the text
> with xmlhttp.responseText. Any help would be appreciated.

Using the browser to parse the responseText as HTML will give you a
DOM of it; there's no other reasonable solution.

Jim.
Jul 23 '05 #5
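
One way to read the suggestion above, when the script runs inside a browser page: hand responseText to a detached element and let the browser's tolerant HTML parser build the tree. This is only a sketch. The name paragraphsFromHtml and the optional doc parameter are assumptions made for illustration, and assigning untrusted markup to innerHTML can still trigger resource loads and inline event handlers.

```javascript
// Let the browser's HTML parser repair the markup, then read the
// paragraphs through ordinary DOM methods.
// `doc` is optional; in a browser page the global `document` is used.
function paragraphsFromHtml(html, doc) {
  doc = doc || document;
  var container = doc.createElement("div"); // scratch node, never attached to the page
  container.innerHTML = html;               // tag-soup parsing happens here
  var nodes = container.getElementsByTagName("p");
  var paragraphs = [];
  for (var i = 0; i < nodes.length; i++) {
    paragraphs.push(nodes[i].innerHTML);    // contents of each <P>
  }
  return paragraphs;
}

// Typical call, given the xmlhttp object from the original post:
//   var paragraphs = paragraphsFromHtml(xmlhttp.responseText);
```

The parser decides how unbalanced tags get closed, so paragraph boundaries may differ from what a purely textual scan of the source would suggest.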
> using the browser to parse the responseText as html will give you a
> DOM of it, there's no other reasonable solution.

You could also do it with just an IHTMLDocument2 implementation; you
don't really need the browser.
Jul 23 '05 #6
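
A rough sketch of that browser-less route under Windows Script Host, assuming the "htmlfile" ProgID is used to obtain an MSHTML document object. The helper name paragraphsViaMshtml and its createDocument parameter are hypothetical, added so the factory can be swapped out. Treat the page as untrusted: script in the written markup may be executed by MSHTML.

```javascript
// Parse HTML with a standalone MSHTML document instead of a browser window.
// Assumption: runs where ActiveXObject exists (Windows Script Host or IE).
function paragraphsViaMshtml(html, createDocument) {
  // Default factory: "htmlfile" creates an in-memory MSHTML document.
  createDocument = createDocument || function () {
    return new ActiveXObject("htmlfile");
  };
  var doc = createDocument();
  doc.open();
  doc.write(html);   // MSHTML parses and repairs the markup here
  doc.close();
  var nodes = doc.getElementsByTagName("p");
  var paragraphs = [];
  for (var i = 0; i < nodes.length; i++) {
    paragraphs.push(nodes[i].innerHTML);
  }
  return paragraphs;
}

// Typical call, given the xmlhttp object from the original post:
//   var paragraphs = paragraphsViaMshtml(xmlhttp.responseText);
```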
