Strip useless whitespace from XML

Does anyone have any idea on how I can strip the extra whitespace in
the XML that shows up when I receive a response from an ASP.NET 2.0
webservice? This has been discussed before, but no one has ever come up
with a good answer to what seems like such a common question.

http://groups.google.com/group/micro...fff1b27460a421
http://groups.google.com/group/micro...240db31e1d3fe7
http://groups.google.com/group/micro...a3026a4d286e63

It just seems very silly to have it included, since another program will
be consuming it and that program couldn't care less about
whitespace. Not only is it a waste of bandwidth (albeit not _that_
much), but it also causes real problems for me when I try to parse
the XML document in JavaScript.

For example, if the XmlDocument that I return has two elements in it, the
root element of the response has five child nodes in
JavaScript when I parse the doc in Firefox (two are my elements, and
three are whitespace text nodes). IE reports only two child
nodes, but the whitespace is still in the response nonetheless. I can code
around it by building a function that looks for the right tag name or that
ignores whitespace nodes, but it's dumb that I'd have to do that in
the first place.
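
A minimal sketch of the "function that ignores whitespace nodes" workaround mentioned above. The helper names childElements and childByTagName are made up for illustration; they rely only on the standard DOM properties childNodes, nodeType, and nodeName, so they should behave the same whether or not the parser keeps whitespace-only text nodes.

```javascript
// Hypothetical helpers (names are illustrative, not from any library).
var ELEMENT_NODE = 1; // DOM nodeType value for element nodes

// Return only the element children of parent, skipping text nodes
// (including whitespace-only text) and comment nodes.
function childElements(parent) {
    var result = [];
    for (var i = 0; i < parent.childNodes.length; i++) {
        var child = parent.childNodes[i];
        if (child.nodeType === ELEMENT_NODE) {
            result.push(child);
        }
    }
    return result;
}

// Return the first element child with the given tag name, or null.
function childByTagName(parent, tagName) {
    var elements = childElements(parent);
    for (var i = 0; i < elements.length; i++) {
        if (elements[i].nodeName === tagName) {
            return elements[i];
        }
    }
    return null;
}
```

With helpers like these, childElements(root).length would be 2 for the response described above in both Firefox and IE, where root stands for the root element of the response.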

Also, contrary to the other posters, I am certain that in this case the
whitespace actually does come from the server and isn't being inserted
in the document for readability reasons or whatever. I've used Fiddler
and taken a look at the hex data that comes back, and, sure enough,
there is a <CRLF> at the end of every node and the indent spaces at the
beginning of every node.

Any takers on this question that seems to have been around forever
without a good answer?

Nov 6 '06 #1
<am*****@gmail.com> wrote in message
news:11**********************@m7g2000cwm.googlegroups.com...
> Does anyone have any idea on how I can strip the extra whitespace in
> the XML that shows up when I receive a response from an ASP.NET 2.0
> webservice? This has been discussed before, but no one has ever come up
> with a good answer to what seems like such a common question.
My answer would be that you're probably wasting more time thinking about
stripping whitespace than will ever be wasted in sending and receiving it.

Is this really the next most important thing you have to do? Is this the
next performance bottleneck in your application?

John
Nov 7 '06 #2
Like I said, it's not so much about bandwidth as it is a pain
to deal with in all of my JavaScript apps, which count the whitespace as
actual nodes. It causes problems both when I iterate through an
element's child nodes and when I try to directly access a child node
by index. For example, when I send an XMLHttpRequest to the server and
know that I'm only looking to get a single node back, I actually have
to either a) get childNodes[1] instead of childNodes[0], b) call
Prototype's Element.cleanWhitespace() on the parent node (which doesn't
handle nested nodes for XML well, it seems), or c) check the nodeType
of each one of my nodes manually in a loop. As I said, there are ways
around it, but doesn't it seem odd to anyone that it's not easily
disabled?
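
For option (b), a recursive variant can be sketched in a few lines. This is an illustration only, not Prototype's implementation: stripWhitespaceNodes is a hypothetical name, and the sketch assumes that whitespace-only text nodes carry no meaning in the response (which would not hold for mixed content where whitespace is significant).

```javascript
// Hypothetical helper; the name is illustrative only.
var ELEMENT_NODE = 1; // DOM nodeType value for element nodes
var TEXT_NODE = 3;    // DOM nodeType value for text nodes

// Remove every whitespace-only text node below node, at any depth.
function stripWhitespaceNodes(node) {
    // Walk backwards so that removing a child does not shift the
    // indexes of the children still to be visited.
    for (var i = node.childNodes.length - 1; i >= 0; i--) {
        var child = node.childNodes[i];
        if (child.nodeType === TEXT_NODE && !/\S/.test(child.nodeValue)) {
            node.removeChild(child);
        } else if (child.nodeType === ELEMENT_NODE) {
            stripWhitespaceNodes(child);
        }
    }
    return node;
}
```

Calling stripWhitespaceNodes(serverResponse.documentElement) once when the response arrives would then let index-based access such as childNodes[0] behave the same in Firefox and IE, assuming the response contains no comment nodes.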

The only reason I ask all this is because it's a constant pain in my
side to have to program around it every time. If there were something I
could do globally, per server or in web.config, that would strip out the
useless whitespace, it would ultimately save me much more time than
having to code around it in every single one of my apps.
On Nov 6, 8:33 pm, "John Saunders" <john.saunders at trizetto.com>
wrote:
> <amat...@gmail.com> wrote in message news:11**********************@m7g2000cwm.googlegroups.com...
>> Does anyone have any idea on how I can strip the extra whitespace in
>> the XML that shows up when I receive a response from an ASP.NET 2.0
>> webservice? This has been discussed before, but no one has ever come up
>> with a good answer to what seems like such a common question.
> My answer would be that you're probably wasting more time thinking about
> stripping whitespace than will ever be wasted in sending and receiving it.
>
> Is this really the next most important thing you have to do? Is this the
> next performance bottleneck in your application?
>
> John
Nov 8 '06 #3
<am*****@gmail.com> wrote in message
news:11*********************@m7g2000cwm.googlegroups.com...
> Like I said, it's not so much about bandwidth as much as it is a pain
> to deal with in all of my JavaScript apps that count the whitespace as
> actual nodes.
Whitespace _is_ a node. It's a whitespace node, not an element or attribute.
If your JavaScript code ignores whitespace nodes (as it no doubt ignores
comment nodes), then it should have no problems.

Am I mistaken?

John
Nov 8 '06 #4
While I understand that it's a node, I'm wondering if there's an option
that I can disable the output of such nodes since they screw up my
application. Yes, it is definitely possible to have JavaScript check if
a node is a whitespace node and ignore it, but it gets annoying that I
have to program around it _every single time_. It would be so much
easier to just do this:

// where firstChild is the root element and where I want childNodes[0]
// to be the string node returned from the webservice
var nodeImLookingFor = serverResponse.firstChild.childNodes[0];

... rather than this ...

// what happens if msft suddenly disables whitespace by default on me
// or Opera / IE6 handle it one way and Safari / Firefox / IE7 handle it
// another?
// would I have to make cases for each browser then??
var nodeImLookingFor;
if (isIE)
    nodeImLookingFor = serverResponse.firstChild.childNodes[0];
else
    nodeImLookingFor = serverResponse.firstChild.childNodes[1];

... or this ...

// this is a pain to do every time if all I want is the first node!
var nodeImLookingFor;
var children = serverResponse.firstChild.childNodes;
for (var i = 0; i < children.length; i++) {
    if (children[i].nodeType == 3)
        continue; // skip text (whitespace) nodes
    nodeImLookingFor = children[i];
    break; // stop at the first non-text node
}

See how silly it is to have to either a) write a global function that I
have to remember to include in every project I do that will fix this,
or b) manually program around it every time? I assure you that,
considering how often I have to address the issue, disabling the output
of unnecessary whitespace nodes will be worth it for me and any other
developer that has to contend with differences in how JavaScript is
handled in different browsers and the annoyance of having to work
around it as often as I have to.
On Nov 8, 4:22 am, "John Saunders" <john.saunders at trizetto.com>
wrote:
> <amat...@gmail.com> wrote in message news:11*********************@m7g2000cwm.googlegroups.com...
>> Like I said, it's not so much about bandwidth as much as it is a pain
>> to deal with in all of my JavaScript apps that count the whitespace as
>> actual nodes.
> Whitespace _is_ a node. It's a whitespace node, not an element or attribute.
> If your JavaScript code ignores whitespace nodes (as it no doubt ignores
> comment nodes), then it should have no problems.
>
> Am I mistaken?
>
> John
Nov 8 '06 #5
<am*****@gmail.com> wrote in message
news:11*********************@f16g2000cwb.googlegroups.com...
> While I understand that it's a node, I'm wondering if there's an option
> that I can disable the output of such nodes since they screw up my
> application. Yes, it is definitely possible to have JavaScript check if
> a node is a whitespace node and ignore it, but it gets annoying that I
> have to program around it _every single time_. It would be so much
> easier to just do this:
>
> // where firstChild is the root element and where I want childNodes[0]
> // to be the string node returned from the webservice
> var nodeImLookingFor = serverResponse.firstChild.childNodes[0];
(sigh) If you insist on ignoring important details of the standards you're
using, then you're going to have problems.

Whitespace nodes and comment nodes are to be expected. Even if you could get
.NET to stop sending the whitespace today, you still could not guarantee
that a comment node wouldn't slip in somehow.
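
To make the point about comment nodes concrete, here is a sketch of reading the returned string without relying on child positions at all. firstElementOf and textOf are hypothetical helper names; only the standard DOM properties childNodes, nodeType, and nodeValue are assumed.

```javascript
// Hypothetical helpers; the names are illustrative only.
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;
var CDATA_SECTION_NODE = 4;

// Return the first child that is an element, or null. Whitespace text
// nodes and comment nodes in front of it are skipped.
function firstElementOf(parent) {
    for (var i = 0; i < parent.childNodes.length; i++) {
        if (parent.childNodes[i].nodeType === ELEMENT_NODE) {
            return parent.childNodes[i];
        }
    }
    return null;
}

// Concatenate the text and CDATA content under node, descending into
// child elements and ignoring comments and processing instructions.
function textOf(node) {
    var text = "";
    for (var i = 0; i < node.childNodes.length; i++) {
        var child = node.childNodes[i];
        if (child.nodeType === TEXT_NODE || child.nodeType === CDATA_SECTION_NODE) {
            text += child.nodeValue;
        } else if (child.nodeType === ELEMENT_NODE) {
            text += textOf(child);
        }
    }
    return text;
}
```

Under these assumptions, textOf(firstElementOf(serverResponse.documentElement)) would return the same string whether or not the server emits indentation or a comment ahead of the element.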

John
Nov 9 '06 #6
It's becoming quite pointless to argue with you here, but I'll bite
once more even though I was just looking for a quick fix.

I'm absolutely, definitely not looking to ignore any standards -- take
a look at my example code above if you don't believe me. Even though
Microsoft's own IE 6 _does_ ignore the standard that the whitespace is
a node, Firefox does not. Therefore, I have to program around it to be
truly standards compliant. If I wanted to ignore standards, I'd just
let IE's busted XML parsing engine ignore the standards for me.

I suppose that, overall, I'm really frustrated that I don't have
control over the output of my webservice. I'm not complaining that
whitespace or comments shouldn't be nodes. I'm just wondering if there
is a way I could disable the automatic output of them so I don't have
to deal with them in my code. Is it really that unreasonable to think
that, if I just wanted to return a single string node without any
whitespace or comments, I could?

Goodness. I didn't expect to start a flame war or bash any standards /
companies. I'm just looking for some quick help from someone else who
might be better educated on webservice output than I am.
On Nov 8, 9:12 pm, "John Saunders" <john.saunders at trizetto.com>
wrote:
> <amat...@gmail.com> wrote in message news:11*********************@f16g2000cwb.googlegroups.com...
>> While I understand that it's a node, I'm wondering if there's an option
>> that I can disable the output of such nodes since they screw up my
>> application. Yes, it is definitely possible to have JavaScript check if
>> a node is a whitespace node and ignore it, but it gets annoying that I
>> have to program around it _every single time_. It would be so much
>> easier to just do this:
>> // where firstChild is the root element and where I want childNodes[0]
>> // to be the string node returned from the webservice
>> var nodeImLookingFor = serverResponse.firstChild.childNodes[0];
> (sigh) If you insist on ignoring important details of the standards you're
> using, then you're going to have problems.
>
> Whitespace nodes and comment nodes are to be expected. Even if you could get
> .NET to stop sending the whitespace today, you still could not guarantee
> that a comment node wouldn't slip in somehow.
>
> John
Nov 9 '06 #7
<am*****@gmail.com> wrote in message
news:11**********************@b28g2000cwb.googlegroups.com...
> It's becoming quite pointless to argue with you here, but I'll bite
> once more even though I was just looking for a quick fix.
The reason you're unlikely to find a quick fix is that you're trying to fix
something that's not supposed to matter. You appear to want the fix because
it appears that you can't be bothered to write code which takes into
consideration that whitespace and comment nodes may be interspersed with
other nodes. My point was simply that you don't need the quick fix if you
code to the standard.

Obviously, if someone reading this knows of the quick fix the OP is looking
for, then please provide it. When you do provide the quick fix, you should
probably indicate whether the fix is likely to break in the future. For
instance, will his code break if he some day configures his web service with
a SOAP extension that happens to produce whitespace nodes?
John
Nov 9 '06 #8
You are correct in that I don't want to be bothered to write code that
I don't believe I should have to and that I believe there should be an
easy fix for. It seems that Microsoft even recognized this issue by
including a PreserveWhitespace property on the XmlDocument class, but
missed the inclusion of it into the webservice as far as I can tell. I
was just looking for someone else to confirm this for me.

Your argument for me coding around something "that's not supposed to
matter" is really quite juvenile if you think about it. The point you're
trying to make does not address my original question at all (which,
again, has nothing to do with standards but rather the output of the
data). It's almost as if you're arguing with me because you don't have
a good answer and you just feel like arguing.

For anyone else suggesting any information on a fix, I would also like
to point out that I will have full control over the JavaScript,
webservice, servers, and any / all server-side caches and/or proxies
for the lifetime of the applications I make. Therefore, I won't need to
worry about any random third parties injecting whitespace or comments
into my XML via SOAP extensions or anything else.

Nov 9 '06 #9
<am*****@gmail.com> wrote in message
news:11**********************@b28g2000cwb.googlegroups.com...
> You are correct in that I don't want to be bothered to write code that
> I don't believe I should have to and that I believe there should be an
> easy fix for. It seems that Microsoft even recognized this issue by
> including a PreserveWhitespace property on the XmlDocument class, but
> missed the inclusion of it into the webservice as far as I can tell. I
> was just looking for someone else to confirm this for me.
>
> Your argument for me coding around something "that's not supposed to
> matter" is really quite juvenile if you think about it. The point you're
> trying to make does not target my original question at all (which,
> again, has nothing to do with standards but rather the output of the
> data).
XML web services aren't standards-based?
> It's almost as if you're arguing with me because you don't have
> a good answer and you just feel like arguing.
>
> For anyone else suggesting any information on a fix, I would also like
> to point out that I will have full control over the JavaScript,
> webservice, servers, and any / all server-side caches and/or proxies
> for the lifetime of the applications I make. Therefore, I won't need to
> worry about any random third parties injecting whitespace or comments
> into my XML via SOAP extensions or anything else.
In this case, you'd be fine, unless Microsoft changed the output format.
That's not something you have control over.

John
Nov 10 '06 #10