tidy ms word output as pure xhtml without css style and font styles

Hi,

I would like MS Word to output XHTML without any CSS styling. Tidy
(http://tidy.sourceforge.net/) helps quite a lot but leaves CSS
class attributes like the following:

<p class="P11 c2">foo</p>
<ul class="c4">
<li class="P11 c3">
<p class="P11 c2">bar</p>
</li>
</ul>

What I want instead is:

<p>foo</p>
<ul>
<li>
<p>bar</p>
</li>
</ul>

In other words: I want all the attributes to be deleted.

Is there an option for Tidy to achieve this, or another small app that can?
TIA Martin

PS: I could do this with XSLT, but the input must be XML and I have not
used XSLT for some years...
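
Since Tidy's XHTML output is meant to be well-formed XML, the attribute removal asked for above does not strictly need XSLT. Here is a minimal Python sketch, assuming the Tidy output parses as XML and contains no undeclared entities such as `&nbsp;`. The function name `strip_attributes` is made up for illustration and is not part of Tidy or any other tool mentioned in this thread.

```python
import xml.etree.ElementTree as ET


def strip_attributes(xml_text: str) -> str:
    """Return xml_text with every attribute removed from every element."""
    root = ET.fromstring(xml_text)
    # iter() walks the root element and all of its descendants.
    for element in root.iter():
        element.attrib.clear()
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # The sample from the question, wrapped in a single root element
    # (an assumption: an XML parser needs exactly one root).
    sample = (
        '<div><p class="P11 c2">foo</p>'
        '<ul class="c4"><li class="P11 c3">'
        '<p class="P11 c2">bar</p></li></ul></div>'
    )
    print(strip_attributes(sample))
```

One caveat: a full XHTML document carries an `xmlns` declaration, and ElementTree would serialize it with namespace prefixes unless the namespace is registered first with `ET.register_namespace`.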
--
http://www.bretschneidernet.de/me/contact OpenPGP-key: 0x4EA52583
_o)(o_ Philip R. Zimmermann:
-./\\//\.- If privacy is outlawed,
_\_VV_/_ only outlaws will have privacy.
Jul 10 '07 #1
In article <46**********************@newsspool2.arcor-online.net>,
Martin Bretschneider <sp**@bretschneidernet.de> wrote:
[...] Tidy
(http://tidy.sourceforge.net/) helps quite a lot but leaves the css
styles like the following:

<p class="P11 c2">foo</p> [...]

And I do want to have:

<p>foo</p> [...]
Have a look at Mihai Sucan's ReTidy:
<http://www.robodesign.ro/mihai/my-projects/retidy>. I haven't had a
chance to test it myself, but I expect it can do what you want:
<http://www.robodesign.ro/mihai/my-projects/retidy#dom_strip_attrs>

--
Sander Tekelenburg
The Web Repair Initiative: <http://webrepair.org/>
Jul 10 '07 #2
Scripsit Adrienne Boswell:
Say Word does something like:
<p style="font-weight:bold">Bold</p>
<p style="font-style:italic">Italic</p>

HTML-Tidy will do:

<p class="c1">Bold</p>
<p class="c2">Italic</p>
So is that a problem? If you wish to preserve the formatting, you use the
style sheet generated (as such or as modified). If you don't, you drop the
style sheet. I don't see why the class attributes would be a problem. On the
contrary, they might turn up to be part of a solution, if you later decide
that preserving some of the formatting is a good idea, after all - then you
just write some nice style sheet using the "handles" (class attributes) that
you already have in the markup.

--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/

Jul 11 '07 #3
Gazing into my crystal ball I observed "Jukka K. Korpela"
<jk******@cs.tut.fi> writing in
news:s8********************@reader1.news.saunalahti.fi:
Scripsit Adrienne Boswell:
Say Word does something like:
<p style="font-weight:bold">Bold</p>
<p style="font-style:italic">Italic</p>

HTML-Tidy will do:

<p class="c1">Bold</p>
<p class="c2">Italic</p>

So is that a problem? If you wish to preserve the formatting, you use
the style sheet generated (as such or as modified). If you don't, you
drop the style sheet. I don't see why the class attributes would be a
problem. On the contrary, they might turn up to be part of a solution,
if you later decide that preserving some of the formatting is a good
idea, after all - then you just write some nice style sheet using the
"handles" (class attributes) that you already have in the markup.
It's a problem if you want no style attributes at all. So you can leave
the style out, and then it's an empty class. Okay. It's still there, and
it just doesn't sit well with me. That's just me.
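
For anyone who wants both the style sheet and the leftover class attributes gone, and whose input is not guaranteed to be well-formed XML, here is a rough sketch using Python's standard `html.parser`. The names `AttributeStripper` and `strip_attrs_and_styles` are hypothetical. The sketch also drops comments and the doctype, which may or may not be acceptable.

```python
from html import escape
from html.parser import HTMLParser


class AttributeStripper(HTMLParser):
    """Re-emit markup without attributes and without <style> blocks."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.in_style = False

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self.in_style = True  # swallow the generated style sheet
            return
        self.out.append(f"<{tag}>")  # attrs are deliberately ignored

    def handle_startendtag(self, tag, attrs):
        self.out.append(f"<{tag} />")  # e.g. <br />

    def handle_endtag(self, tag):
        if tag == "style":
            self.in_style = False
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_style:
            # Character references were decoded by the parser; re-escape.
            self.out.append(escape(data, quote=False))


def strip_attrs_and_styles(html_text: str) -> str:
    parser = AttributeStripper()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


if __name__ == "__main__":
    tidy_output = (
        "<style>p.c1 {font-weight: bold}</style>"
        '<p class="c1">Bold</p><p class="c2">Italic</p>'
    )
    print(strip_attrs_and_styles(tidy_output))
```

This trades fidelity for simplicity: it does not validate or repair the markup, so it is best run after Tidy rather than instead of it.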

--
Adrienne Boswell at Home
Arbpen Web Site Design Services
http://www.cavalcade-of-coding.info
Please respond to the group so others can share

Jul 12 '07 #4
