Going from a code-heavy Word 9 HTML doc in Norwegian to normal HTML

hi

A friend just sent me a text translation in Norwegian that she saved
with Word 9 as an HTML file.

It's loaded with Microsoft markup like this:

<p class=MsoNormal ><span
style='font-size:10.0pt;mso-bidi-font-size:7.5pt;
font-family:"Courier New"'>send dine opplevelser og tanker om fred til
<o:p></o:p></span></p>

I want to get rid of that overbloat Microsoft stuff.

Is there a stripper tool, or a web page somewhere, where I can get the
raw text with the right HTML for Norwegian characters (i.e. in the form
&entityname;)?

I just need it to be as simple as possible, as I will use my own
stylesheet on this text for my formatting.

thanks.

Richard

Jul 20 '05 #1
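
For the "&entityname;" part of the question, here is a minimal sketch in Python, assuming the text has already been decoded to Unicode. The function name to_named_entities is made up for illustration; html.entities.codepoint2name is the standard-library table of HTML named entities. ASCII characters, including any existing markup, pass through untouched.

```python
from html.entities import codepoint2name


def to_named_entities(text: str) -> str:
    """Replace non-ASCII characters with HTML named entities where one exists."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            # Plain ASCII (including existing tags) passes through unchanged.
            out.append(ch)
        elif code in codepoint2name:
            # Characters with a named entity, e.g. U+00F8 becomes &oslash;
            out.append(f"&{codepoint2name[code]};")
        else:
            # No named entity available: fall back to a numeric reference.
            out.append(f"&#{code};")
    return "".join(out)


print(to_named_entities("blåbærsyltetøy"))  # bl&aring;b&aelig;rsyltet&oslash;y
```

With a correctly declared character encoding the entities are not strictly needed, but they do make the file safe to pass around as plain ASCII.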
someone wrote:
> I want to get rid of that overbloat Microsoft stuff.
>
> Is there a stripper tool, or a web page somewhere, where I can get the
> raw text with the right HTML for Norwegian characters (i.e. in the form
> &entityname;)?


http://tidy.sf.net/ should be able to cope with it.

Make sure you RTFM to enable the extra powerful Word fixing routines.

--
David Dorward <http://dorward.me.uk/>
Jul 20 '05 #2
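
As a rough illustration of the Tidy suggestion, the sketch below drives the tidy command line from Python. It assumes a tidy executable is on the PATH. The option names (--word-2000, --bare, --clean, -ascii) are the ones documented for HTML Tidy, but builds differ, so check "tidy -help-config" on your own copy. The helper names build_tidy_command and run_tidy are hypothetical.

```python
import shutil
import subprocess


def build_tidy_command(src: str, dst: str) -> list[str]:
    """Build a tidy command line aimed at Word-generated HTML (a sketch)."""
    return [
        "tidy",
        "-q",                  # quiet: suppress the summary chatter
        "--word-2000", "yes",  # enable the Word 2000 cleanup routines
        "--bare", "yes",       # strip Microsoft-specific markup
        "--clean", "yes",      # replace presentational markup with style rules
        "-ascii",              # ASCII output, so æøå are written as entities
        "-o", dst,             # write the result to dst
        src,
    ]


def run_tidy(src: str, dst: str) -> int:
    """Run tidy and return its exit status (0 ok, 1 warnings, 2 errors)."""
    if shutil.which("tidy") is None:
        raise RuntimeError("tidy was not found on PATH")
    # Tidy exits with 1 on mere warnings, so check=True would be too strict.
    result = subprocess.run(build_tidy_command(src, dst),
                            capture_output=True, text=True)
    return result.returncode
```

Calling run_tidy("translation.htm", "clean.html") (both file names are placeholders) would then leave the cleaned file in clean.html. Note that -ascii treats the input as Latin-1, which is an assumption about how Word saved the file.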
someone wrote:
> I want to get rid of that overbloat Microsoft stuff.


You could use Tidy, which was mentioned here, and which is available as
part of the HTML-Kit software too.

Alternatively, you could get "Office 2000 HTML Filter 2.0", a free and
not too big (250 kB) addition to Office, available from
http://www.microsoft.com (sorry, no direct URL, since the site is a
mess, but try to use the software name in a site search there).
Then you can open the file and "Export To Compact HTML" from Word.
It removes most of the nonsense.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #3
On Sun, 28 Dec 2003 16:55:05 GMT, someone declared in
comp.infosystems.www.authoring.html:
> hi

G'day.

> A friend just sent me a text translation in Norwegian that she saved
> with Word 9 as an HTML file
>
> I want to get rid of that overbloat Microsoft stuff.


Try
http://www.microsoft.com/downloads/d...DBEE-3FBD-482C-83B0-96FB79B74DED&displaylang=EN
(watch wrapping). Doesn't do quite as good a job as Tidy, but is easier
to use.

Alternatively, Tidy is available as part of HTML-Kit,
http://www.chami.com/html-kit/

HTH

--
Mark Parnell
http://www.clarkecomputers.com.au
Jul 20 '05 #4
On Sun, 28 Dec 2003, someone wrote:
> A friend just sent me a text translation in Norwegian that she saved
> with Word 9 as an HTML file


Its similarity to HTML is misleading. Most of the rubbish is put
there (as I understand it) in order to be able to round-trip to Word
format.

In addition to the options mentioned by others, there _might_ be some
mileage in reading it back into Word, re-saving it as RTF, and then
using an RTF-to-HTML converter.

The value of doing that would be chiefly if the original had been
based on a meaningful template, with named styles which mean something
to HTML (heading-N, body text, bulleted-list, and so on) rather than
being "make it this big with that font" kind of DTP rubbish. Word has
been able to do _this_ job (stylesheeted logical markup) for at least
a decade, but most of its users haven't caught up with it yet: they
still use the damned thing as if it were an electric typewriter rather
than a real word processor. So, as I say, it depends on the technique
of the person who used Word in the first place, as to whether this
kind of approach makes any sense.

If there's no logical structure, then the options you got from other
folks, such as Tidy or the Office cleanup tool, are surely less effort
- and the result will be no worse.

(Sometimes it's best to toss all the original formatting, and just
copy/paste the content into an appropriate template. Look for e.g.
postings on the topic by Eric Jarvis for advice on how best to
organise the authoring of content in multiple languages - the secret
is to set out the method at the outset, rather than trying to re-work
arbitrary formats sent in by diverse contributors.)
Jul 20 '05 #5
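
The "toss all the original formatting" approach can be approximated in a few lines of Python with the standard-library html.parser module. This is only a sketch under stated assumptions: the class name WordStripper, the helper strip_word_markup and the KEEP/SKIP tag lists are invented for illustration, and the set of tags worth keeping depends on the document. It drops every attribute, discards span, font and o:p wrappers, and keeps only bare structural tags. Non-ASCII characters are left as they are, so turning them into entities would be a separate step.

```python
import re
from html import escape
from html.parser import HTMLParser

# Structural tags worth keeping (an assumption; adjust to the document).
KEEP = {"p", "h1", "h2", "h3", "h4", "h5", "h6",
        "ul", "ol", "li", "blockquote", "pre"}
# Elements whose whole content is thrown away.
SKIP = {"head", "style", "script"}


class WordStripper(HTMLParser):
    """Keep bare structural tags and text; drop attributes and everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP:
            self.skip_depth += 1
        elif tag in KEEP and not self.skip_depth:
            self.out.append(f"<{tag}>")   # attributes are deliberately dropped

    def handle_endtag(self, tag):
        if tag in SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in KEEP and not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # Collapse Word's hard line wraps, then re-escape &, < and >.
            self.out.append(escape(re.sub(r"\s+", " ", data), quote=False))


def strip_word_markup(html_text: str) -> str:
    parser = WordStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out).strip()


sample = (
    "<p class=MsoNormal ><span\n"
    "style='font-size:10.0pt;mso-bidi-font-size:7.5pt;\n"
    "font-family:\"Courier New\"'>send dine opplevelser og tanker om fred til\n"
    "<o:p></o:p></span></p>"
)
print(strip_word_markup(sample))
```

Run on the fragment from the first post, this leaves a plain p element around the Norwegian sentence, ready for an external stylesheet. It will not recover headings or lists that were only faked with font sizes.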
