How to write Unicode characters to an RTF doc?

I need to produce an RTF document which is filled with
data from a database.
I've created an RTF document in WordPad (a template,
so to speak) which contains 'placeholders', for example
'<dd01>', '<dd02>', etc.

I read the entire template into a StringBuilder and
then perform a simple 'replace' on it, using a Hashtable.
The keys in the Hashtable are strings representing the
placeholders and the Hashtable's values contain data
from the database.

After the replace-action, I write the content of the
StringBuilder to a file with extension '.rtf'
(See code below).

It works like a charm, I can read the file with Word
(or WordPad) and it looks alright.

---

But ... Problems arise when the data from the database
contains characters like é ë è ï ó ö, etc. (Are these
called 'unicode-characters' ?)

These characters get converted to 'gibberish' when viewing
the generated rtf-doc in Word.
Then I thought that I probably needed to add 'Encoding.Unicode'
when writing the file, but when I do that, the generated file
is no longer recognized by Word as a valid RTF-doc.
Word then complains, 'this is an encoded file, install importfilters,
etc ...'.
My two questions now are :

1. How can I write unicode-characters to my RTF-template 'the
right way' ?

2. Why does Word no longer recognize a simple RTF document
after it was written using 'Encoding.Unicode' ?
I thought a RTF-document is basically just plain text (however
containing a lot of mark-up code) and by using 'Encoding.Unicode',
I'm only telling, 'this plain-text may contain unicode-characters'.
Right ?

//---

This is the code :

private void writeForm(string pathTemplate, string pathTempFile, Hashtable formData)
{
    TextReader syncReader = TextReader.Synchronized(new StreamReader(pathTemplate));
    TextWriter syncWriter = TextWriter.Synchronized(new StreamWriter(pathTempFile));

    StringBuilder emptyTemplate = new StringBuilder(syncReader.ReadToEnd());
    StringBuilder filledDoc = fillTemplate(emptyTemplate, formData);
    syncWriter.Write(filledDoc);

    syncReader.Close();
    syncWriter.Close();
}

private StringBuilder fillTemplate(StringBuilder doc, Hashtable formData)
{
    IDictionaryEnumerator myEnumerator = formData.GetEnumerator();
    while (myEnumerator.MoveNext())
    {
        System.Diagnostics.Debug.WriteLine("1 : " + (string) myEnumerator.Value);
        doc = doc.Replace((string) myEnumerator.Key, (string) myEnumerator.Value);
    }
    return doc;
}

//---
Nov 17 '05 #1
<jo**@wezayzo.com> wrote:
> I need to produce a RTF-document which is filled with
> data from a database.
> I've created a RTF-document in WordPad (a template,
> so to speak) which contains 'placeholders', for example
> '<dd01>', '<dd02>', etc.
>
> I read the entire template into a StringBuilder and
> then perform a simple 'replace' on it, using a Hashtable.
> The keys in the Hashtable are strings representing the
> placeholders and the Hashtable's values contain data
> from the database.
>
> After the replace-action, I write the content of the
> StringBuilder to a file with extension '.rtf'
> (See code below).
>
> It works like a charm, I can read the file with Word
> (or WordPad) and it looks alright.
>
> ---
>
> But ... Problems arise when the data from the database
> contains characters like é ë è ï ó ö, etc. (Are these
> called 'unicode-characters' ?)

Well, all characters in .NET are Unicode.

> These characters get converted to 'gibberish' when viewing
> the generated rtf-doc in Word.

Okay. I think you need to find the specifications for RTF and work out
which encoding to use. By default, StreamWriter will be using UTF-8. It
sounds like that's no good for you, but you shouldn't just pick
encodings at random - you could find one which appears to work, but
fails with some data you don't test it with.

Looking at the docs at www.wotsit.org, it looks like it *is* possible
to specify encodings, but that Word doesn't understand UTF-8 encoded
text. You may need to "manually" encode (with \uN) characters which
aren't in the appropriate code-page - I'd go with anything non-ASCII.
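
A rough sketch of that "manual" encoding, in case it helps. The names RtfText and Escape are made up for illustration. Every character above 127 is written as \uN?, where N is the UTF-16 code unit as a signed 16-bit decimal and ? is the fallback character for readers that don't understand \u. This assumes the document keeps the default of one fallback character per \u (i.e. \uc1), and it makes no attempt to turn line breaks into \par.

```csharp
using System;
using System.Globalization;
using System.Text;

static class RtfText
{
    // Escapes a plain string so it can be dropped into the body of an RTF file.
    public static string Escape(string value)
    {
        StringBuilder sb = new StringBuilder(value.Length * 2);
        foreach (char c in value)
        {
            if (c == '\\' || c == '{' || c == '}')
            {
                // RTF syntax characters need a backslash in front of them.
                sb.Append('\\').Append(c);
            }
            else if (c < 128)
            {
                // Plain ASCII passes through unchanged.
                sb.Append(c);
            }
            else
            {
                // \uN takes a signed 16-bit value, so code units above 32767
                // come out negative. '?' is the single fallback character
                // (assumption: the default \uc1 is in effect).
                short codeUnit = unchecked((short)c);
                sb.Append("\\u")
                  .Append(codeUnit.ToString(CultureInfo.InvariantCulture))
                  .Append('?');
            }
        }
        return sb.ToString();
    }

    static void Main()
    {
        // \u00E9 is 'é'; prints: caf\u233? \{x\}
        Console.WriteLine(Escape("caf\u00E9 {x}"));
    }
}
```

In the fillTemplate method from the question, each database value would go through Escape before the Replace call; the template itself is left alone, since WordPad already wrote it as valid RTF.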

> 2. Why doesn't Word recognize a simple RTF-document no longer
> after it was written using 'Encoding.Unicode' ?
> I thought a RTF-document is basically just plain text (however
> containing a lot of mark-up code) and by using 'Encoding.Unicode',
> I'm only telling, 'this plain-text may contain unicode-characters'.
> Right ?

No - it's entirely changing what the file looks like. See
http://www.pobox.com/~skeet/csharp/unicode.html to understand what
Encodings are about.
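
To make "entirely changing what the file looks like" concrete, here is a small sketch that prints the bytes that end up on disk for the RTF signature under two encodings. EncodingDemo and FileBytes are illustrative names only. The helper mimics what a StreamWriter does on a fresh file: write the encoding's preamble (byte order mark), then the encoded text.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // Returns the preamble (BOM, if any) followed by the encoded text,
    // which is what a StreamWriter puts at the start of a new file.
    public static byte[] FileBytes(string text, Encoding encoding)
    {
        byte[] preamble = encoding.GetPreamble();
        byte[] body = encoding.GetBytes(text);
        byte[] all = new byte[preamble.Length + body.Length];
        preamble.CopyTo(all, 0);
        body.CopyTo(all, preamble.Length);
        return all;
    }

    static void Main()
    {
        string header = "{\\rtf1";

        // ASCII: one byte per character, no preamble.
        // Prints: 7B-5C-72-74-66-31
        Console.WriteLine(BitConverter.ToString(FileBytes(header, Encoding.ASCII)));

        // Encoding.Unicode is UTF-16 little endian: a 2-byte BOM, then
        // two bytes per character.
        // Prints: FF-FE-7B-00-5C-00-72-00-74-00-66-00-31-00
        Console.WriteLine(BitConverter.ToString(FileBytes(header, Encoding.Unicode)));
    }
}
```

With Encoding.Unicode the file no longer starts with the bytes of {\rtf, which would explain why Word stops treating it as RTF.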

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Nov 17 '05 #2
Thank you very much, Jon.
I'm gonna study your page on unicode.
Nov 17 '05 #3
The most portable form of RTF is ASCII only.
All high characters (everything above 127) should be escaped with \u.
See the specs here: http://support.microsoft.com/kb/q86999/
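
One way to hold yourself to the ASCII-only rule is sketched below, with made-up names (RtfFile, FirstNonAscii, WriteAscii). It writes the file with Encoding.ASCII, but first checks that nothing above 127 is left in the text. The check matters because Encoding.ASCII does not complain about such characters; it silently replaces them with '?'.

```csharp
using System;
using System.IO;
using System.Text;

static class RtfFile
{
    // Returns the index of the first character above 127, or -1 if the
    // text is pure ASCII.
    public static int FirstNonAscii(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] > 127) return i;
        }
        return -1;
    }

    // Writes already-escaped RTF as ASCII, refusing text that still
    // contains unescaped high characters.
    public static void WriteAscii(string path, string rtf)
    {
        int bad = FirstNonAscii(rtf);
        if (bad >= 0)
        {
            throw new ArgumentException(
                "Unescaped character U+" + ((int)rtf[bad]).ToString("X4") +
                " at index " + bad, "rtf");
        }

        using (StreamWriter writer = new StreamWriter(path, false, Encoding.ASCII))
        {
            writer.Write(rtf);
        }
    }

    static void Main()
    {
        // Hypothetical output location, just for the demo.
        string path = Path.Combine(Path.GetTempPath(), "demo.rtf");
        WriteAscii(path, "{\\rtf1\\ansi caf\\u233? }");
        Console.WriteLine("Wrote " + path);
    }
}
```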

--
Mihai Nita [Microsoft MVP, Windows - SDK]
------------------------------------------
Replace _year_ with _ to get the real email
Nov 17 '05 #4
