Special character to &abc equivalents

Hi,

I'm reading a file and writing it to the html output for a page.

I've come across two difficulties which I would like to solve.

The files contain special characters from European alphabets, namely
vowels with umlauts (the two little dots above the letter).

Normally these are written in HTML as "&auml;", but in the file
they are just ä.

1. I'm using a StreamReader to read the file, and I have found that if I
don't use System.Text.Encoding.UTF7 the characters are lost
completely. Is this the correct way, or is there a way to get the
StreamReader to select the correct encoding automatically, or other code
to determine which would be best?

2. Having read the character from the file, it is output literally to
the HTML, which I guess is to be expected. Is there a way to process a
string in order to change the ä to &auml; and so on?

Thanks in advance for any replies.
Nov 19 '05 #1
My advice: set the underlying operating system's encoding to whatever you
need, and use StreamReader and StreamWriter with
System.Text.Encoding.Default, which uses the OS encoding.

I had the same problem with Turkish encoding, and this is the best solution
(IMHO).

--

Thanks,
Yunus Emre ALPÖZEN
BSc, MCAD.NET
Nov 19 '05 #2
Colin Peters wrote:
> 1. I'm using a StreamReader to read the file and I have found that if
> I don't use System.Text.Encoding.UTF7 then the characters are lost
> completely.

UTF-7 is hardly what you want. Did you try ISO-8859-1? Or Windows-1252?

> Is this the correct way, or is there a way to
> automatically get the StreamReader to select the correct encoding,
> or use other code to determine which would be best?

In general, there's no way to guess a character encoding because
there's no universal metadata that could tell you what encoding is
being used.

To put it differently: You must know the encoding, or allow the user to
switch between possible encodings.
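
One partial exception is worth noting: files saved as UTF-8 or UTF-16 often start with a byte order mark (BOM), and StreamReader can use it to pick the encoding. The following is a minimal sketch, not a general detection method. The file name is made up, ISO-8859-1 as the fallback is an assumption, and the program writes its own test file so it can run standalone.

```csharp
using System;
using System.IO;
using System.Text;

class DetectBomDemo
{
    static void Main()
    {
        // Hypothetical file name, used only for this illustration.
        string path = "with-bom.txt";

        // Write a UTF-8 test file that starts with a byte order mark.
        File.WriteAllText(path, "M\u00e4dchen", new UTF8Encoding(true));

        // Fallback used when the file carries no BOM (an assumption here).
        Encoding fallback = Encoding.GetEncoding("ISO-8859-1");

        // The third argument asks StreamReader to look for a BOM first.
        using (StreamReader reader = new StreamReader(path, fallback, true))
        {
            string text = reader.ReadToEnd();

            // CurrentEncoding is only meaningful after the first read.
            Console.WriteLine(reader.CurrentEncoding.WebName);
            Console.WriteLine(text == "M\u00e4dchen");
        }
    }
}
```

For a file without a BOM the reader simply uses the fallback, so this does not remove the need to know the encoding of legacy 8-bit files.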

> 2. Having read the character from the file, it is output literally to
> the html, which I guess is to be expected. Is there a way to process
> a string in order to change the ä to &auml; and so on.

That's not necessary if the page is encoded correctly, i.e. the response
is sent in an encoding that can represent ä and that encoding is declared
to the browser.

Cheers,
--
http://www.joergjooss.de
mailto:ne********@joergjooss.de
Nov 19 '05 #3
Yunus Emre ALPÖZEN [MCAD.NET] wrote:
> My advice: set the underlying operating system's encoding to whatever
> you need, and use StreamReader and StreamWriter with
> System.Text.Encoding.Default, which uses the OS encoding.
>
> I had the same problem with Turkish encoding, and this is the best
> solution (IMHO).

Unfortunately I'm using shared hosting. I have little influence over
operating system parameters.

Thanks anyway.
Nov 19 '05 #4
Joerg Jooss wrote:
> UTF-7 is hardly what you want. Did you try ISO-8859-1? Or Windows-1252?

I didn't see this as an option provided by IntelliSense for the class
System.Text.Encoding.

Thanks anyway.
Nov 19 '05 #5
You can set the response encoding with a Page directive.

<%@ Page Language="VB" ResponseEncoding="UTF-8" %>

<%@ Page Language="C#" ResponseEncoding="ISO-8859-1" %>
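
The same setting can also be applied from code-behind, which may be handy on shared hosting where machine-wide settings are out of reach. This is only a sketch under assumptions: the class name FilePage is hypothetical, and it presumes a Web Forms project that references System.Web.

```csharp
using System;
using System.Text;
using System.Web.UI;

// Hypothetical code-behind class; the name FilePage is an assumption.
public class FilePage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Encode the response body as UTF-8 and declare that charset
        // in the Content-Type header so the browser decodes it the same way.
        Response.ContentEncoding = Encoding.UTF8;
        Response.Charset = "utf-8";
    }
}
```

Once the file has been read into a string with the right encoding, the string is Unicode, so the response encoding only needs to be one that can represent the characters in it.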

Juan T. Llibre
ASP.NET MVP
http://asp.net.do/foros/
ASP.NET forums in Spanish
Come, let's talk about ASP.NET...
Nov 19 '05 #6
Colin Peters wrote:
> Joerg Jooss wrote:
> > UTF-7 is hardly what you want. Did you try ISO-8859-1? Or Windows-1252?
>
> I didn't see this as an option provided by Intellisense for the class:
> System.Text.Encoding

There are only a few default instances in Encoding. You can construct
all encodings by name using Encoding.GetEncoding(), e.g.

Encoding enc = Encoding.GetEncoding("ISO-8859-1");
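
Combined with StreamReader, that could look roughly like the sketch below. The file name is made up, and the program writes its own ISO-8859-1 test file first so it runs standalone. One assumption to be aware of: on .NET Core and later, Windows-1252 additionally needs Encoding.RegisterProvider(CodePagesEncodingProvider.Instance), whereas ISO-8859-1 is built in.

```csharp
using System;
using System.IO;
using System.Text;

class ReadLatin1Demo
{
    static void Main()
    {
        // Hypothetical file name, used only for this illustration.
        string path = "umlauts.txt";

        // Look the encoding up by name, as suggested above.
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        // Create a test file whose bytes are ISO-8859-1 (ä is the single byte 0xE4).
        File.WriteAllText(path, "M\u00e4dchen", latin1);

        // Tell StreamReader explicitly how to interpret the bytes.
        using (StreamReader reader = new StreamReader(path, latin1))
        {
            string text = reader.ReadToEnd();

            // Compare against the expected string instead of printing ä,
            // because console output depends on the console's own encoding.
            Console.WriteLine(text == "M\u00e4dchen");
        }
    }
}
```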

Cheers,
--
http://www.joergjooss.de
mailto:ne********@joergjooss.de
Nov 19 '05 #7
Aha! The penny has dropped. Or in this case, the Euro.

Many thanks to all.

Nov 19 '05 #8
Server.HtmlEncode(string) will convert any "special chars" from a text file
to the relevant &abc; equivalent without having to worry about codepages... I
use it in my chat application to prevent malicious code being inserted into
the database.
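
A small sketch of what that looks like, with some assumptions: it uses System.Net.WebUtility.HtmlEncode so it can run as a console program, while inside a page Server.HtmlEncode plays the same role. As far as I know, umlauts tend to come out as numeric references such as &#228; rather than named ones like &auml;, so check the output on your own runtime.

```csharp
using System;
using System.Net;

class HtmlEncodeDemo
{
    static void Main()
    {
        // Text as it might come out of the file, with markup characters added.
        string raw = "<b>M\u00e4dchen & B\u00fccher</b>";

        // Replace characters that are special in HTML with character references.
        string encoded = WebUtility.HtmlEncode(raw);

        // Print the result to see how the umlauts were handled.
        Console.WriteLine(encoded);

        // The angle brackets are always turned into &lt; and &gt;.
        Console.WriteLine(encoded.StartsWith("&lt;b&gt;", StringComparison.Ordinal));
    }
}
```

Encoding on output like this guards against markup or script being rendered in the page. Protecting the database itself is a separate concern, usually handled with parameterized queries.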

Regards,

Paul Parkinson (www.elysaria.com)
Nov 19 '05 #9
