
Writing Unicode-16 to a text file

I tried to write some Unicode-16 characters (which were displayed
correctly, as expected, on the screen) to a file, but it didn't work
out very well. I have them in a char[] as well as in a String. Both
give me a number of "?".

What am I missing?

--

Kindly
Konrad
---------------------------------------------------
May all spammers die an agonizing death; have no burial places;
their souls be chased by demons in Gehenna from one room to
another for all eternity and more.

Sleep - thing used by ineffective people
as a substitute for coffee

Ambition - a poor excuse for not having
enough sense to be lazy
---------------------------------------------------


Jul 17 '05 #1
5 Replies


Konrad Den Ende wrote:
> I tried to write some Unicode-16 characters (that were displayed
> correctly, as expected, on the screen) to a file but it didn't work
> out very well. I have those in an char[] as well as a String. Both
> will give me a number of "?".
>
> What do i miss?


When you wrote the characters to a file (what method did you use?) they
probably underwent a 16-bit to 8-bit conversion, using some encoding (what
encoding did you specify? or what is your Java installation using as its
default encoding?). When you looked at the file afterwards, the software
you used to do that (what did you use?) probably wasn't set up to grok that
encoding.
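
To illustrate the kind of 16-bit to 8-bit conversion described above, here is a minimal sketch. It is not the original poster's code: the class name, method names and sample text are made up for illustration, and it assumes Java 7 or later for StandardCharsets. It encodes three Japanese characters with ISO-8859-1, which cannot represent them, and shows that each one comes back as "?", while the same round trip through UTF-8 is lossless.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names and sample text are illustrative only.
class LossyEncodingDemo {

    // Encodes the text as ISO-8859-1 bytes and decodes them again.
    // Characters the charset cannot represent are replaced with '?'.
    static String roundTripLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Same round trip through UTF-8, which can represent every Unicode character.
    static String roundTripUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The three characters of "nihongo", written as escapes so the
        // source file's own encoding does not matter.
        String nihongo = "\u65E5\u672C\u8A9E";
        System.out.println(roundTripLatin1(nihongo).equals("???")); // true
        System.out.println(roundTripUtf8(nihongo).equals(nihongo)); // true
    }
}
```

The booleans are printed instead of the text itself so that the console's encoding cannot confuse the result.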

What happens when you read the file back into Java?

Good luck,

Chris

--
Chris Gray ch***@kiffer.eunet.be
/k/ Embedded Java Solutions

Jul 17 '05 #2

> When you wrote the characters to a file (what method did you use?) they
> probably underwent a 16-bit to 8-bit conversion

try {
    BufferedWriter writer = new BufferedWriter(new FileWriter("nihongo.txt"));
    writer.write(cc); // cc is a char[] that stores the characters
    writer.close();
}
catch (Exception e) { System.out.println(e.getMessage()); }

> using some encoding (what encoding did you specify? or what is your Java
> installation using as its default encoding?).

I didn't specify any encoding, so I guess it's English. BUT I figured that
since a char is nothing more than a number, my char[] variable is just an
array of some kind of integers (2-byte, I guess, so it can hold all 65k
characters).

> When you looked at the file afterwards, the software you used to do that
> (what did you use?) probably wasn't set up to grok that encoding.

I used MS Word and a text reader with Japanese enabled. Just to be sure, I
took a file whose Japanese text I can read with my usual software and
opened it in Notepad. I didn't see Japanese (oh, what a surprise),
but I could see a number of strange characters.
Yet the file that my application creates contains only "?"s.

> What happens when you read the file back into Java?

"?"s only.

Any hint?

Any hint?
--

Kindly
Konrad


Jul 17 '05 #4


Konrad Den Ende wrote:
> > When you wrote the characters to a file (what method did you use?) they
> > probably underwent a 16-bit to 8-bit conversion
>
> try {
>     BufferedWriter writer = new BufferedWriter(new FileWriter("nihongo.txt"));
>     writer.write(cc); // cc is a char[] that stores the characters
>     writer.close();
> }
> catch (Exception e) { System.out.println(e.getMessage()); }
>
> > using some encoding (what encoding did you specify? or what is your Java
> > installation using as its default encoding?).
>
> Any hint?


Sure.

You have been writing Japanese with an encoding that doesn't support
it. I bet your default encoding, derived from your operating system
locale (you can see it in the file.encoding entry of
System.getProperties()), is one of the ISO-8859 encodings or something
like that. None of those supports Japanese.
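
A minimal sketch of how one might check this, assuming Java 7 or later (the class and method names are hypothetical, and the output depends on the machine it runs on): it prints the platform's default encoding and asks whether that encoding can represent a short Japanese sample.

```java
import java.nio.charset.Charset;

// Hypothetical helper class; the name is illustrative only.
class DefaultEncodingCheck {

    // Returns true if the charset can encode every character of the text.
    static boolean canEncode(Charset charset, String text) {
        // A few charsets are decode-only, so check before asking for an encoder.
        return charset.canEncode() && charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String nihongo = "\u65E5\u672C\u8A9E"; // sample Japanese text, as escapes
        Charset platform = Charset.defaultCharset();
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + platform.name());
        System.out.println("can hold sample = " + canEncode(platform, nihongo));
    }
}
```

If the last line prints false, a FileWriter that relies on the default encoding will replace the Japanese characters when writing.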

You should look at OutputStreamWriter, of which you can make an instance
that uses an encoding that supports Japanese. You can get an idea of
which encodings are supported by looking at the Charset class in Java
1.4's java.nio.charset package. It has a static method,
Charset.availableCharsets(), that returns a map keyed by the names of
the supported encodings.
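
As a rough sketch of that suggestion (assuming Java 7 or later for try-with-resources and StandardCharsets; the class and method names are hypothetical, and UTF-8 is just one possible choice of encoding), the FileWriter from the question could be replaced by an OutputStreamWriter with an explicit charset:

```java
import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name is illustrative only.
class Utf8FileWriterDemo {

    // Writes the characters to the named file as UTF-8,
    // regardless of the platform's default encoding.
    static void writeUtf8(String fileName, char[] cc) {
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(fileName), StandardCharsets.UTF_8))) {
            writer.write(cc);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        char[] cc = "\u65E5\u672C\u8A9E".toCharArray(); // sample text, as escapes
        writeUtf8("nihongo.txt", cc);

        // Names of all encodings this JVM supports.
        System.out.println(Charset.availableCharsets().keySet());
    }
}
```

The resulting file has to be opened as UTF-8 in the viewer as well, otherwise it will still look garbled there.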

You may end up using ISO-2022-something, but I prefer Unicode's UTF-8;
it's a lot nicer and cleaner, and it supports almost any language. You
will need Unicode fonts to display the text, though.

An encoding is the mapping between bytes (sequences of 8 bits) and a
higher level of abstraction, namely characters. Streams are
byte-oriented, readers/writers are character-oriented, and
encoding/decoding sits in between.
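
The layering can be sketched in memory, without any files. This is a hedged illustration, assuming Java 7 or later; the class and method names are invented for the example. A writer wrapped around a byte stream turns characters into bytes, and a reader wrapped around a byte stream turns them back, with the charset deciding the mapping in both directions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name is illustrative only.
class StreamsVersusWriters {

    // Characters go into the writer; encoded bytes come out of the stream below it.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Bytes go into the stream; the reader on top decodes them back into characters.
    static String decode(byte[] data, Charset charset) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String nihongo = "\u65E5\u672C\u8A9E"; // three characters, as escapes
        byte[] utf8 = encode(nihongo, StandardCharsets.UTF_8);
        byte[] utf16 = encode(nihongo, StandardCharsets.UTF_16BE);

        System.out.println(utf8.length);  // 9: three bytes per character
        System.out.println(utf16.length); // 6: two bytes per character
        // Decoding with the charset used for encoding restores the text...
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(nihongo));      // true
        // ...while decoding with a different charset does not.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).equals(nihongo)); // false
    }
}
```

The UTF-16BE case is the closest match to the "array of 2-byte integers" picture from the question: the char values are only stored as two bytes each when an encoding that does so is chosen explicitly.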

Hope that helped.
Soren
--
Fjern de 4 bogstaver i min mailadresse som er indsat for at hindre s...
Remove the 4 letter word meaning "junk mail" in my mail address.

Jul 17 '05 #6
