Ma*********@gmail.com writes:
What is still left unanswered is whether I can take UTF-8 strings (i.e.
they have characters that take up to 4 bytes of space) and sprintf
them into a string without screwing up the bytes of data. So something
like this:
unsigned int myVar= 0xDB0;
convertMyVarToUTF8(myVar);
char buff[512];
sprintf( buff, "Long string with %u", myVar);
Is there a legitimate UTF-8 string in buff at this point?
Are you sure you meant that code? I suspect you meant something like:
#v+
char *utf8char(unsigned long code);
char buf[512];
sprintf(buf, "Long string with %s", utf8char(0xDB0));
#v-
Where "utf8char" converts the given code point to its UTF-8
representation, appends a NUL byte, and returns a pointer to the first
byte of the sequence. If CHAR_BIT==8 and string literals use ASCII codes
for the characters they contain, then in the end buf will contain a
valid UTF-8 encoded string.
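For illustration, here is a minimal sketch of what such a "utf8char"
could look like. It is only one possible implementation under stated
assumptions: the static buffer (which makes it non-reentrant) and the
choice to return an empty string for surrogates and for values above
0x10FFFF are made up for this example and are not part of any standard
API.

```c
#include <assert.h>

/* Hypothetical utf8char(): encodes one code point as UTF-8 into a
   static buffer and returns a pointer to its first byte.  The buffer
   is overwritten by the next call.  Assumes CHAR_BIT == 8. */
char *utf8char(unsigned long code)
{
    static char out[5];                  /* up to 4 bytes plus NUL */
    unsigned char *p = (unsigned char *)out;

    if (code < 0x80) {
        /* 1 byte: 0xxxxxxx */
        *p++ = (unsigned char)code;
    } else if (code < 0x800) {
        /* 2 bytes: 110xxxxx 10xxxxxx */
        *p++ = (unsigned char)(0xC0 | (code >> 6));
        *p++ = (unsigned char)(0x80 | (code & 0x3F));
    } else if (code < 0x10000 && (code < 0xD800 || code > 0xDFFF)) {
        /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx (surrogates excluded) */
        *p++ = (unsigned char)(0xE0 | (code >> 12));
        *p++ = (unsigned char)(0x80 | ((code >> 6) & 0x3F));
        *p++ = (unsigned char)(0x80 | (code & 0x3F));
    } else if (code >= 0x10000 && code <= 0x10FFFF) {
        /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        *p++ = (unsigned char)(0xF0 | (code >> 18));
        *p++ = (unsigned char)(0x80 | ((code >> 12) & 0x3F));
        *p++ = (unsigned char)(0x80 | ((code >> 6) & 0x3F));
        *p++ = (unsigned char)(0x80 | (code & 0x3F));
    }
    /* Surrogates and values above 0x10FFFF write nothing, so the
       result is an empty string in that case. */

    assert(p - (unsigned char *)out <= 4);  /* sanity check */
    *p = '\0';
    return out;
}
```

With this sketch, utf8char(0xDB0) yields the three bytes E0 B6 B0
followed by NUL, and those are the bytes sprintf copies into buf after
"Long string with ".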
Basically, if your implementation doesn't do anything funky with string
literals, you can use UTF-8 encoded strings almost like any other
strings. The thing you'll have to remember is that some characters take
up more than one byte, so e.g. strlen() returns the number of bytes
rather than the number of characters, and foo[10] won't necessarily get
you the 11th character.
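To make the strlen() and indexing caveat concrete, here is a rough
sketch of two helpers that work in characters (code points) instead of
bytes. The names "utf8_length" and "utf8_at" are hypothetical, and both
assume the input is already valid UTF-8; they simply skip continuation
bytes, which always have the form 10xxxxxx.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of code points in a valid UTF-8 string.
   Counts every byte that is not a continuation byte (10xxxxxx). */
size_t utf8_length(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s)
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    return n;
}

/* Hypothetical helper: pointer to the first byte of the code point
   with the given 0-based index, or NULL if the string is shorter. */
const char *utf8_at(const char *s, size_t index)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            if (index == 0)
                return s;
            --index;
        }
    }
    return NULL;
}
```

For the string "a\xE0\xB6\xB0" "b" (the letter a, U+0DB0, the letter b)
strlen() gives 5 while utf8_length() gives 3, and utf8_at() with index 2
points at the 'b' that sits at byte offset 4. Both helpers walk the
string from the start, so unlike foo[10] they take linear time.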
--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86*tlen.pl>--<jid:mina86*jabber.org>--ooO--(_)--Ooo--