Anyone know if the standard sprintf supports utf8 characters that
extend beyond the normal ascii characters?
Thanks!
Ma*********@gmail.com writes:
Anyone know if the standard sprintf supports utf8 characters that
extend beyond the normal ascii characters?
That depends on what you mean by "support". If you do something like:
sprintf(buf, "%s", some_string);
(but never do that unless you are sure buf has enough space) or
something along the lines of:
sprintf(buf, format, arg1, arg2 /* ... */);
(of course, be sure format is valid and buf has enough space) and all
the strings are UTF-8 encoded, you'll get a UTF-8 encoded string in the
end. This is guaranteed because UTF-8 is designed in such a way that NUL
bytes (and every other ASCII byte, '%' included) never occur inside the
sequences encoding other characters, so sprintf copies them unchanged.
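As a rough illustration of the point above, here is a minimal sketch. The helper name format_label and the "name: " prefix are made up for this example, and it uses snprintf rather than sprintf so the buffer size is always respected. The UTF-8 bytes are written as hex escapes so the example does not depend on the source file's encoding.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes "name: <s>" into buf, never more than
   size bytes. Returns 1 if the whole result fit, 0 if it was truncated
   or an output error occurred. */
int format_label(char *buf, size_t size, const char *s)
{
    assert(buf != NULL && s != NULL);
    int n = snprintf(buf, size, "name: %s", s);
    return n >= 0 && (size_t)n < size;
}
```

Called as format_label(buf, sizeof buf, "caf\xC3\xA9"), the two bytes C3 A9 that encode the accented e end up in buf exactly as they were passed in. One caveat: if the output is truncated, the cut can fall in the middle of a multi-byte sequence, so a truncated result is not necessarily valid UTF-8.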
--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86*tlen.pl>--<jid:mina86*jabber.org>--ooO--(_)--Ooo--
On Apr 14, 2:39 pm, Michal Nazarewicz <min...@tlen.pl> wrote:
[...]
Thanks for the reply.
What is still left unanswered is whether I can take UTF-8 strings (i.e.
they have characters that take up to 4 bytes of space) and sprintf
them into a string without screwing up the bytes of data. So something
like this:
unsigned int myVar = 0xDB0;
convertMyVarToUTF8(myVar);
char buff[512];
sprintf(buff, "Long string with %u", myVar);
Is there a legitimate UTF-8 string in buff at this point?
Thanks!
Mandragon
On Apr 15, 23:02, Mandrago...@gmail.com wrote:
On Apr 14, 2:39 pm, Michal Nazarewicz <min...@tlen.pl> wrote:
[...]
What is still left unanswered is whether I can take UTF-8 strings (i.e.
they have characters that take up to 4 bytes of space) and sprintf
them into a string without screwing up the bytes of data. So something
like this:
unsigned int myVar = 0xDB0;
convertMyVarToUTF8(myVar);
char buff[512];
sprintf(buff, "Long string with %u", myVar);
Is there a legitimate UTF-8 string in buff at this point?
If the native encoding of narrow character strings is ASCII, or
an encoding which uses ASCII for its lower 128 code points, yes.
"%u" only generates the decimal digits [0-9] (it is "%x" that
produces [0-9a-f]), and all of those characters have the same
encoding in ASCII and in UTF-8.
However, I suspect that the function convertMyVarToUTF8 is
supposed to do something. But I don't see what, and I don't see
what it could do which would affect the results here.
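To check this concretely, a small sketch along these lines could be used. The helper names is_all_ascii and format_number are invented for the example, and it assumes an ASCII-compatible execution character set, as discussed above.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if every byte of s is below 0x80, i.e. plain ASCII,
   which is also valid UTF-8. */
int is_all_ascii(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if ((unsigned char)*s >= 0x80)
            return 0;
    }
    return 1;
}

/* Formats the number the same way as the code in the question, but with
   snprintf so the buffer size is respected. Returns what snprintf returns. */
int format_number(char *buf, size_t size, unsigned int value)
{
    assert(buf != NULL);
    return snprintf(buf, size, "Long string with %u", value);
}
```

With value 0xDB0 the buffer holds "Long string with 3504": the decimal value of the number, not the character U+0DB0. The result is pure ASCII, so it is valid UTF-8, but probably not what was intended.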
--
James Kanze (GABI Software) email:ja******* **@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Ma*********@gmail.com writes:
What is still left unanswered is whether I can take UTF-8 strings (i.e.
they have characters that take up to 4 bytes of space) and sprintf
them into a string without screwing up the bytes of data. So something
like this:
unsigned int myVar = 0xDB0;
convertMyVarToUTF8(myVar);
char buff[512];
sprintf(buff, "Long string with %u", myVar);
Is there a legitimate UTF-8 string in buff at this point?
Are you sure you meant that code? I suspect you meant something like:
#v+
char *utf8char(unsigned long code);
char buf[512];
sprintf(buf, "Long string with %s", utf8char(0xDB0));
#v-
where "utf8char" converts the given code to its UTF-8 representation,
follows it with a NUL byte, and returns a pointer to the first byte of
the sequence. If CHAR_BIT==8 and string literals use ASCII codes for all
alphanumeric characters, then in the end buf will contain a valid UTF-8
encoded string.
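The post only declares utf8char, so here is one possible sketch of it, not a library function. The static buffer, the empty-string result for invalid input, and the wrapper format_with_char are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Sketch of the utf8char() helper described above. The result lives in
   a static buffer, so every call overwrites the previous one and the
   function is not thread-safe. Surrogates and values above 0x10FFFF
   produce an empty string; that is a choice made for this sketch. */
char *utf8char(unsigned long code)
{
    static unsigned char out[5];
    size_t n = 0;

    if (code < 0x80) {
        out[n++] = (unsigned char)code;
    } else if (code < 0x800) {
        out[n++] = (unsigned char)(0xC0 | (code >> 6));
        out[n++] = (unsigned char)(0x80 | (code & 0x3F));
    } else if (code < 0x10000) {
        if (code < 0xD800 || code > 0xDFFF) {   /* not a UTF-16 surrogate */
            out[n++] = (unsigned char)(0xE0 | (code >> 12));
            out[n++] = (unsigned char)(0x80 | ((code >> 6) & 0x3F));
            out[n++] = (unsigned char)(0x80 | (code & 0x3F));
        }
    } else if (code <= 0x10FFFF) {
        out[n++] = (unsigned char)(0xF0 | (code >> 18));
        out[n++] = (unsigned char)(0x80 | ((code >> 12) & 0x3F));
        out[n++] = (unsigned char)(0x80 | ((code >> 6) & 0x3F));
        out[n++] = (unsigned char)(0x80 | (code & 0x3F));
    }
    out[n] = '\0';
    return (char *)out;
}

/* Same call as in the post, but with snprintf so buf cannot overflow. */
int format_with_char(char *buf, size_t size, unsigned long code)
{
    assert(buf != NULL);
    return snprintf(buf, size, "Long string with %s", utf8char(code));
}
```

For 0xDB0 the helper produces the three bytes E0 B6 B0, so the formatted string is 17 ASCII bytes followed by those three. Because of the static buffer, two utf8char calls must not be used as arguments to the same sprintf call.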
Basically, if your implementation doesn't do anything funky with string
literals you can use UTF-8 encoded strings almost like any other
strings. The thing you'll have to remember is that some characters take
up more than one byte, so e.g. strlen() returns the number of bytes
rather than the number of characters, and foo[10] won't necessarily get
you the 11th character.
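To make the strlen() caveat concrete, here is a small sketch that counts code points instead of bytes. The name utf8_length is made up for this example, and it assumes the input is already valid UTF-8. It relies on the fact that continuation bytes always have the bit pattern 10xxxxxx.

```c
#include <assert.h>
#include <stddef.h>

/* Counts code points in a NUL-terminated string assumed to be valid
   UTF-8, by counting every byte that is not a continuation byte. */
size_t utf8_length(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For "caf\xC3\xA9", strlen() reports 5 while utf8_length() reports 4. Note that this counts code points, which is still not the same as what a user sees as characters when combining marks are involved.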
--
Best regards, _ _
.o. | Liege of Serenly Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +--<mina86*tlen.pl>--<jid:mina86*jabber.org>--ooO--(_)--Ooo--