basic source character set

Hi

Please let me know if I have this clear. The basic source character
set is the list of (96) characters that all implementations must have
in their vocabulary. All other characters recognized by an
implementation are implementation defined, and will not necessarily be
the same across implementations. The key issue as far as developers
are concerned is that if they want their code to be perfectly
portable, then they must restrict their source files to using only
characters from the basic source character set, or use universal
character names to insert characters outside of the basic source
character set.

For example, the following code is not strictly portable:

char *str = "$";

since the "$" character is not a member of the basic source character
set. To make it portable, you would need to do the following

char *str = "\u0024";

regards, B.

Aug 25 '07 #1
bo*******@gmail.com said:
> Hi
>
> Please let me know if I have this clear. The basic source character
> set is the list of (96) characters that all implementations must have
> in their vocabulary. All other characters recognized by an
> implementation are implementation defined, and will not necessarily be
> the same across implementations. The key issue as far as developers
> are concerned is that if they want their code to be perfectly
> portable, then they must restrict their source files to using only
> characters from the basic source character set, or use universal
> character names to insert characters outside of the basic source
> character set.

Yes, that's basically it. In practice, I think you'll be okay with all
the printable characters that are in the common subset of ASCII and
EBCDIC, although I await correction on the matter from those who have
used conforming C implementations that employ more esoteric source
character sets. Unfortunately, however, AFAICT this only extends the
basic character set by two: $ and @

> For example, the following code is not strictly portable:
>
> char *str = "$";
>
> since the "$" character is not a member of the basic source character
> set.

Strictly speaking, you are correct, yes. Of course, you can /read/ a '$'
character from an open stream at runtime without any trouble at all, if
one happens to be present and is representable as an unsigned char.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 25 '07 #2
bo*******@gmail.com wrote:
>
> Please let me know if I have this clear. The basic source
> character set is the list of (96) characters that all
> implementations must have in their vocabulary. All other
> characters recognized by an implementation are implementation
> defined, and will not necessarily be the same across
> implementations. The key issue as far as developers are
> concerned is that if they want their code to be perfectly
> portable, then they must restrict their source files to using
> only characters from the basic source character set, or use
> universal character names to insert characters outside of the
> basic source character set.

Not quite. Including space, there are 92 printing chars in the
basic set (not 96). Chars such as $ are language dependent, and
may therefore be different on other machines. Other missing chars
are '@', '`' and the rubout (hex 7f in ASCII). The following is an
extract from N869:

[#3] Both the basic source and basic execution character
sets shall have at least the following members: the 26
uppercase letters of the Latin alphabet

A B C D E F G H I J K L M
N O P Q R S T U V W X Y Z

the 26 lowercase letters of the Latin alphabet

a b c d e f g h i j k l m
n o p q r s t u v w x y z

the 10 decimal digits

0 1 2 3 4 5 6 7 8 9

the following 29 graphic characters

! " # % & ' ( ) * + , - . / :
; < = > ? [ \ ] ^ _ { | } ~

the space character, and control characters representing
horizontal tab, vertical tab, and form feed. The

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Aug 25 '07 #3
On Sat, 25 Aug 2007 07:07:56 -0400, CBFalconer wrote:
> bo*******@gmail.com wrote:
>>
>> Please let me know if I have this clear. The basic source
>> character set is the list of (96) characters that all
>> implementations must have in their vocabulary. All other
>> characters recognized by an implementation are implementation
>> defined, and will not necessarily be the same across
>> implementations. The key issue as far as developers are
>> concerned is that if they want their code to be perfectly
>> portable, then they must restrict their source files to using
>> only characters from the basic source character set, or use
>> universal character names to insert characters outside of the
>> basic source character set.
>
> Not quite. Including space, there are 92 printing chars in the
> basic set (not 96).

He did not specify "printing characters", so he's only off by one.
[...]
> the space character, and control characters representing
> horizontal tab, vertical tab, and form feed. The

--
Army1987 (Replace "NOSPAM" with "email")
No-one ever won a game by resigning. -- S. Tartakower

Aug 26 '07 #4
On Aug 25, 5:39 pm, boroph...@gmail.com wrote:
> ...if [developers] want their code to be perfectly
> portable, then they must restrict their source files to
> using only characters from the basic source character set,

Yes.

> or use universal character names to insert characters
> outside of the basic source character set.

If you have a supporting compiler.

> For example, the following code is not strictly portable:
>
> char *str = "$";
>
> since the "$" character is not a member of the basic source
> character set.

Correct.

> To make it portable, you would need to do the following
>
> char *str = "\u0024";

That's fine for the source, but it won't actually help you
when the program executes. There is still no guarantee that
the dollar sign is a member of the execution character set,
even though you can now 'name' it.

You'll get a dollar sign on the systems that have them, but
you'll get an implementation defined character on the systems
that don't.

Given that programs that _need_ $ and @ invariably need 'A'
to be 65 as well, you might as well go ahead and use them in
the source.

[Aside: One of the pre-standard drafts of C99 actually
precluded the naming of $ and @ with universal character
escapes. Fortunately, someone alerted the Committee of
their apparent use in some circles. :-]

--
Peter

Aug 27 '07 #5
Peter Nilsson said:

<snip>
> Given that programs that _need_ $ and @ invariably need 'A'
> to be 65 as well, you might as well go ahead and use them in
> the source.

But this is not true. I've worked on a number of programs that needed a
'$' but which were quite happy for 'A' to have a non-65 code point (and
it's just as well, since they often had to run on systems where 'A' was
in fact not 65).

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 27 '07 #6
Peter Nilsson <ai***@acay.com.au> wrote:
> Given that programs that _need_ $ and @ invariably need 'A'
> to be 65 as well, you might as well go ahead and use them in
> the source.

A large amount of accounting software written to run on IBM systems
would be surprised to hear that (though I don't know whether any of that
software was written in C).

Richard
Aug 27 '07 #7
