code portability

My question is more generic, but it involves what I consider ANSI standard C
and portability.

I happen to be a system admin for multiple platforms and as such a lot of
the applications that my users request are a part of the OpenSource
community. Many if not most of those applications strongly require the
presence of the GNU compiling suite to work properly. My assumption is that
this is due to the author/s creating the applications with the GNU suite.
Many of the tools requested/required are GNU replacements for make,
configure, the loader, and lastly the C compiler itself. Where I'm going
with this is, has the OpenSource community as a whole committed itself to at
the very least encouraging its contributing members to conform to ANSI
standards of programming?

My concern is that as an admin I am sometimes compelled to port these
applications to multiple platforms running the same OS. As the user
community becomes more and more insistent on OpenSource applications, will
gotchas appear due to lack of portability in coding? I fully realize that
independent developers may or may not conform to standards, but again is it
at least encouraged?

11.32 of the FAQ seemed to at least outline the crux of what I am asking.
If I loaded up my home machine to the gills with all the open source compiler
applications (gcc, imake, autoconf, etc.), would the applications that I
compile, link, and load conform?
Aug 1 '06
Ben Pfaff <bl*@cs.stanford.edu> writes:
Keith Thompson <ks***@mib.org> writes:
>I might consider adding a check at program startup, something
like
if ('A' != 65) {
/* yes, it's an incomplete check */
fprintf(stderr, "This program won't work on a non-ASCII system\n");
exit(EXIT_FAILURE);
}

Is there some reason that this can't be done at compile time:
#if 'A' != 65
#error Needs ASCII character set
#endif
Yes.

(Barely resisting the temptation to leave it at that...)

N1124 6.10.1p3:

This includes interpreting character constants, which may involve
converting escape sequences into execution character set
members. Whether the numeric value for these character constants
matches the value obtained when an identical character constant
occurs in an expression (other than within a #if or #elif
directive) is implementation-defined.

Footnote:

Thus, the constant expression in the following #if directive and
if statement is not guaranteed to evaluate to the same value in
these two contexts.

#if 'z' - 'a' == 25

if ('z' - 'a' == 25)

I did refer to this upthread, just after the portion you quoted:

| or I might not bother; I'd at least document the assumption somewhere
| in the code. (No, you can't reliably test this in the preprocessor;
| see C99 6.10.1p3.)

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 5 '06 #71
Keith Thompson <ks***@mib.org> writes:
Ben Pfaff <bl*@cs.stanford.edu> writes:
>Keith Thompson <ks***@mib.org> writes:
>>I might consider adding a check at program startup, something
like
if ('A' != 65) {
/* yes, it's an incomplete check */
fprintf(stderr, "This program won't work on a non-ASCII system\n");
exit(EXIT_FAILURE);
}

Is there some reason that this can't be done at compile time:
#if 'A' != 65
#error Needs ASCII character set
#endif

Yes.
I need to do a better job of reading. Thank you for your
patience.
--
int main(void){char p[]="ABCDEFGHIJKLM NOPQRSTUVWXYZab cdefghijklmnopq rstuvwxyz.\
\n",*q="kl BIcNBFr.NKEzjwC IxNJC";int i=sizeof p/2;char *strchr();int putchar(\
);while(*q){i+= strchr(p,*q++)-p;if(i>=(int)si zeof p)i-=sizeof p-1;putchar(p[i]\
);}return 0;}
Aug 5 '06 #72
Keith Thompson wrote:
we******@gmail.com writes:
Keith Thompson wrote:
"Malcolm" <re*******@btinternet.com> writes:
<we******@gmail.com> wrote:
Eigenvector wrote:
[...] I fully realize that
independent developers may or may not conform to standards, but again is
it at least encouraged?

Not really. By its very nature C encourages non-portable programming.
In general, I try to write code portably, but the only thing keeping me
honest is actually compiling my stuff with multiple compilers to see
what happens.

Yes. There is a tension between efficiency and portability. In Java they
resolved it by compromising efficiency, in C we have to be careful to make
our portable code genuinely portable, which is why the topic is so often
discussed.
There is also the problem of "good enough" portability, for instance
assuming ASCII and two's complement integers.

I rarely find it useful to assume ASCII.
Who cares what *YOU* find useful or not.

Gosh, I don't know. Do you care? Because, as you know, your opinion
matters a great deal to me. It's probably because of your charming
manner.
It's just so typical of you to answer generic questions with what
happens to suit you. As if you represent the only kind of C programmer
that there is, or should be.
I would like to auto-initialize an array that maps from unsigned char
-> parse-state, which makes, say, letters to one value, numbers to
another, etc. The reason I want to auto-initialize this rather than
calling some init() routine that sets them up is because I want to
support correct multithreading, and my inner loops that use such an
array are going so fast, that auto-first-time checking actually is
unacceptable overhead.

If I can't assume ASCII, then this solution has simply been taken away
from me. Compare this with the Lua language, which allows unordered
specific index auto-initialization.

I can think of several ways to do this. You can use some automated
process to generate the C code for you during the build process,
perhaps with a build-time option to select ASCII or some other
character set.
The subject for this thread is "code portability". So of course, I
assume you have a way of doing this portably.
[...] Or you can explicitly invoke an initialization routine
exactly once as your program is starting up and save the expense of
checking on each use.
Ok, read carefully, I just told you I can't do that. If I am willing
to sacrifice performance (remember we're setting up a constant
addressed look-up table so we're expecting a throughput of 1/3 of a
single clock (or even 1/4 of a clock on these new Intel Core CPUs) for
this operation) why would I bother doing this through a look up table
in the first place?
[...] Or (and this may or may not be available to
you), you can use C99,
Again, the subject of this thread is "code portability". Using C99 is
diametrically opposite to this goal.
[...] This works with gcc 3.4.5 and 4.1.1 with "-std=c99".
The irony of this statement is just unbelievable ... . Two versions of
gcc counts as portability?
Or you can just (drum roll please) assume ASCII. If you'll look very
closely at what I wrote above:

| I rarely find it useful to assume ASCII.

you'll see the word "rarely", not "never".
That's nice, but you've removed the context. This is not a response to
the generic question posed. This is just a statement about *your*
predilections. The fact is that *I* rarely find it useful as well,
because I don't write a lot of code that does parsing. But that is
completely irrelevant, which is why, of course, I refrained from making
such ridiculous non sequitur statements. *rarely* is not the only word
you wrote there, you also wrote the word *I*.
[...] If assuming ASCII, and
therefore making your code non-portable to non-ASCII platforms, makes
it significantly faster, that's great. I might consider adding a
check at program startup, something like
if ('A' != 65) {
/* yes, it's an incomplete check */
fprintf(stderr, "This program won't work on a non-ASCII system\n");
exit(EXIT_FAILURE);
}
or I might not bother; I'd at least document the assumption somewhere
in the code. (No, you can't reliably test this in the preprocessor;
see C99 6.10.1p3.)

The fact that you've managed to cite a single application where
assuming ASCII happens to be useful does not refute anything I've
said.
This is a *single* application? I am talking about a technique, not an
application. The fact is, this comes up for a wide variety of string
parsing scenarios, where speed (or in fact *simplicity*) might be a
concern. We're talking about ASCII here -- where else would such a
concern apply?
Write portable code if you can. If you need to write non-portable
code, keep it as isolated as you can (but you may *sometimes* find
that a portable implementation would have worked just as well in the
first place).
Now why couldn't you have posted this more reasoned position instead of
the drivel that you did in the first place?
It's usually just as easy to
write code that depends only on the guarantees in the standard, and
will just work regardless of the character set. It would be
marginally more convenient to be able to assume that the character
codes for the letters are contiguous, but that's easy enough to work
around.
Yeah, well obviously you don't work in environments where performance
and portability matters.

Obviously you have no clue about the environments in which I work.
Ok, well then maybe you are just bad at your job, or maybe you have
long term memory problems like the guy from the movie Memento.
As for two's complement, I typically don't care about that either.
Numbers are numbers. If I need to do bit-twiddling, I use unsigned.
And if you need a correctly functioning ring modulo 2**n? If you can
assume 2s complement then you've *got one*. Otherwise, you get to
construct one somehow (not sure how hard this is, I have never ever
been exposed to a system that didn't *ONLY* support 2s complement).

It's been a while since my last abstract algebra class, but isn't a
"ring modulo 2**n" simply the set of integers from 0 to 2**n-1?
No, that would be a list or a set.

Your bizarre relationship with the definition of technical words is a
real curiosity. How can you pretend to be a computer programmer, and
be so far removed from standard nomenclature? It would be ok if you
just mixed up a few words or something I wouldn't make a big deal about
it. But you appear to not know the concepts on the other side of these
words.
[...] And isn't that precisely what C's *unsigned* integer types are?
First of all no, and second of all if it was, then it wouldn't be a
ring.

A Ring is a set with a 0, a + operator and a * operator. And the point
is that it's completely *closed* under these operations. In typical 2s
complement implementations, I know that integers (signed or not) are
rings. In 1s complement machines -- I have no idea; I don't have
access to such a machine (I never have in the past, and I almost
certainly never will in the future), and just don't have familiarity
with 1s complement. It doesn't have the natural wrapping properties
that 2s complement has, so my intuition is that it's *not* a ring, but I
just don't know.

The reason why this is important is for verification purposes. Suppose
I write the following:

x = (y << 7) - (y << 2);

Well, that should be the same as x = y * 124. How do I know this?
Because I know that y << 7 is the same as y * 128, and y << 2 is the
same as y * 4. After that, there is a concern that one of the operands of
the subtract might wrap around, while the other one doesn't. Or both
might. Because of that, direct verification of this fact might lead
you to believe that you need to look at these as separate cases and
very carefully examine the bits to make sure that the results are still
correct. But we don't have to. If we *know* that the expression is
equivalent to y*128 - y*4, then because 2s complement integers form an
actual ring, we are allowed to rely on ordinary algebra without
concern. Wrap around doesn't matter -- it's always correct.
Verification of just straight *algebra* is unnecessary, we can just
rely on mathematics.
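
The identity above can be sketched in C on unsigned operands, where the standard defines both the shifts and the subtraction modulo 2**N, so the two forms agree for every y, including values that wrap. The function names are hypothetical. For a signed y, C99 leaves a left shift undefined when the result is not representable, which is why this sketch uses unsigned.

```c
/* Shift-and-subtract form from the post, on unsigned operands. */
static unsigned by_shifts(unsigned y)
{
    return (y << 7) - (y << 2);   /* y*128 - y*4, reduced modulo 2**N */
}

/* Plain multiplication form; also reduced modulo 2**N. */
static unsigned by_multiply(unsigned y)
{
    return y * 124u;
}
```

Comparing `by_shifts(y)` with `by_multiply(y)` for values near the top of the range is a quick way to see that wrap-around in either intermediate term does not change the result.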

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Aug 5 '06 #73
In article <11*********************@i42g2000cwa.googlegroups.com>
<we******@gmail.com> wrote (replying to someone else):
>A Ring is a set with a 0, a + operator and a * operator. And the point
is that it's completely *closed* under these operations.
This is ... hardly a thorough definition. You need to add
commutativity (for +) and distribution (of * over +), in particular.
>In typical 2s complement implementations, I know that integers
(signed or not) are rings. In 1s complement machines -- I have
no idea ...
And that is where you have missed Keith Thompson's point -- because
even on ones' complement machines, *unsigned* integers (in C) are
still rings. So use "unsigned"; they give you the very property
you want. They *guarantee* it.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Aug 5 '06 #74
Chris Torek wrote:
In article <11*********************@i42g2000cwa.googlegroups.com>
<we******@gmail.com> wrote (replying to someone else):
A Ring is a set with a 0, a + operator and a * operator. And the point
is that it's completely *closed* under these operations.

This is ... hardly a thorough definition.
I didn't claim it was. This isn't a classroom; thoroughness is not the
same as correctness.
[...] You need to add commutativity (for +) and distribution (of * over +), in
particular.
In typical 2s complement implementations, I know that integers
(signed or not) are rings. In 1s complement machines -- I have
no idea ...

And that is where you have missed Keith Thompson's point -- because
even on ones' complement machines, *unsigned* integers (in C) are
still rings. So use "unsigned"; they give you the very property
you want. They *guarantee* it.
And now you are starting to make Keith-style mistakes. What if I need
to do algebra on signed integers? I need the "ring properties" for
proofs of correctness -- this is not a useful end in and of itself. If I
cannot apply these properties to signed integers, then I cannot do
algebra on signed integers without great difficulty.

Compare this to the situation in 2s complement. Suppose it's
*difficult* to prove something on signed integers, but easy to prove it
for unsigned. But if it turns out you can "lift" from signed to
unsigned through casting and your theorem still makes sense, then you
likely can just apply the proof through this mechanism.
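
One way to picture the "lifting" described above: do the arithmetic in unsigned, where wrap-around is defined, then convert back to int. This is only a sketch, and `wrapping_add` is a hypothetical name. The conversion back to int is implementation-defined whenever the value does not fit (C99 6.3.1.3), so the technique is exactly as portable as that conversion.

```c
/* Adds two ints through unsigned arithmetic, so the addition itself
   cannot overflow. Converting an out-of-range result back to int is
   implementation-defined; typical two's complement systems wrap. */
static int wrapping_add(int a, int b)
{
    unsigned sum = (unsigned)a + (unsigned)b;  /* defined modulo UINT_MAX + 1 */
    return (int)sum;
}
```

When the true sum fits in an int, the result is fully defined and equals `a + b`; only the overflowing cases lean on the implementation.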

What Keith said is tantamount to saying "don't use negative numbers, if
you plan on doing sound arithmetic". This is kind of useless.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Aug 5 '06 #75
we******@gmail.com writes:
Keith Thompson wrote:
>we******@gmail.com writes:
Keith Thompson wrote:
"Malcolm" <re*******@btinternet.com> writes:
<we******@gmail.com> wrote:
Eigenvector wrote:
[...] I fully realize that independent developers may or may
not conform to standards, but again is it at least
encouraged?

Not really. By its very nature C encourages non-portable
programming. In general, I try to write code portably, but
the only thing keeping me honest is actually compiling my
stuff with multiple compilers to see what happens.

Yes. There is a tension between efficiency and portability. In
Java they resolved it by compromising efficiency, in C we have
to be careful to make our portable code genuinely portable,
which is why the topic is so often discussed. There is also
the problem of "good enough" portability, for instance
assuming ASCII and two's complement integers.

I rarely find it useful to assume ASCII.

Who cares what *YOU* find useful or not.

Gosh, I don't know. Do you care? Because, as you know, your opinion
matters a great deal to me. It's probably because of your charming
manner.

It's just so typical of you to answer generic questions with what
happens to suit you. As if you represent the only kind of C programmer
that there is, or should be.
I never said or implied that.
I would like to auto-initialize an array that maps from unsigned char
-> parse-state, which makes, say, letters to one value, numbers to
another, etc. The reason I want to auto-initialize this rather than
calling some init() routine that sets them up is because I want to
support correct multithreading, and my inner loops that use such an
array are going so fast, that auto-first-time checking actually is
unacceptable overhead.

If I can't assume ASCII, then this solution has simply been taken away
from me. Compare this with the Lua language, which allows unordered
specific index auto-initialization.

I can think of several ways to do this. You can use some automated
process to generate the C code for you during the build process,
perhaps with a build-time option to select ASCII or some other
character set.

The subject for this thread is "code portability". So of course, I
assume you have a way of doing this portably.
No, I don't.
>[...] Or you can explicitly invoke an initialization routine
exactly once as your program is starting up and save the expense of
checking on each use.

Ok, read carefully, I just told you I can't do that. If I am willing
to sacrifice performance (remember we're setting up a constant
addressed look-up table so we're expecting a throughput of 1/3 of a
single clock (or even 1/4 of a clock on these new Intel Core CPUs) for
this operation) why would I bother doing this through a look up table
in the first place?
You said you wanted "an array that maps from unsigned char ->
parse-state, which makes, say, letters to one value, numbers to
another, etc.". I took that to be a description of a lookup table.
If you were referring to something else, I suggest you write more
clearly.
>[...] Or (and this may or may not be available to
you), you can use C99,

Again, the subject of this thread is "code portability". Using C99 is
diametrically opposite to this goal.
>[...] This works with gcc 3.4.5 and 4.1.1 with "-std=c99".

The irony of this statement is just unbelievable ... . Two versions of
gcc counts as portability?
No, of course not. You described a problem; I suggested some
solutions, and I clearly stated that some of them are not portable.
The fact that the subject of this thread happens to be "code
portability" does not mean that I am obligated to discuss only
portable solutions. In fact, I am discussing portable
vs. non-portable code.
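
The details of the C99 suggestion are elided in the quotes above. One C99 feature that fits the description is designated initializers, and the sketch below assumes that is what was meant. All names here (`parse_state`, `parse_table`, `classify`) are hypothetical, and only a few characters are listed to keep it short.

```c
#include <limits.h>

enum parse_state { PS_OTHER = 0, PS_LETTER, PS_DIGIT };

/* C99 designated initializers: the indices are character constants, so
   the table is correct for whatever execution character set the compiler
   targets. Unlisted entries default to PS_OTHER (0). */
static const unsigned char parse_table[UCHAR_MAX + 1] = {
    ['a'] = PS_LETTER, ['b'] = PS_LETTER, ['z'] = PS_LETTER,
    ['A'] = PS_LETTER, ['Z'] = PS_LETTER,
    ['0'] = PS_DIGIT,  ['1'] = PS_DIGIT,  ['9'] = PS_DIGIT
};

/* Constant-indexed lookup with no first-use check. */
static enum parse_state classify(unsigned char c)
{
    return (enum parse_state)parse_table[c];
}
```

The table is `const` and initialized at compile time, so no init routine or first-use check is needed; the cost is requiring a C99 compiler.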
>Or you can just (drum roll please) assume ASCII. If you'll look very
closely at what I wrote above:

| I rarely find it useful to assume ASCII.

you'll see the word "rarely", not "never".

That's nice, but you've removed the context. This is not a response to
the generic question posed. This is just a statement about *your*
predilections. The fact is that *I* rarely find it useful as well,
because I don't write a lot of code that does parsing. But that is
completely irrelevant, which is why, of course, I refrained from making
such ridiculous non sequitur statements. *rarely* is not the only word
you wrote there, you also wrote the word *I*.
Yes, I wrote the word "I" because "I" was talking about my own
experience. If it's not useful to you, that's too bad.
>[...] If assuming ASCII, and
therefore making your code non-portable to non-ASCII platforms, makes
it significantly faster, that's great. I might consider adding a
check at program startup, something like
if ('A' != 65) {
/* yes, it's an incomplete check */
fprintf(stderr, "This program won't work on a non-ASCII system\n");
exit(EXIT_FAILURE);
}
or I might not bother; I'd at least document the assumption somewhere
in the code. (No, you can't reliably test this in the preprocessor;
see C99 6.10.1p3.)

The fact that you've managed to cite a single application where
assuming ASCII happens to be useful does not refute anything I've
said.

This is a *single* application? I am talking about a technique, not an
application. The fact is, this comes up for a wide variety of string
parsing scenarios, where speed (or in fact *simplicity*) might be a
concern. We're talking about ASCII here -- where else would such a
concern apply?
>Write portable code if you can. If you need to write non-portable
code, keep it as isolated as you can (but you may *sometimes* find
that a portable implementation would have worked just as well in the
first place).

Now why couldn't you have posted this more reasoned position instead of
the drivel that you did in the first place?
It's what I've been saying all along. Pay attention.
>It's usually just as easy to
write code that depends only on the guarantees in the standard, and
will just work regardless of the character set. It would be
marginally more convenient to be able to assume that the character
codes for the letters are contiguous, but that's easy enough to work
around.

Yeah, well obviously you don't work in environments where performance
and portability matters.

Obviously you have no clue about the environments in which I work.

Ok, well then maybe you are just bad at your job, or maybe you have
long term memory problems like the guy from the movie Memento.
The subject of this thread is "code portability", not "gratuitous
insults".
>As for two's complement, I typically don't care about that either.
Numbers are numbers. If I need to do bit-twiddling, I use unsigned.

And if you need a correctly functioning ring modulo 2**n? If you can
assume 2s complement then you've *got one*. Otherwise, you get to
construct one somehow (not sure how hard this is, I have never ever
been exposed to a system that didn't *ONLY* support 2s complement).

It's been a while since my last abstract algebra class, but isn't a
"ring modulo 2**n" simply the set of integers from 0 to 2**n-1?

No, that would be a list or a set.

Your bizarre relationship with the definition of technical words is a
real curiosity. How can you pretend to be a computer programmer, and
be so far removed from standard nomenclature? It would be ok if you
just mixed up a few words or something I wouldn't make a big deal about
it. But you appear to not know the concepts on the other side of these
words.
>[...] And isn't that precisely what C's *unsigned* integer types are?

First of all no, and second of all if it was, then it wouldn't be a
ring.

A Ring is a set with a 0, a + operator and a * operator. And the point
is that it's completely *closed* under these operations. In typical 2s
complement implementations, I know that integers (signed or not) are
rings. In 1s complement machines -- I have no idea; I don't have
access to such a machine (I never have in the past, and I almost
certainly never will in the future), and just don't have familiarity
with 1s complement. It doesn't have the natural wrapping properties
that 2s complement has, so my intuition is that it's *not* a ring, but I
just don't know.
A ring is not just a set, nor is it just a set with a 0, a + operator,
and a * operator. There are several other properties it has to have.
You flame me for an incomplete definition, then offer another
incomplete definition yourself.

I believe that unsigned int satisfies those properties, but signed int
may or may not; for example, the standard makes no guarantee that any
signed type is closed under addition. It's probably true that signed
integers on most 2's-complement systems (which are almost all existing
systems) also happen to satisfy those properties.
The reason why this is important is for verification purposes. Suppose
I write the following:

x = (y << 7) - (y << 2);

Well, that should be the same as x = y * 124. How do I know this?
Because I know that y << 7 is the same as y * 128, and y << 2 is the
same as y * 4. After that, there is a concern that one of the operands
of the subtract might wrap around, while the other one doesn't. Or
both might. Because of that, direct verification of this fact might
lead you to believe that you need to look at these as separate cases
and very carefully examine the bits to make sure that the results
are still correct. But we don't have to. If we *know* that the
expression is equivalent to y*128 - y*4, then because 2s complement
integers form an actual ring, we are allowed to rely on ordinary
algebra without concern. Wrap around doesn't matter -- it's always
correct. Verification of just straight *algebra* is unnecessary, we
can just rely on mathematics.
If you *know* that 2's-complement integers form a ring, then you are
depending on properties not guaranteed by the C standard. (You are,
of course, free to do so.)

Incidentally, you might find that it's possible to have a technical
discussion without being a hypocritical jerk. Try it.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Aug 5 '06 #76
we******@gmail.com wrote:
Keith Thompson wrote:
>we******@gmail.com writes:
>>Keith Thompson wrote:
"Malcolm" <re*******@btinternet.com> writes:
<we******@gmail.com> wrote:
>Eigenvector wrote:
>>[...] I fully realize that
>>independent developers may or may not conform to standards, but again is
>>it at least encouraged?
>Not really. By its very nature C encourages non-portable programming.
>In general, I try to write code portably, but the only thing keeping me
>honest is actually compiling my stuff with multiple compilers to see
>what happens.
Yes. There is a tension between efficiency and portability. In Java they
resolved it by compromising efficiency, in C we have to be careful to make
our portable code genuinely portable, which is why the topic is so often
discussed.
There is also the problem of "good enough" portability, for instance
assuming ASCII and two's complement integers.
I rarely find it useful to assume ASCII.
Who cares what *YOU* find useful or not.
Gosh, I don't know. Do you care? Because, as you know, your opinion
matters a great deal to me. It's probably because of your charming
manner.

It's just so typical of you to answer generic questions with what
happens to suit you. As if you represent the only kind of C programmer
that there is, or should be.
Unless someone knows *every* domain, which no one does, then all they
can do is talk about the areas they do. Therefore *any* response to a
generic question will be based on what the person answering it comes
across.

Keith rarely finds it useful to assume ASCII, it appears you regularly
find it useful to assume ASCII. Neither shows what the situation is
across all domains.
>>I would like to auto-initialize an array that maps from unsigned char
-> parse-state, which makes, say, letters to one value, numbers to
another, etc. The reason I want to auto-initialize this rather than
calling some init() routine that sets them up is because I want to
support correct multithreading, and my inner loops that use such an
array are going so fast, that auto-first-time checking actually is
unacceptable overhead.

If I can't assume ASCII, then this solution has simply been taken away
from me. Compare this with the Lua language, which allows unordered
specific index auto-initialization.
I can think of several ways to do this. You can use some automated
process to generate the C code for you during the build process,
perhaps with a build-time option to select ASCII or some other
character set.

The subject for this thread is "code portability". So of course, I
assume you have a way of doing this portably.
I'm sure Keith can.
>[...] Or you can explicitly invoke an initialization routine
exactly once as your program is starting up and save the expense of
checking on each use.

Ok, read carefully, I just told you I can't do that. If I am willing
to sacrifice performance (remember we're setting up a constant
addressed look-up table so we're expecting a throughput of 1/3 of a
single clock (or even 1/4 of a clock on these new Intel Core CPUs) for
this operation) why would I bother doing this through a look up table
in the first place?
int main(void)
{
    do_init();
    /* Throw off as many off topic threads as you want */
    /* rest of program */
}

Calling do_init has a major impact on the performance of the program?
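
A fuller sketch of this approach, under the assumption that the table in question classifies characters as letters, digits, or other. The names `parse_table`, `PS_LETTER` and so on are hypothetical; `do_init` is the name used in the post. The table is filled with the standard `<ctype.h>` functions, so nothing here depends on ASCII.

```c
#include <ctype.h>
#include <limits.h>

enum { PS_OTHER, PS_LETTER, PS_DIGIT };

static unsigned char parse_table[UCHAR_MAX + 1];

/* Fill the table once, before any threads start. isalpha/isdigit follow
   the current locale ("C" at program startup), not a fixed ASCII layout. */
static void do_init(void)
{
    int c;
    for (c = 0; c <= UCHAR_MAX; c++) {
        if (isalpha(c))
            parse_table[c] = PS_LETTER;
        else if (isdigit(c))
            parse_table[c] = PS_DIGIT;
        else
            parse_table[c] = PS_OTHER;
    }
}
```

After `do_init()` returns, inner loops can index `parse_table` directly with no first-use check. The trade-off is that the table cannot be declared `const`.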
>[...] Or (and this may or may not be available to
you), you can use C99,

Again, the subject of this thread is "code portability". Using C99 is
diametrically opposite to this goal.
Keith noted that C99 might not be available to you. However, if it is
available on all platforms of interest then it might be portable enough.
>[...] This works with gcc 3.4.5 and 4.1.1 with "-std=c99".

The irony of this statement is just unbelievable ... . Two versions of
gcc counts as portability?
If it is valid C99, and I have no reason to believe it isn't, there are
other compilers it will work on.
>Or you can just (drum roll please) assume ASCII. If you'll look very
closely at what I wrote above:

| I rarely find it useful to assume ASCII.

you'll see the word "rarely", not "never".

That's nice, but you've removed the context. This is not a response to
the generic question posed. This is just a statement about *your*
predilections. The fact is that *I* rarely find it useful as well,
because I don't write a lot of code that does parsing. But that is
completely irrelevant, which is why, of course, I refrained from making
such ridiculous non sequitur statements. *rarely* is not the only word
you wrote there, you also wrote the word *I*.
Which means that what Keith wrote is perfectly clear. You (and probably
Keith) do not know whether for the majority of programs it is useful to
assume ASCII or not, all you know is the domains you know about.
>[...] If assuming ASCII, and
therefore making your code non-portable to non-ASCII platforms, makes
it significantly faster, that's great. I might consider adding a
check at program startup, something like
if ('A' != 65) {
/* yes, it's an incomplete check */
fprintf(stderr, "This program won't work on a non-ASCII system\n");
exit(EXIT_FAILURE);
}
or I might not bother; I'd at least document the assumption somewhere
in the code. (No, you can't reliably test this in the preprocessor;
see C99 6.10.1p3.)

The fact that you've managed to cite a single application where
assuming ASCII happens to be useful does not refute anything I've
said.

This is a *single* application? I am talking about a technique, not an
application. The fact is, this comes up for a wide variety of string
parsing scenarios, where speed (or in fact *simplicity*) might be a
concern. We're talking about ASCII here -- where else would such a
concern apply?
So if I come up with a technique for two things covering two wide varieties
of scenarios where assuming ASCII provides no benefit, that will prove
that generally you don't need to assume ASCII?

<snip>
>>>It's usually just as easy to
write code that depends only on the guarantees in the standard, and
will just work regardless of the character set. It would be
marginally more convenient to be able to assume that the character
codes for the letters are contiguous, but that's easy enough to work
around.
Yeah, well obviously you don't work in environments where performance
and portability matters.
Obviously you have no clue about the environments in which I work.

Ok, well then maybe you are just bad at your job, or maybe you have
long term memory problems like the guy from the movie Memento.
Or maybe Keith is good at his job and does things where it is rarely
useful to assume ASCII?
>>>As for two's complement, I typically don't care about that either.
Numbers are numbers. If I need to do bit-twiddling, I use unsigned.
And if you need a correctly functioning ring modulo 2**n? If you can
assume 2s complement then you've *got one*. Otherwise, you get to
construct one somehow (not sure how hard this is, I have never ever
been exposed to a system that didn't *ONLY* support 2s complement).
It's been a while since my last abstract algebra class, but isn't a
"ring modulo 2**n" simply the set of integers from 0 to 2**n-1?

No, that would be a list or a set.

Your bizarre relationship with the definition of technical words is a
real curiosity. How can you pretend to be a computer programmer, and
be so far removed from standard nomenclature? It would be ok if you
just mixed up a few words or something I wouldn't make a big deal about
it. But you appear to not know the concepts on the other side of these
words.
There are large fields of computing where algebra is not required.
Certainly large fields where rings are not required.
>[...] And isn't that precisely what C's *unsigned* integer types are?

First of all no, and second of all if it was, then it wouldn't be a
ring.

A Ring is a set with a 0, a + operator and a * operator. And the point
is that it's completely *closed* under these operations.
Which unsigned integer types are.
In typical 2s
complement implementations, I know that integers (signed or not) are
rings.
You obviously know very little about how unsigned integers are defined in
C. They are the same *whatever* representation is used for signed integers.
In 1s complement machines -- I have no idea;
Had you bothered to look you would know that the signed integer
representation does not affect the unsigned integer representation.
Keith *explicitly* stated *unsigned*.
I don't have
access to such a machine (I never have in the past, and I almost
certainly never will in the future), and just don't have familiarity
with 1s complement. It doesn't have the natural wrapping properties
that 2s complement has, so my intuition is that it's *not* a ring, but I
just don't know.
Signed integers are not defined as being a ring *whatever*
representation is used. I've used processors that use 2s complement
where they will limit on overflow of addition/subtraction instead of
wrapping. There are times in signal processing where this is a *very*
useful property.
The reason why this is important is for verification purposes. Suppose
I write the following:

x = (y << 7) - (y << 2);

Well, that should be the same as x = y * 124. How do I know this?
Because I know that y << 7 is the same as y * 128, and y << 2 is the
same as y * 4. After that, there is a concern that one of the operands of
If you understood unsigned integers in C you would understand that it
applies whatever the signed representation is. I would still use
multiplication rather than a shift/subtract when I want multiplication
and let the compiler sort out the optimisation. After all, that is what
the optimisation phase is for. In any case, on some processors it would
be *faster* to multiply because they have single cycle hardware multipliers.
the subtract might wrap around, while the other one doesn't. Or both
might. Because of that, direct verification of this fact might lead
you to believe that you need to look at these as separate cases and
very carefully examine the bits to make sure that the results are still
correct. But we don't have to. If we *know* that the expression is
equivalent to y*128 - y*4, then because 2s complement integers form an
actual ring, then we are allowed to rely on ordinary algebra without
concern. Wrap around doesn't matter -- it's always correct.
Verification of just straight *algebra* is unnecessary, we can just
rely on mathematics.
As Keith said, you get these guarantees on unsigned integers. So if you
need a ring, use unsigned integers, since the C standard guarantees that
their arithmetic is reduced modulo 2**n.
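
A small sketch of that point, assuming plain `unsigned` operands: the shift-and-subtract expression from the quoted text and the plain multiplication agree even when the intermediate values wrap, because unsigned results are reduced modulo UINT_MAX + 1. The function names `mul124_shift`, `mul124_plain`, and `mul124_forms_agree` are made up for illustration, and the sample values are arbitrary.

```c
#include <limits.h>
#include <stddef.h>

/* Shift-and-subtract form from the quoted text, restricted to unsigned. */
static unsigned mul124_shift(unsigned y)
{
    return (y << 7) - (y << 2);
}

/* The plain multiplication it is expected to agree with. */
static unsigned mul124_plain(unsigned y)
{
    return y * 124u;
}

/* Returns 1 if both forms agree on a few samples, including values
   where the shifts and the product wrap around; returns 0 otherwise. */
static int mul124_forms_agree(void)
{
    static const unsigned samples[] = {
        0u, 1u, 3u, 12345u, UINT_MAX / 2u, UINT_MAX - 1u, UINT_MAX
    };
    size_t i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (mul124_shift(samples[i]) != mul124_plain(samples[i]))
            return 0;
    }
    return 1;
}
```

This is a spot check on a few inputs, not a proof; the algebraic argument is what covers every value. Writing y * 124u and leaving the choice of instructions to the compiler, as suggested above, keeps the intent clearer.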
--
Flash Gordon
Still sigless on this computer.
Aug 5 '06 #77
In article <11**********************@p79g2000cwp.googlegroups.com>,
<we******@gmail.com> wrote:
>Chris Torek wrote:
>In article <11*********************@i42g2000cwa.googlegroups.com>
<we******@gmail.com> wrote (replying to someone else):
>A Ring is a set with a 0, a + operator and a * operator. And the point
is that it's completely *closed* under these operations.
>This is ... hardly a thorough definition.
>I didn't claim it was. This isn't a classroom; thoroughness is not the
same as correctness.
You gave a definition for ring, but there are sets that match your
definition that are NOT rings, because your definition was incomplete
even for common types of rings.
http://mathworld.wolfram.com/Ring.html

It is not clear to me how someone can complain about someone
else's "bizarre relationship to technical terms" and then themselves
misuse a technical term that they themselves have indicated is important
to part of their discussion.
--
Okay, buzzwords only. Two syllables, tops. -- Laurie Anderson
Aug 5 '06 #78
In article <11*********************@h48g2000cwc.googlegroups.com>,
<we******@gmail.com> wrote:
>And if you need a correctly functioning ring modulo 2**n? If you can
assume 2s complement then you've *got one*. Otherwise, you get to
construct one somehow (not sure how hard this is, I have never ever
been exposed to a system that didn't *ONLY* support 2s complement).
Caution: on most 2s complement machines, the *signed* integers do
not form a ring. In cases where INT_MIN is (-INT_MAX - 1)
(e.g., INT_MIN is -32768 for an INT_MAX of 32767) then there
is no "additive inverse" for INT_MIN -- no element in the set
such that INT_MIN plus the element is 0.

This is not an issue for *unsigned* integers: operations on the
unsigned integers are defined such that the additive inverse of
the maximum unsigned integer is always 1 [if I recall correctly.]
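
A short sketch of the unsigned case, assuming plain `unsigned`: subtracting from zero yields the additive inverse, because the result is reduced modulo UINT_MAX + 1. The helper names `additive_inverse` and `inverse_sums_to_zero` are hypothetical. The signed case is left out on purpose, since negating INT_MIN overflows when INT_MIN is (-INT_MAX - 1), and signed overflow is undefined behaviour in C.

```c
#include <limits.h>

/* Additive inverse in unsigned arithmetic: the value that sums with x
   to 0 once the result is reduced modulo UINT_MAX + 1. */
static unsigned additive_inverse(unsigned x)
{
    return 0u - x;
}

/* Returns 1 if x plus its inverse wraps around to 0, 0 otherwise. */
static int inverse_sums_to_zero(unsigned x)
{
    return x + additive_inverse(x) == 0u;
}
```

With these definitions, additive_inverse(UINT_MAX) is 1u, matching the recollection above, and every unsigned value has an inverse.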
--
There are some ideas so wrong that only a very intelligent person
could believe in them. -- George Orwell
Aug 5 '06 #79
Walter Roberson wrote:
In article <11*********************@h48g2000cwc.googlegroups.com>,
<we******@gmail.com> wrote:
And if you need a correctly functioning ring modulo 2**n? If you can
assume 2s complement then you've *got one*. Otherwise, you get to
construct one somehow (not sure how hard this is, I have never ever
been exposed to a system that didn't *ONLY* support 2s complement).

Caution: on most 2s complement machines, the *signed* integers do
not form a ring. In cases where INT_MIN is (-INT_MAX - 1)
(e.g., INT_MIN is -32768 for an INT_MAX of 32767) then there
is no "additive inverse" for INT_MIN -- no element in the set
such that INT_MIN plus the element is 0.
What do you mean? The additive inverse of INT_MIN is INT_MIN.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Aug 5 '06 #80
