
character byte str[i] treated as signed, I need unsigned

P: n/a
I'm comparing characters via

return(str1[i] - str2[i]);

and I'm having problems with 8-bit characters being treated as signed
instead of unsigned integers. The disassembly is using

movsx eax,byte ptr[edx]

to load my character in to EAX register. I need it to use movzx.
How can I recode this to treat my characters as unsigned instead of signed?
Nov 2 '06 #1
18 Replies


P: n/a
Susan Rice wrote:
I'm comparing characters via

return(str1[i] - str2[i]);

and I'm having problems with 8-bit characters being treated as signed
instead of unsigned integers. The disassembly is using

movsx eax,byte ptr[edx]

to load my character in to EAX register. I need it to use movzx.
How can I recode this to treat my characters as unsigned instead of signed?
By defining them as unsigned or, if all else fails, casting them to
unsigned. You haven't shown the definition of str1 or str2.

--
Ian Collins.
Nov 2 '06 #2

P: n/a
In article <FZ*****************@newsfe10.phx>,
Susan Rice <sr****@cox.net> wrote:
>I'm comparing characters via
>return(str1[i] - str2[i]);
>and I'm having problems with 8-bit characters being treated as signed
>instead of unsigned integers. The disassembly is using
>movsx eax,byte ptr[edx]
>to load my character in to EAX register. I need it to use movzx.
>How can I recode this to treat my characters as unsigned instead of signed?
Anything down at the assembly level is out of scope for this newsgroup,
which does not deal with implementation specifics.

Fortunately, you do not need to go down to that level. Try just

return( (unsigned char)str1[i] - (unsigned char)str2[i] );
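
For readers following along, here is a minimal sketch of that cast wrapped in a function. The name compare_at and its parameters are hypothetical (the original function was never posted), and the sketch assumes that int can represent every unsigned char value, as it can on mainstream platforms.

```c
#include <stddef.h>

/* Hypothetical helper: compares the bytes at index i as unsigned values.
   Returns a negative, zero, or positive int, much as strcmp would for
   a single position. */
int compare_at(const char *str1, const char *str2, size_t i)
{
    /* The casts map each byte into the range 0..UCHAR_MAX. Both operands
       are then promoted to int before the subtraction, so the result can
       still be negative when str1[i] is the smaller byte. */
    return (unsigned char)str1[i] - (unsigned char)str2[i];
}
```

With this version a byte such as 0xE9 compares greater than 'a' regardless of whether plain char is signed on the implementation.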

--
"No one has the right to destroy another person's belief by
demanding empirical evidence." -- Ann Landers
Nov 2 '06 #3

P: n/a
Susan Rice wrote:
>I'm comparing characters via
>
>return(str1[i] - str2[i]);
How are str1, str2 and i declared? What's the rest of the function?
How is the result meant to be used?
>and I'm having problems with 8-bit characters
Why do you care how many bits in a character there are?
>being treated as signed instead of unsigned integers.
You're telling us what you _think_ the problem is, rather than
explaining the problem itself, e.g. "i inputted this, the output
I got was this, the output I wanted was this, here is my code
and what it is meant to do."

[In other words, don't tell us the sign difference is your problem,
tell us _why_ it's a problem.]

You should know that knee-jerk "this'll fix it" responses may not be
addressing other important issues of your code. For instance, your
methodology is not guaranteed to yield alphabetical ordering.

[In other words, your minimalist presentation may mean you only get
a superficial (and possibly broken) solution to your problem, whilst
deeper issues with your code are left uncorrected.]
>The disassembly is using
>
>movsx eax,byte ptr[edx]
Learning C by examining the disassembly is the WORST thing you can
do. When you change architectures you may find that there are an awful
lot of assumptions on your part that you'll have to unlearn.

--
Peter

Nov 3 '06 #4

P: n/a
Walter Roberson wrote:
>Susan Rice <sr****@cox.net> wrote:
>>I'm comparing characters via
>>return(str1[i] - str2[i]);
>>and I'm having problems with 8-bit characters being treated as signed
>>instead of unsigned integers.
>
>return( (unsigned char)str1[i] - (unsigned char)str2[i] );
A note to the OP: this will still return a negative value for
cases such as '1' - '2'. If the intent is to return a positive
value mod 256 in all cases then write:

return (unsigned int)(str1[i] - str2[i]) % 256;

(note that the return statement does not need brackets around
its expression).
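
The difference between the two return expressions can be seen in a small sketch. The function names below are hypothetical and only serve to put the two variants side by side.

```c
/* Hypothetical helper: the cast version. The operands are promoted to
   int, so the difference may be negative. */
int diff_signed(char a, char b)
{
    return (unsigned char)a - (unsigned char)b;
}

/* Hypothetical helper: the mod-256 version. The int difference is
   converted to unsigned int and then reduced modulo 256, so the
   result always lies in 0..255. */
unsigned int diff_mod256(char a, char b)
{
    return (unsigned int)(a - b) % 256;
}
```

For '1' and '2' the first helper gives -1 and the second gives 255.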

Nov 3 '06 #5

P: n/a
Here's the real problem I was unaware of, as explained by
Kernighan & Ritchie (whom you probably know as K&R):

"There is one subtle point about the conversion of characters to
integers. The language does not specify whether variables of type
char are signed or unsigned quantities. When a char is converted
to an int, can it ever produce a negative result? The answer varies
from machine to machine, reflecting differences in architecture.
On some machines a char whose leftmost bit is 1 will be converted
to a negative integer ("sign extension"). On others, a char is
promoted to an int by adding zeros at the left end, and thus is
always positive."
--Kernighan & Ritchie: "The C Programming Language"
(K&R, the inventors of the language.)
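
One way to find out which behaviour a given implementation has is to look at CHAR_MIN from <limits.h>. The sketch below is only an illustration; the function names are made up for this example.

```c
#include <limits.h>

/* Returns 1 if plain char is a signed type on this implementation,
   0 if it is unsigned. CHAR_MIN is 0 exactly when char is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Converts a char to int using the implementation's default rules,
   that is, sign extension if char is signed and zero extension if not. */
int char_to_int(char c)
{
    return c;
}
```

On an implementation where plain char is signed, char_to_int yields a negative value for a byte whose top bit is set, which matches the movsx seen in the disassembly.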

Susan Rice wrote:
I'm comparing characters via

return(str1[i] - str2[i]);

and I'm having problems with 8-bit characters being treated as signed
instead of unsigned integers. The disassembly is using

movsx eax,byte ptr[edx]

to load my character in to EAX register. I need it to use movzx.
How can I recode this to treat my characters as unsigned instead of signed?
Nov 3 '06 #6

P: n/a
Susan Rice wrote:
I'm comparing characters via

return(str1[i] - str2[i]);

and I'm having problems with 8-bit characters being treated as signed
instead of unsigned integers. The disassembly is using

movsx eax,byte ptr[edx]

to load my character in to EAX register. I need it to use movzx.
How can I recode this to treat my characters as unsigned instead of signed?
By declaring your array to hold unsigned chars.
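
As a rough sketch of that suggestion, assuming the data can be held in unsigned char buffers from the start (the function name and the length parameter are hypothetical, since the original declarations were not posted):

```c
#include <stddef.h>

/* Hypothetical comparison routine over unsigned char buffers.
   Returns the difference of the first pair of bytes that differ,
   or 0 if the first n bytes are equal. */
int compare_bytes(const unsigned char *s1, const unsigned char *s2, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (s1[i] != s2[i]) {
            /* No cast needed: the elements are already unsigned. */
            return s1[i] - s2[i];
        }
    }
    return 0;
}
```

Existing char data can be passed through a pointer cast such as (const unsigned char *)str, which is the approach the standard library's own comparison functions are specified to follow.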
Nov 3 '06 #7

P: n/a
Susan Rice:
I'm comparing characters via

return(str1[i] - str2[i]);

and I'm having problems with 8-bit characters being treated as signed
instead of unsigned integers. The disassembly is using

movsx eax,byte ptr[edx]

to load my character in to EAX register. I need it to use movzx.
How can I recode this to treat my characters as unsigned instead of
signed?
I can't say for sure without knowing exactly what you're trying to do (e.g.
do you want roll-around, etc.), but here's something simple:

return (char unsigned)( (unsigned)str1[i] - str2[i] );

Somebody else offered something akin to the following:

return (char unsigned)str1[i] - (char unsigned)str2[i];

, but the casts are redundant, as both operands will be promoted to either
"signed int" or "unsigned int" before the subtraction takes place.

Of course, I don't know what you're trying to do, but at first glance, it
looks like you're going the wrong way about it (e.g. why are you using
plain char in the first place?)
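
Whether the casts change the result can be checked directly. The sketch below uses hypothetical helper names and assumes 8-bit chars; on an implementation where plain char is signed, the two subtractions disagree for a byte with its top bit set, because the cast happens before the promotion to int.

```c
#include <limits.h>

/* Hypothetical helper: subtracts plain chars with no cast. Each char
   is promoted to int, keeping its sign if char is signed. */
int diff_plain(char a, char b)
{
    return a - b;
}

/* Hypothetical helper: converts to unsigned char before the promotion. */
int diff_cast(char a, char b)
{
    return (unsigned char)a - (unsigned char)b;
}

/* Returns 1 if both helpers agree for a byte with its top bit set. */
int casts_are_redundant_here(void)
{
    char hi = (char)0xE9;   /* implementation-defined value if char is signed */
    return diff_plain(hi, 'a') == diff_cast(hi, 'a');
}
```

For ordinary 7-bit characters both helpers return the same value, so the difference only shows up with bytes above 127.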

--

Frederick Gotham
Nov 3 '06 #8

P: n/a
In article <2G*******************@news.indigo.ie>,
Frederick Gotham <fg*******@SPAM.com> wrote:
>I can't say for sure without knowing exactly what you're trying to do (e.g.
>do you want roll-around, etc.), but here's something simple:
>return (char unsigned)( (unsigned)str1[i] - str2[i] );
>Somebody else offered something akin to the following:
>return (char unsigned)str1[i] - (char unsigned)str2[i];
>, but the casts are redundant, as both operands will be promoted to either
>"signed int" or "unsigned int" before the subtraction takes place.
No, I used (unsigned char) not (char unsigned).

You are being inconsistent in your reasoning for using (char unsigned).
Your stated reasons have to do with your usage of Irish, which
(you have said) puts the most important information first. In this
case, the part that is most important is not the size of the item
but rather the unsigned-ness, so unsigned would go first in your
reasoning.

(You might, I suppose, argue that it is quite important in the cast
operation to know that you are casting to an integral type rather than
a floating type, and that on that basis that the char should go first.
However, there are no unsigned floating types, so the appearance
of unsigned already tells you that you cannot be working
with a floating type, so using unsigned first already provides
the "This will be an integral type" hint.)
--
There are some ideas so wrong that only a very intelligent person
could believe in them. -- George Orwell
Nov 3 '06 #9

P: n/a
Walter Roberson:
No, I used (unsigned char) not (char unsigned).

ARE YOU BRAIN DEAD ?

If I misquote you as using "int const" rather than "const int", will you
roar from a mountain top that I got it wrong?

You are being inconsistent in your reasoning for using (char unsigned).
Your stated reasons have to do with your usage of Irish, which
(you have said) puts the most important information first. In this
case, the part that is most important is not the size of the item
but rather the unsigned-ness, so unsigned would go first in your
reasoning.

Have you drilled a hole into my skull and had a look at my brain?

Don't pretend to know how I think.

(You might, I suppose, argue that it is quite important in the cast
operation to know that you are casting to an integral type rather than
a floating type, and that on that basis that the char should go first.
However, there are no unsigned floating types, so the appearance
of unsigned already tells you that you cannot be working
with a floating type, so using unsigned first already provides
the "This will be an integral type" hint.)

How about you spend more time focusing on the functionality of the code
rather than whether the pretty ribbons are green or yellow, and whether
they curl clockwise or anticlockwise.

--

Frederick Gotham
Nov 3 '06 #10

P: n/a
In article <5g*******************@news.indigo.ie>,
Frederick Gotham <fg*******@SPAM.com> wrote:
>How about you spend more time focusing on the functionality of the code
>rather than whether the pretty ribbons are green or yellow, and whether
>they curl clockwise or anticlockwise.
I would point out that your offering was functionally equivalent to
mine (the one that used explicit casts in both locations), so -you-
were the one worrying about prettiness, not functionality.

You were commenting on elements of my code that did not affect
the functionality but did affect the readability, so it was completely
fair for me to comment on the elements of your code that did not
affect the functionality but did affect the readability.
>>You are being inconsistent in your reasoning for using (char unsigned).
>>Your stated reasons have to do with your usage of Irish, which
>>(you have said) puts the most important information first. In this
>>case, the part that is most important is not the size of the item
>>but rather the unsigned-ness, so unsigned would go first in your
>>reasoning.
>Have you drilled a hole into my skull and had a look at my brain?
Do I need to locate and cite your previous articles in which
you explain your choice of syntactical order? You *did* make such
an explanation, and your most recent usage was contrary to that
explanation. You did not apply the reasoning that you had earlier
stated. We must therefore conclude that you apply your
previously-stated reasons inconsistently; or that your previously
stated reasons were not your real reasons; or that your previously
stated reasons were not your -complete- reasons.
>Don't pretend to know how I think.
You are correct that I made a misstatement. I should not have
said that,
"You are being inconsistent in your reasoning for using (char unsigned)",
I should have said,
"You are being inconsistent with your stated reasoning for using
(char unsigned)".

This allows for a possibility that I did not allow for earlier,
namely that your actual reasoning might be quite consistent but that
your actual reasoning does not match your statements about your
reasoning.

--
"It is important to remember that when it comes to law, computers
never make copies, only human beings make copies. Computers are given
commands, not permission. Only people can be given permission."
-- Brad Templeton
Nov 3 '06 #11

P: n/a
Susan Rice wrote:
Susan Rice wrote:
>I'm comparing characters via

return(str1[i] - str2[i]);

and I'm having problems with 8-bit characters being treated as
signed instead of unsigned integers. The disassembly is using

movsx eax,byte ptr[edx]

to load my character in to EAX register. I need it to use movzx.
How can I recode this to treat my characters as unsigned instead
of signed?

Here's the real problem I was unaware of, as explained by
Kernighan & Ritchie (whom you probably know as K&R):

"There is one subtle point about the conversion of characters to
integers. The language does not specify whether variables of type
char are signed or unsigned quantities. When a char is converted
to an int, can it ever produce a negative result? The answer varies
from machine to machine, reflecting differences in architecture.
On some machines a char whose leftmost bit is 1 will be converted
to a negative integer ("sign extension"). On others, a char is
promoted to an int by adding zeros at the left end, and thus is
always positive."
--Kernighan & Ritchie: "The C Programming Language"
(K&R, the inventors of the language.)
Please don't top-post. Your answer belongs after (or intermixed
with) the material you quote, after snipping portions irrelevant to
your reply. I fixed this one.

As others have said, simply use unsigned chars.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>
Nov 3 '06 #12

P: n/a
Walter Roberson:
I would point out that your offering was functionally equivalent to
mine (the one that used explicit casts in both locations), so -you-
were the one worrying about prettiness, not functionality.

Actually, my intent was to point out a flaw. Let's start off with two
chars:

char a,b;

Let's say we want to add the two of these together, and for the result to
be unsigned. All we need do is:

(unsigned)a + b;

However, what _you_ proposed was:

(char unsigned)a + (char unsigned)b;

Which might be equivalent to:

(int)(char unsigned)a + (int)(char unsigned)b;

, depending on whether "char unsigned" promotes to "int" or "unsigned". On
the majority of implementations, it promotes to "int". On such systems, the
result will therefore be a signed int.
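
The claim about result types can be checked with C11's _Generic, as in the sketch below. The macro and function names are invented for this illustration, and the first result assumes an implementation where int can represent every unsigned char value.

```c
/* Evaluates to 1 if expr has type int, 0 if it has type unsigned int,
   and -1 for any other type. _Generic does not evaluate expr. */
#define RESULT_KIND(expr) _Generic((expr), int: 1, unsigned int: 0, default: -1)

/* Both operands are cast to unsigned char, then promoted. */
int kind_with_char_casts(char a, char b)
{
    return RESULT_KIND((unsigned char)a + (unsigned char)b);
}

/* One operand is cast to unsigned; the other is converted to match. */
int kind_with_unsigned_cast(char a, char b)
{
    return RESULT_KIND((unsigned)a + b);
}
```

On such an implementation the first function reports int and the second reports unsigned int. Note that the type of the result is a separate question from its value: for subtraction, the signed int result is what lets the caller tell "less than" from "greater than".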

You were commenting on elements of my code that did not affect
the functionality but did affect the readability, so it was completely
fair for me to comment on the elements of your code that did not
affect the functionality but did affect the readability.

I pointed out the flaw. At times though, I also point out redundancies. If
I see:

double a;
long b,c;

a = (double)b / (double)c;

, then I'd point out that only one cast is required:

a = (double)b/c;

However I tend not to comment on things like:

int const Vs const int
i++ Vs ++i

>>Have you drilled a hole into my skull and had a look at my brain?

Do I need to locate and cite your previous articles in which
you explain your choice of syntactical order?

You suggested that my word order would change because of the context.

You *did* make such an explanation, and your most recent usage was
contrary to that explanation.

_You_ think so, because of the context. Perhaps my reasoning doesn't go so
far as to take the context into account, but rather picks one syntax that
should be used throughout. Who knows?! I stopped thinking about it a long
time ago and I just go with the flow now.

You did not apply the reasoning that you had earlier
stated. We must therefore conclude that you apply your
previously-stated reasons inconsistantly; or that your previously
stated reasons were not your real reasons; or that your previously
stated reasons were not your -complete- reasons.

Or you could conclude that you do not understand my thinking, or that my
thinking takes into account the probability that Alaska will suffer flash-
floods on account of Global Warming.

This allows for a possibility that I did not allow for earlier,
namely that your actual reasoning might be quite consistent but that
your actual reasoning does not match your statements about your
reasoning.

I am done explaining why I like red ribbons that turn clockwise on my
bicycle. Please see past the ribbons and look at the actual bicycle, as
I've had my fill of explaining my preference.

--

Frederick Gotham
Nov 3 '06 #13

P: n/a
Frederick Gotham said:
If I see:

double a;
long b,c;

a = (double)b / (double)c;

, then I'd point out that only one cast is required:

a = (double)b/c;
...and then I'd point out that *no* cast is required:

a = b;
a /= c;
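
The variants discussed so far can be put side by side in a small sketch. The function names are hypothetical; the last one is included only to show what happens when the division is done in long before any conversion.

```c
/* Two casts: both operands converted explicitly. */
double div_two_casts(long b, long c)
{
    return (double)b / (double)c;
}

/* One cast: c is converted to double by the usual arithmetic conversions. */
double div_one_cast(long b, long c)
{
    return (double)b / c;
}

/* No cast: the conversions happen implicitly on assignment and on /=. */
double div_no_cast(long b, long c)
{
    double a = b;   /* long converted to double */
    a /= c;         /* c converted to double for the division */
    return a;
}

/* For contrast: integer division happens first, then the conversion. */
double div_integer(long b, long c)
{
    return b / c;
}
```

For 7 and 2 the first three return 3.5, while the last returns 3.0.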

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Nov 3 '06 #14

P: n/a
Richard Heathfield:
...and then I'd point out that *no* cast is required:

a = b;
a /= c;
In performing an assignment, you give the idea that you need to store a
value. For instance, consider:

a = (Type)b+c;

in place of:

a = b;
a += c;

The latter version may result in less efficient code than the former
version, because when a compiler sees an assignment statement, its first
thought will be "hmm, I have to store a value".

The former version explicitly demonstrates that both the value of b and c
can be discarded, leaving the door wide open for the compiler to do
whatever it likes (e.g. make use of CPU registers).

Of course, I'm sure you can find an optimiser which will make the same
machine code for both of them.

Both of our methods work. Perhaps you prefer _your_ method. Perhaps _I_
prefer _my_ method. Let's not argue over whether pretty green anticlockwise
ribbons are better than pretty red clockwise ribbons.

--

Frederick Gotham
Nov 3 '06 #15

P: n/a
Frederick Gotham said:

<snip>
Both of our methods work. Perhaps you prefer _your_ method. Perhaps _I_
prefer _my_ method. Let's not argue over whether pretty green
anticlockwise ribbons are better than pretty red clockwise ribbons.
This isn't a matter of preference, but of fact. You claimed that one cast is
*required*. I merely demonstrated that your claim is false.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)
Nov 3 '06 #16

P: n/a
Frederick Gotham wrote:
Richard Heathfield:

>>...and then I'd point out that *no* cast is required:

a = b;
a /= c;


In performing an assignment, you give the idea that you need to store a
value. For instance, consider:

a = (Type)b+c;

in place of:

a = b;
a += c;

The latter version may result in less efficient code than the former
version, because when a compiler sees an assignment statement, its first
thought will be "hmm, I have to store a value".
Or it may not, and probably won't; it could even impede the optimiser,
making it less efficient. Don't get so hung up on speculative
micro-optimisations; let the compiler do its job.

--
Ian Collins.
Nov 4 '06 #17

P: n/a
Frederick Gotham wrote:
>Have you drilled a hole into my skull and had a look at my brain?
>Don't pretend to know how I think.
Do you know something about neuroscience that the rest
of us don't?

Nov 5 '06 #18

P: n/a
Old Wolf:
>>Have you drilled a hole into my skull and had a look at my brain?
>>Don't pretend to know how I think.
>
>Do you know something about neuroscience that the rest
>of us don't?

No, but Wikipedia is your friend:

http://en.wikipedia.org/wiki/Neuroscience

--

Frederick Gotham
Nov 5 '06 #19
