Strange behaviour of simple code

I'm going crazy. Look at this code:

#include <string.h>
#include <stdio.h>
#include <iostream.h>

using namespace std ;

char ini_code[2] = {0xFF, 0xFE} ;
char line_sep[2] = {0x20, 0x28} ;
char para_sep[2] = {0x20, 0x29} ;
char end_code[2] = {0xFF, 0xFF} ;
char tab_code[2] = {0x00, 0x09} ;
char alf_code[2] = {0x00, 0x0A} ;
char acr_code[2] = {0x00, 0x0D} ;

int main ()
{
char code[2] ;
bool gotCR = false ;

cin.read(&code[0], 2) ;

code[0] = ini_code[1] ;
code[1] = ini_code[0] ;
printf("0x%02X%02X\n",code[0],code[1]);
while (cin.read(&code[0], 2))
{
if (code[0] == tab_code[1] && code[1] == tab_code[0])
{
code[0] = line_sep[1] ;
code[1] = line_sep[0] ;
printf("0x%02X%02X\n",code[0],code[1]);
}
else if (code[0] == acr_code[1] && code[1] == acr_code[0])
{
gotCR = true ;
}
else if (code[0] == alf_code[1] && code[1] == alf_code[0])
{
if (gotCR)
{
code[0] = para_sep[1] ;
code[1] = para_sep[0] ;
printf("0x%02X%02X\n",code[0],code[1]);
}
else
{
gotCR = false ;
}
}
else
{
printf("0x%02X%02X\n",code[0],code[1]);
}
}

code[0] = end_code[1] ;
code[1] = end_code[0] ;
printf("0x%02X%02X\n",code[0],code[1]);

return 0 ;
}

I expect a list of

0xNNNN
0xNNNN
....
0xNNNN

Instead I obtain stuff like:

0x004B
0x006F
0x0072
0x0065
0x0061
0x006E
0x0009
0x00FFFFFFC6
0xFFFFFFC5FFFFFFC8
0xFFFFFFC5FFFFFFB5
0xFFFFFFC2FFFFFFC8
0xFFFFFFB2FFFFFFE4
0xFFFFFFB209
0x0009
0x0009
0x0000
0x4800
0x6500
0x6200
0x7200
0x6500

Why??????

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr. Dario de Judicibus
http://www.dejudicibus.it/
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jul 22 '05 #1
16 Replies


Dario de Judicibus wrote:
char ini_code[2] = {0xFF, 0xFE} ;
char line_sep[2] = {0x20, 0x28} ;
char para_sep[2] = {0x20, 0x29} ;
char end_code[2] = {0xFF, 0xFF} ;
char tab_code[2] = {0x00, 0x09} ;
char alf_code[2] = {0x00, 0x0A} ;
char acr_code[2] = {0x00, 0x0D} ;


Use unsigned char.

Regards.
Jul 22 '05 #2

char ini_code[2] = {0xFF, 0xFE} ;
char line_sep[2] = {0x20, 0x28} ;
char para_sep[2] = {0x20, 0x29} ;
char end_code[2] = {0xFF, 0xFF} ; <= what should happen when you detect this?
char tab_code[2] = {0x00, 0x09} ;
char alf_code[2] = {0x00, 0x0A} ;
char acr_code[2] = {0x00, 0x0D} ;

int main ()
{
char code[2] ;
bool gotCR = false ;

cin.read(&code[0], 2) ;

code[0] = ini_code[1] ;
code[1] = ini_code[0] ;
printf("0x%02X%02X\n",code[0],code[1]);
while (cin.read(&code[0], 2))
{
if (code[0] == tab_code[1] && code[1] == tab_code[0])
{
code[0] = line_sep[1] ;
code[1] = line_sep[0] ;
printf("0x%02X%02X\n",code[0],code[1]);
}
else if (code[0] == acr_code[1] && code[1] == acr_code[0])
{
gotCR = true ;
}
else if (code[0] == alf_code[1] && code[1] == alf_code[0])
{
if (gotCR)
{
code[0] = para_sep[1] ;
code[1] = para_sep[0] ;
printf("0x%02X%02X\n",code[0],code[1]);
}
else
{
gotCR = false ; <= this is ALREADY false here!
}
}
else
{
printf("0x%02X%02X\n",code[0],code[1]);
}
}

code[0] = end_code[1] ;
code[1] = end_code[0] ;
printf("0x%02X%02X\n",code[0],code[1]);

return 0 ;
}


Without seeing your input, it's hard to tell. I have a question: why do
you define the codes in the reverse order that you expect to see them? Is
it intentional (for some reason I can't imagine), or is your code doing the
checks wrong?

I do see at least one clear problem: The handling of GotCR is not
correct. You never set it to false after the first time it gets set to
true. Your code to set it to false is in the else of an "if (GotCR)", which
means it only gets set to false when it is ALREADY false!

It also looks like you're getting those "end" codes, which probably
means you have to handle them differently, but there's no code to detect and
handle them. Same with the other codes, like the tab, etc..

But again, with no input to go by, we can't tell how the output gets
generated for sure. Try walking through your app in the debugger and see
what the variable values are at each step. You might also try doing it on
paper to check your design.

-Howard

Jul 22 '05 #3


char ini_code[2] = {0xFF, 0xFE} ;
char line_sep[2] = {0x20, 0x28} ;
char para_sep[2] = {0x20, 0x29} ;
char end_code[2] = {0xFF, 0xFF} ;
char tab_code[2] = {0x00, 0x09} ;
char alf_code[2] = {0x00, 0x0A} ;
char acr_code[2] = {0x00, 0x0D} ;
Use unsigned char.

Why? The char type is neither unsigned nor signed unless explicitly
stated, and he's not doing any math or '>' or '<' comparisons where signed
vs. unsigned might make a difference.

The problems, I think, are that his logic is incorrect and incomplete.
(He's not handling all cases, and he's handling the CR incorrectly.)

-Howard

Jul 22 '05 #4

Howard wrote:
char ini_code[2] = {0xFF, 0xFE} ;
char line_sep[2] = {0x20, 0x28} ;
char para_sep[2] = {0x20, 0x29} ;
char end_code[2] = {0xFF, 0xFF} ;
char tab_code[2] = {0x00, 0x09} ;
char alf_code[2] = {0x00, 0x0A} ;
char acr_code[2] = {0x00, 0x0D} ;

Use unsigned char.

Why? The char type is neither unsigned nor signed unless explicitly
stated, and he's not doing any math or '>' or '<' comparisons where signed
vs. unsigned might make a difference.


The char type is a separate type for many purposes, but it is either
signed or it is not. If it is signed, 0xFF converted to int and printed in
hex gives many more Fs, as the OP's output shows. So I suppose that is the
case here.

Regards.
Jul 22 '05 #5


"Howard" <al*****@hotmail.com> wrote in message news:bp********@dispatch.concentric.net...

char ini_code[2] = {0xFF, 0xFE} ;
Use unsigned char.

Why? The char type is neither unsigned nor signed unless explicitly
stated, and he's not doing any math or '>' or '<' comparisons where signed
vs. unsigned might make a difference.


If char is 8 bits and signed, 0xFF isn't a defined initializer.
Jul 22 '05 #6


"Dario de Judicibus" <no****@nowhere.com> wrote in message news:bp***********@newsreader2.mclink.it...
0x00FFFFFFC6
0xFFFFFFC5FFFFFFC8
0xFFFFFFC5FFFFFFB5
0xFFFFFFC2FFFFFFC8
0xFFFFFFB2FFFFFFE4
0xFFFFFFB209


Classic sign extension bug. Your signed char gets promoted to int (standard procedure for
a varargs function like printf). For example, 0xFF most likely initialized the char value as -1,
and printing -1 with "%X" gives FFFFFFFF (with a 32-bit int).

You should either use unsigned char or mask off the sign extension.
Jul 22 '05 #7


"Ron Natalie" <ro*@sensor.com> wrote in message
news:3f*********************@news.newshosting.com. ..

"Howard" <al*****@hotmail.com> wrote in message

news:bp********@dispatch.concentric.net...

char ini_code[2] = {0xFF, 0xFE} ;
Use unsigned char.

Why? The char type is neither unsigned nor signed unless explicitly
stated, and he's not doing any math or '>' or '<' comparisons where signed
vs. unsigned might make a difference.


If char is 8 bits and signed, 0xFF isn't a defined initializer.


???

But I thought char was *neither* signed nor unsigned, unlike int, which
is signed by default. Are there some implementations that treat assigning
255 to a char as undefined behavior? (That would kind of screw up a lot of
code that uses "extended" ASCII characters, wouldn't it?)

-Howard

Jul 22 '05 #8


"Howard" <al*****@hotmail.com> wrote in message news:bp********@dispatch.concentric.net...
But I thought char was *neither* signed nor unsigned,
It is a distinct type from signed char or unsigned char, but it will have
the representation of one of those two (it's clearly signed in the original
poster's case).
Are there some implementations that treat assigning
255 to a char as undefined behavior?
Implementation-defined. Converting a number that is too large to be
represented in a signed type gives an implementation-defined result.
Unsigned types, on the other hand, are required to wrap modulo 2**(number of bits).

(That would kind of screw up a lot of
code that uses "extended" ASCII characters, wouldn't it?)


The problem is not the char representation of "FF" but the fact
that using an integer 0xFF to initialize a signed char may not yield
the right value.
Jul 22 '05 #9


"Ron Natalie" <ro*@sensor.com> wrote in message
news:3f*********************@news.newshosting.com. ..

"Dario de Judicibus" <no****@nowhere.com> wrote in message news:bp***********@newsreader2.mclink.it...
0x00FFFFFFC6
0xFFFFFFC5FFFFFFC8
0xFFFFFFC5FFFFFFB5
0xFFFFFFC2FFFFFFC8
0xFFFFFFB2FFFFFFE4
0xFFFFFFB209
Classic sign extension bug. Your signed char gets promoted to int (standard procedure for
a varargs function like printf). For example, 0xFF most likely initialized the char value as -1,
and printing -1 with "%X" gives FFFFFFFF (with a 32-bit int).

You should either use unsigned char or mask off the sign extension.


Oh, I see, said the blind man! :-) That's pretty poor behavior, in my
opinion. I don't recall ever using unsigned char to store C-style arrays of
characters. I've always used just char. Of course, I don't think I've ever
used printf on such an array either, so I guess I wouldn't have noticed this
strange effect.

-Howard

Jul 22 '05 #10

On 19 Nov 2003 16:58:14 GMT, "Howard" <al*****@hotmail.com> wrote:
But I thought char was *neither* signed nor unsigned, unlike int, which
is signed by default. Are there some implementations that treat assigning
255 to a char as undefined behavior? (That would kind of screw up a lot of
code that uses "extended" ASCII characters, wouldn't it?)


It isn't undefined behaviour, only implementation-defined (as out-of-range
conversions to signed integer types generally are). On ASCII platforms, it
generally "does the right thing".

There are 3 distinct types - char, unsigned char and signed char.
Although char is a separate type, it can take on the same values as
either unsigned char or signed char. Some compilers offer a switch to
choose which you prefer.

Tom
Jul 22 '05 #11


"Julián Albo" <JU********@terra.es> wrote in message
news:3F***************@terra.es...
Dario de Judicibus wrote:
Use unsigned char.


Gosh.... It's THAT! Thank you. As usual, I knew but I did not *see* the
error. Grrrrr...

DdJ
Jul 22 '05 #12

"Howard" <al*****@hotmail.com> wrote in message
news:bp********@dispatch.concentric.net...

FIRST OF ALL, the problem was UNSIGNED char. Solved.
Without seeing your input, it's hard to tell. I have a question: why do you define the codes in the reverse order that you expect to see them? Is
it intentional (for some reason I can't imagine), or is your code doing the checks wrong?
I'm reading a little-endian file. The code is still missing some encoding
I'll do between reading and writing, of course.
I do see at least one clear problem: The handling of GotCR is not
correct. You never set it to false after the first time it gets set to
true. Your code to set it to false is in the else of an "if (GotCR)", which means it only gets set to false when it is ALREADY false!
First time only. That's just a reset.
It also looks like you're getting those "end" codes, which probably
means you have to handle them differently, but there's no code to detect and handle them. Same with the other codes, like the tab, etc..


That's just the foundation of the code. I have to add other code, but first I
had to ensure that the basic code works. I'm writing a pipe. The first step is
to ensure that what enters the pipe comes out correctly. Then I'll add some
other logic in the middle.

DdJ
Jul 22 '05 #13


FIRST OF ALL, the problem was UNSIGNED char. Solved.


Cool. I wouldn't have seen that sign problem, either. :-)

I do see at least one clear problem: The handling of GotCR is not
correct. You never set it to false after the first time it gets set to
true. Your code to set it to false is in the else of an "if (GotCR)", which
means it only gets set to false when it is ALREADY false!


First time only. That's just a reset.


I don't understand what the first time has to do with it. My point was that
this else block of code accomplishes nothing (ever):

if (gotCR)
{...}
else
{
gotCR = false ;
}

The else block *only* gets executed if gotCR is false, and all it does is set
gotCR to false, but it already *is* false. Therefore, either you don't need
the else statement at all, or else your design is incorrect and you meant to
be doing something different.

Glad you got the real problem solved.

-Howard

Jul 22 '05 #14


"Howard" <al*****@hotmail.com> wrote in message
news:bp********@dispatch.concentric.net...
I don't understand what the first time has to do with it. My point was that this else block of code accomplishes nothing (ever):

if (gotCR)
{...}
else
{
gotCR = false ;
}


There is another point previously in code where gotCR is set:

else if (code[0] == acr_code[1] && code[1] == acr_code[0])
{
gotCR = true ;
}

You probably missed it.

DdJ
Jul 22 '05 #15

I don't understand what the first time has to do with it. My point was that
this else block of code accomplishes nothing (ever):

if (gotCR)
{...}
else
{
gotCR = false ;
}


There is another point previously in code where gotCR is set:

else if (code[0] == acr_code[1] && code[1] == acr_code[0])
{
gotCR = true ;
}

You probably missed it.


That's not relevant to my point at all. The else portion of the code I'm
talking about does nothing, ever! That else clause is *only* executed when
gotCR is false already. And all it does is set gotCR to false. What is it
that you think it does, and when? Walk through it. Suppose that gotCR is
false. In that case, the "if (gotCR)" condition will fail, and the "else"
clause will execute. But the else clause only sets gotCR to false. But we
just said that gotCR is false, so what does it mean to set gotCR to false if
it is already false? On the other hand, if gotCR is true, then the else
block will *not* execute. Therefore, the else block of code will either do
nothing (since setting a variable to false when it is already false is the
same as doing nothing), or else it will not execute at all. There's no
other condition here, regardless of any code anywhere else in the program.
In ALL cases, the following two pieces of code do exactly the same thing:

if (boolVariable)
{ doSomething; }
else
{ boolVariable = false; }

- and -

if (boolVariable)
{ doSomething; }

(Try it out if you don't believe me. Put a breakpoint in your else clause
and you'll see that the line "gotCR = false;" is only executed when gotCR is
already false.)

-Howard


Jul 22 '05 #16

You are right.

My

if () {
} else if() {
if () {
} else {
}
}

must become

if () {
} else if() {
} else if () {
} else {
}

(some conditions must be modified too).

Thank you

DdJ
Jul 22 '05 #17
