Strange behaviour of simple code

I'm going crazy. Look at this code:

#include <string.h>
#include <stdio.h>
#include <iostream.h>

using namespace std ;

char ini_code[2] = {0xFF, 0xFE} ;
char line_sep[2] = {0x20, 0x28} ;
char para_sep[2] = {0x20, 0x29} ;
char end_code[2] = {0xFF, 0xFF} ;
char tab_code[2] = {0x00, 0x09} ;
char alf_code[2] = {0x00, 0x0A} ;
char acr_code[2] = {0x00, 0x0D} ;

int main ()
{
char code[2] ;
bool gotCR = false ;

cin.read(&code[0], 2) ;

code[0] = ini_code[1] ;
code[1] = ini_code[0] ;
printf("0x%02X%02X\n",code[0],code[1]);
while (cin.read(&code[0], 2))
{
if (code[0] == tab_code[1] && code[1] == tab_code[0])
{
code[0] = line_sep[1] ;
code[1] = line_sep[0] ;
printf("0x%02X%02X\n",code[0],code[1]);
}
else if (code[0] == acr_code[1] && code[1] == acr_code[0])
{
gotCR = true ;
}
else if (code[0] == alf_code[1] && code[1] == alf_code[0])
{
if (gotCR)
{
code[0] = para_sep[1] ;
code[1] = para_sep[0] ;
printf("0x%02X%02X\n",code[0],code[1]);
}
else
{
gotCR = false ;
}
}
else
{
printf("0x%02X%02X\n",code[0],code[1]);
}
}

code[0] = end_code[1] ;
code[1] = end_code[0] ;
printf("0x%02X%02X\n",code[0],code[1]);

return 0 ;
}

I expect a list of

0xNNNN
0xNNNN
....
0xNNNN

Instead I obtain stuff like:

0x004B
0x006F
0x0072
0x0065
0x0061
0x006E
0x0009
0x00FFFFFFC6
0xFFFFFFC5FFFFFFC8
0xFFFFFFC5FFFFFFB5
0xFFFFFFC2FFFFFFC8
0xFFFFFFB2FFFFFFE4
0xFFFFFFB209
0x0009
0x0009
0x0000
0x4800
0x6500
0x6200
0x7200
0x6500

Why??????

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr. Dario de Judicibus
http://www.dejudicibus.it/
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jul 22 '05 #1
Dario de Judicibus wrote:
char ini_code[2] = {0xFF, 0xFE} ;
char line_sep[2] = {0x20, 0x28} ;
char para_sep[2] = {0x20, 0x29} ;
char end_code[2] = {0xFF, 0xFF} ;
char tab_code[2] = {0x00, 0x09} ;
char alf_code[2] = {0x00, 0x0A} ;
char acr_code[2] = {0x00, 0x0D} ;


Use unsigned char.

Regards.
Jul 22 '05 #2
char ini_code[2] = {0xFF, 0xFE} ;
char line_sep[2] = {0x20, 0x28} ;
char para_sep[2] = {0x20, 0x29} ;
char end_code[2] = {0xFF, 0xFF} ; <= what should happen when you detect this?
char tab_code[2] = {0x00, 0x09} ;
char alf_code[2] = {0x00, 0x0A} ;
char acr_code[2] = {0x00, 0x0D} ;

int main ()
{
char code[2] ;
bool gotCR = false ;

cin.read(&code[0], 2) ;

code[0] = ini_code[1] ;
code[1] = ini_code[0] ;
printf("0x%02X%02X\n",code[0],code[1]);
while (cin.read(&code[0], 2))
{
if (code[0] == tab_code[1] && code[1] == tab_code[0])
{
code[0] = line_sep[1] ;
code[1] = line_sep[0] ;
printf("0x%02X%02X\n",code[0],code[1]);
}
else if (code[0] == acr_code[1] && code[1] == acr_code[0])
{
gotCR = true ;
}
else if (code[0] == alf_code[1] && code[1] == alf_code[0])
{
if (gotCR)
{
code[0] = para_sep[1] ;
code[1] = para_sep[0] ;
printf("0x%02X%02X\n",code[0],code[1]);
}
else
{
gotCR = false ; <= this is ALREADY false here!
}
}
else
{
printf("0x%02X%02X\n",code[0],code[1]);
}
}

code[0] = end_code[1] ;
code[1] = end_code[0] ;
printf("0x%02X%02X\n",code[0],code[1]);

return 0 ;
}

I expect a list of

0xNNNN
0xNNNN
...
0xNNNN

Instead I obtain stuff like:

0x004B
0x006F
0x0072
0x0065
0x0061
0x006E
0x0009
0x00FFFFFFC6
0xFFFFFFC5FFFFFFC8
0xFFFFFFC5FFFFFFB5
0xFFFFFFC2FFFFFFC8
0xFFFFFFB2FFFFFFE4
0xFFFFFFB209
0x0009
0x0009
0x0000
0x4800
0x6500
0x6200
0x7200
0x6500

Why??????

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr. Dario de Judicibus
http://www.dejudicibus.it/
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


Without seeing your input, it's hard to tell. I have a question: why do
you define the codes in the reverse order that you expect to see them? Is
it intentional (for some reason I can't imagine), or is your code doing the
checks wrong?

I do see at least one clear problem: The handling of gotCR is not
correct. You never set it to false after the first time it gets set to
true. Your code to set it to false is in the else of an "if (gotCR)", which
means it only gets set to false when it is ALREADY false!

It also looks like you're getting those "end" codes, which probably
means you have to handle them differently, but there's no code to detect and
handle them. Same with the other codes, like the tab, etc.

But again, with no input to go by, we can't tell how the output gets
generated for sure. Try walking through your app in the debugger and see
what the variable values are at each step. You might also try doing it on
paper to check your design.

-Howard

Jul 22 '05 #3

char ini_code[2] = {0xFF, 0xFE} ;
char line_sep[2] = {0x20, 0x28} ;
char para_sep[2] = {0x20, 0x29} ;
char end_code[2] = {0xFF, 0xFF} ;
char tab_code[2] = {0x00, 0x09} ;
char alf_code[2] = {0x00, 0x0A} ;
char acr_code[2] = {0x00, 0x0D} ;
Use unsigned char.

Why? The char type is neither unsigned nor signed unless explicitly
stated, and he's not doing any math or '>' or '<' comparisons where signed
vs. unsigned might make a difference.

The problems, I think, are that his logic is incorrect and incomplete.
(He's not handling all cases, and he's handling the CR incorrectly.)

-Howard

Jul 22 '05 #4
Howard wrote:
char ini_code[2] = {0xFF, 0xFE} ;
char line_sep[2] = {0x20, 0x28} ;
char para_sep[2] = {0x20, 0x29} ;
char end_code[2] = {0xFF, 0xFF} ;
char tab_code[2] = {0x00, 0x09} ;
char alf_code[2] = {0x00, 0x0A} ;
char acr_code[2] = {0x00, 0x0D} ;

Use unsigned char.

Why? The char type is neither unsigned nor signed unless explicitly
stated, and he's not doing any math or '>' or '<' comparisons where signed
vs. unsigned might make a difference.


The char type is a separate type for many purposes, but it is either signed
or it is not. If it is signed, 0xFF, when converted to int and output in
hex, gives many more Fs, as the OP's output shows. So I suppose that
is the case here.

Regards.
Jul 22 '05 #5

"Howard" <al*****@hotmail.com> wrote in message news:bp********@dispatch.concentric.net...

char ini_code[2] = {0xFF, 0xFE} ;
Use unsigned char.

Why? The char type is neither unsigned nor signed unless explicitly
stated, and he's not doing any math or '>' or '<' comparisons where signed
vs. unsigned might make a difference.


If char is 8 bits and signed, 0xFF isn't a defined initializer.
Jul 22 '05 #6

"Dario de Judicibus" <no****@nowhere.com> wrote in message news:bp***********@newsreader2.mclink.it...
0x00FFFFFFC6
0xFFFFFFC5FFFFFFC8
0xFFFFFFC5FFFFFFB5
0xFFFFFFC2FFFFFFC8
0xFFFFFFB2FFFFFFE4
0xFFFFFFB209


Classic sign extension bug. Your signed char gets promoted to int (standard procedure for
the arguments of a varargs function like printf). For example, 0xFF most likely initialized
the char value as -1, and printf("%X", -1) prints FFFFFFFF (with a 32-bit int).

You should either use unsigned char or mask off the sign extension.
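
For anyone landing here later, the two fixes mentioned above can be sketched roughly as follows. The helper names hex_pair_cast and hex_pair_mask are made up for illustration (they are not from the original program), and the sketch assumes 8-bit chars.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format two bytes as "0xHHLL". Each char is converted
// to unsigned char first, so its value is in 0..255 before it is widened
// for the variadic call and no sign extension can happen.
std::string hex_pair_cast(char hi, char lo)
{
    char buf[16];
    const int n = std::snprintf(buf, sizeof buf, "0x%02X%02X",
                                static_cast<unsigned>(static_cast<unsigned char>(hi)),
                                static_cast<unsigned>(static_cast<unsigned char>(lo)));
    assert(n == 6);  // "0x" plus exactly four hex digits (assumes 8-bit char)
    return std::string(buf, static_cast<std::size_t>(n));
}

// Hypothetical helper: same output by masking. The value is widened first
// (possibly sign-extended) and then only the low 8 bits are kept.
std::string hex_pair_mask(char hi, char lo)
{
    char buf[16];
    const int n = std::snprintf(buf, sizeof buf, "0x%02X%02X",
                                static_cast<unsigned>(hi) & 0xFFu,
                                static_cast<unsigned>(lo) & 0xFFu);
    assert(n == 6);
    return std::string(buf, static_cast<std::size_t>(n));
}
```

Either helper should turn the bytes 0xC5 and 0xC8 into "0xC5C8" whether plain char is signed or unsigned on the platform. Declaring the buffers as unsigned char in the first place, as suggested earlier in the thread, avoids the casts entirely.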
Jul 22 '05 #7

"Ron Natalie" <ro*@sensor.com> wrote in message
news:3f*********************@news.newshosting.com. ..

"Howard" <al*****@hotmail.com> wrote in message

news:bp********@dispatch.concentric.net...

char ini_code[2] = {0xFF, 0xFE} ;
Use unsigned char.

Why? The char type is neither unsigned nor signed unless explicitly
stated, and he's not doing any math or '>' or '<' comparisons where signed vs. unsigned might make a difference.


If char is 8 bits and signed, 0xFF isn't a defined initializer.


???

But I thought char was *neither* signed nor unsigned, unlike int, which
is signed by default. Are there some implementations that treat assigning
255 to a char as undefined behavior? (That would kind of screw up a lot of
code that uses "extended" ASCII characters, wouldn't it?)

-Howard

Jul 22 '05 #8

"Howard" <al*****@hotmail.com> wrote in message news:bp********@dispatch.concentric.net...
But I thought char was *neither* signed nor unsigned,
It is a distinct type from signed char or unsigned char, but it will have
the representation of one of those two (it's clearly signed in the original
poster's case).
Are there some implementations that treat assigning
255 to a char as undefined behavior?
Implementation-defined. Attempting to convert numbers that
are larger than can be represented into signed values is implementation-defined.
Unsigned types, on the other hand, are required to wrap modulo 2**(number of bits).

(That would kind of screw up a lot of
code that uses "extended" ASCII characters, wouldn't it?)


The problem is not the char representation of "FF" but the fact
that using an integer 0xFF to initialize a signed char may not yield
the right value.
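
A small sketch of the difference described above, assuming 8-bit chars. The function names are invented for this example. The conversion to unsigned char is fully specified; the conversion of 0xFF to a signed plain char is the implementation-defined part (at the time of this thread), and the assert documents the assumption that the platform yields either -1 or 255.

```cpp
#include <cassert>
#include <climits>
#include <limits>

static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit chars");

// Out-of-range values converted to unsigned char wrap modulo 2^CHAR_BIT.
unsigned char wrap_to_uchar(int value)
{
    return static_cast<unsigned char>(value);
}

// Reports whether plain char behaves as a signed type on this implementation.
bool plain_char_is_signed()
{
    return std::numeric_limits<char>::is_signed;
}

// What a char initialized from 0xFF looks like once widened to int.
int widened_0xFF()
{
    const char c = static_cast<char>(0xFF);  // implementation-defined if char is signed
    const int widened = c;
    // Assumption: two's-complement platform, so -1 (signed char) or 255 (unsigned char).
    assert(widened == -1 || widened == 255);
    return widened;
}
```

On a platform where plain_char_is_signed() returns true, widened_0xFF() typically returns -1, which is exactly the value that shows up as a run of Fs when printed with %X.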
Jul 22 '05 #9

"Ron Natalie" <ro*@sensor.com> wrote in message
news:3f*********************@news.newshosting.com. ..

"Dario de Judicibus" <no****@nowhere.com> wrote in message news:bp***********@newsreader2.mclink.it...
0x00FFFFFFC6
0xFFFFFFC5FFFFFFC8
0xFFFFFFC5FFFFFFB5
0xFFFFFFC2FFFFFFC8
0xFFFFFFB2FFFFFFE4
0xFFFFFFB209
Classic sign extension bug. Your signed char gets promoted to int (standard procedure for
the arguments of a varargs function like printf). For example, 0xFF most likely initialized
the char value as -1, and printf("%X", -1) prints FFFFFFFF (with a 32-bit int).

You should either use unsigned char or mask off the sign extension.


Oh, I see, said the blind man! :-) That's pretty poor behavior, in my
opinion. I don't recall ever using unsigned char to store C-style arrays of
characters. I've always used just char. Of course, I don't think I've ever
used printf on such an array either, so I guess I wouldn't have noticed this
strange effect.

-Howard

Jul 22 '05 #10
On 19 Nov 2003 16:58:14 GMT, "Howard" <al*****@hotmail.com> wrote:
But I thought char was *neither* signed nor unsigned, unlike int, which
is signed by default. Are there some implementations that treat assigning
255 to a char as undefined behavior? (That would kind of screw up a lot of
code that uses "extended" ASCII characters, wouldn't it?)


It isn't undefined behaviour, only implementation defined (as is any
conversion of an out-of-range value to a signed integer type). On ASCII
platforms, it generally "does the right thing".

There are 3 distinct types - char, unsigned char and signed char.
Although char is a separate type, it can take on the same values as
either unsigned char or signed char. Some compilers offer a switch to
choose which you prefer.
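
To make the "three distinct types" point concrete, here is a rough sketch. The overload set kind() and the other function names are invented for the example. It shows that overload resolution treats the three types separately, while plain char still shares its value range with exactly one of the other two.

```cpp
#include <cassert>
#include <limits>
#include <string>
#include <type_traits>

// char is never the same type as signed char or unsigned char.
static_assert(!std::is_same<char, signed char>::value, "char is a distinct type");
static_assert(!std::is_same<char, unsigned char>::value, "char is a distinct type");

// Three separate overloads are legal precisely because the types differ.
std::string kind(char)          { return "char"; }
std::string kind(signed char)   { return "signed char"; }
std::string kind(unsigned char) { return "unsigned char"; }

// Plain char has the same range as whichever sibling matches its signedness.
bool char_range_matches_sibling()
{
    typedef std::numeric_limits<char> C;
    typedef std::numeric_limits<signed char> S;
    typedef std::numeric_limits<unsigned char> U;
    if (C::is_signed)
        return C::min() == S::min() && C::max() == S::max();
    return C::min() == U::min() && C::max() == U::max();
}

// Usage example.
void demo()
{
    assert(kind('x') == "char");
    assert(kind(static_cast<signed char>(1)) == "signed char");
    assert(kind(static_cast<unsigned char>(1)) == "unsigned char");
    assert(char_range_matches_sibling());
}
```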

Tom
Jul 22 '05 #11

"Julián Albo" <JU********@terra.es> wrote in message
news:3F***************@terra.es...
Dario de Judicibus wrote:
Use unsigned char.


Gosh.... It's THAT! Thank you. As usual, I knew but I did not *see* the
error. Grrrrr...

DdJ
Jul 22 '05 #12
"Howard" <al*****@hotmail.com> wrote in message
news:bp********@dispatch.concentric.net...

FIRST OF ALL, the problem was UNSIGNED char. Solved.
Without seeing your input, it's hard to tell. I have a question: why do you define the codes in the reverse order that you expect to see them? Is
it intentional (for some reason I can't imagine), or is your code doing the checks wrong?
I'm reading a little-endian file. The code is missing some encoding that I'll do
between reading and writing the files, of course.
I do see at least one clear problem: The handling of gotCR is not
correct. You never set it to false after the first time it gets set to
true. Your code to set it to false is in the else of an "if (gotCR)", which
means it only gets set to false when it is ALREADY false!
First time only. That's just a reset.
It also looks like you're getting those "end" codes, which probably
means you have to handle them differently, but there's no code to detect and handle them. Same with the other codes, like the tab, etc..


That's just the foundation of the code. I have to add other code, but first I
had to ensure that the basic code works. I'm writing a pipe. The first step is
to ensure that what enters the pipe comes out correctly. Then I'll add some
other logic in the middle.
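
A minimal sketch of that first step, assuming the input is a stream of 16-bit code units stored low byte first. The name read_utf16le_units is hypothetical, and the function only reads the units, it does no translation. Going through unsigned char when assembling each unit keeps bytes of 0x80 and above from being sign-extended.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read 16-bit code units stored low byte first.
// A trailing odd byte, if any, is ignored because the final read fails.
std::vector<std::uint16_t> read_utf16le_units(std::istream& in)
{
    std::vector<std::uint16_t> units;
    char raw[2];
    while (in.read(raw, 2)) {
        // Convert via unsigned char so each byte is in 0..255 before shifting.
        const unsigned lo = static_cast<unsigned char>(raw[0]);
        const unsigned hi = static_cast<unsigned char>(raw[1]);
        units.push_back(static_cast<std::uint16_t>((hi << 8) | lo));
    }
    return units;
}

// Usage example: bytes 4B 00 C8 C5 decode to the units 0x004B and 0xC5C8.
void demo()
{
    const std::string bytes("\x4B\x00\xC8\xC5", 4);
    std::istringstream in(bytes);
    const std::vector<std::uint16_t> units = read_utf16le_units(in);
    assert(units.size() == 2);
    assert(units[0] == 0x004B);
    assert(units[1] == 0xC5C8);
}
```

Working with whole 16-bit units also means the comparisons no longer need the swapped-index byte pairs used in the original program.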

DdJ
Jul 22 '05 #13

FIRST OF ALL, the problem was UNSIGNED char. Solved.


Cool. I wouldn't have seen that sign problem, either. :-)

I do see at least one clear problem: The handling of gotCR is not
correct. You never set it to false after the first time it gets set to
true. Your code to set it to false is in the else of an "if (gotCR)", which
means it only gets set to false when it is ALREADY false!


First time only. That's just a reset.


I don't understand what the first time has to do with it. My point was that
this else block of code accomplishes nothing (ever):

if (gotCR)
{...}
else
{
gotCR = false ;
}

The else block *only* gets executed if gotCR is false, and all it does is set
gotCR to false, but it already *is* false. Therefore, either you don't need
the else statement at all, or else your design is incorrect and you meant to
be doing something different.

Glad you got the real problem solved.

-Howard

Jul 22 '05 #14

"Howard" <al*****@hotmail.com> wrote in message
news:bp********@dispatch.concentric.net...
I don't understand what the first time has to do with it. My point was that this else block of code accomplishes nothing (ever):

if (gotCR)
{...}
else
{
gotCR = false ;
}


There is another point previously in code where gotCR is set:

else if (code[0] == acr_code[1] && code[1] == acr_code[0])
{
gotCR = true ;
}

You probably missed it.

DdJ
Jul 22 '05 #15
I don't understand what the first time has to do with it. My point was that
this else block of code accomplishes nothing (ever):

if (gotCR)
{...}
else
{
gotCR = false ;
}


There is another point previously in code where gotCR is set:

else if (code[0] == acr_code[1] && code[1] == acr_code[0])
{
gotCR = true ;
}

You probably missed it.


That's not relevant to my point at all. The else portion of the code I'm
talking about does nothing, ever! That else clause is *only* executed when
gotCR is false already. And all it does is set gotCR to false. What is it
that you think it does, and when? Walk through it. Suppose that gotCR is
false. In that case, the "if (gotCR)" condition will fail, and the "else"
clause will execute. But the else clause only sets gotCR to false. But we
just said that gotCR is false, so what does it mean to set gotCR to false if
it is already false? On the other hand, if gotCR is true, then the else
block will *not* execute. Therefore, the else block of code will either do
nothing (since setting a variable to false when it is already false is the
same as doing nothing), or else it will not execute at all. There's no
other condition here, regardless of any code anywhere else in the program.
In ALL cases, the following two pieces of code do exactly the same thing:

if (boolVariable)
{ doSomething(); }
else
{ boolVariable = false; }

- and -

if (boolVariable)
{ doSomething(); }
(Try it out if you don't believe me. Put a breakpoint in your else clause
and you'll see that the line "gotCR = false;" is only executed when gotCR is
already false.)
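
If a debugger isn't handy, the same claim can be checked with a short sketch like the one below. The names with_else, without_else and forms_agree are invented for the example, and a counter stands in for doSomething. Since a bool has only two values, trying both covers every case.

```cpp
#include <cassert>

// Final state of the flag plus how often the "doSomething" branch ran.
struct Result {
    bool flag;
    int calls;
};

// Form 1: the else branch assigns false to a flag that is already false.
Result with_else(bool flag)
{
    int calls = 0;
    if (flag) { ++calls; }        // stands in for doSomething()
    else      { flag = false; }   // only reached when flag is already false
    return Result{flag, calls};
}

// Form 2: no else branch at all.
Result without_else(bool flag)
{
    int calls = 0;
    if (flag) { ++calls; }
    return Result{flag, calls};
}

// Compares both forms for every possible input.
bool forms_agree()
{
    const bool inputs[] = {false, true};
    for (bool b : inputs) {
        const Result a = with_else(b);
        const Result c = without_else(b);
        if (a.flag != c.flag || a.calls != c.calls)
            return false;
    }
    return true;
}

// Usage example.
void demo()
{
    assert(forms_agree());
}
```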

-Howard


Jul 22 '05 #16
You are right.

My

if () {
} else if() {
if () {
} else {
}
}

must become

if () {
} else if() {
} else if () {
} else {
}

(some conditions must be modified too).
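
One way the flattened chain might look, as a sketch only. It works on 16-bit code units rather than byte pairs, keeps the original program's mapping (tab becomes U+2028, CR followed by LF becomes U+2029), and resets gotCR in every branch except the CR one. How a lone LF or a lone CR should be treated is an assumption here: a lone LF is passed through and a CR is always swallowed, as in the original. The constant and function names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

const std::uint16_t TAB = 0x0009, LF = 0x000A, CR = 0x000D;
const std::uint16_t LINE_SEP = 0x2028, PARA_SEP = 0x2029;

// Hypothetical translation step using the flat if / else if chain.
std::vector<std::uint16_t> translate(const std::vector<std::uint16_t>& in)
{
    std::vector<std::uint16_t> out;
    bool gotCR = false;
    for (std::uint16_t unit : in) {
        if (unit == TAB) {
            out.push_back(LINE_SEP);
            gotCR = false;
        } else if (unit == CR) {
            gotCR = true;               // emit nothing yet, remember the CR
        } else if (unit == LF && gotCR) {
            out.push_back(PARA_SEP);    // CR LF pair
            gotCR = false;
        } else {
            out.push_back(unit);        // everything else, including a lone LF
            gotCR = false;
        }
    }
    return out;
}

// Usage example.
void demo()
{
    const std::vector<std::uint16_t> in = {0x0048, TAB, 0x0069, CR, LF, LF};
    const std::vector<std::uint16_t> expected = {0x0048, LINE_SEP, 0x0069, PARA_SEP, LF};
    assert(translate(in) == expected);
}
```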

Thank you

DdJ
Jul 22 '05 #17
