Storage of char in 64 bit machine

Hi all,

I have a simple definition in a C file, something like this:

int main(void)
{
    char a;
    /* ....... */
    int k;
    return 0;
}

Since a char is 8 bits, how is it stored on a 64-bit
machine? If it is word aligned, what happens to the rest of the bytes? What
about retrieval of the char: will it be expensive? Is it
expensive w.r.t. read or write?

Thanx and Regards,
Aruna
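
To make the question concrete, here is a minimal sketch (not from the thread) that wraps the two declarations in a hypothetical struct so the layout can be inspected. The names struct sample, padding_after_a and sample_size are invented for illustration. A char always has size 1; any gap before k is padding that the compiler inserts to satisfy the alignment of int. Locals in main need not be laid out like struct members and may live entirely in registers, so this only illustrates the alignment rule.

```c
#include <stddef.h>  /* offsetof, size_t */

/* Hypothetical struct mirroring the two locals in the question. */
struct sample {
    char a;  /* sizeof(char) is 1 by definition */
    int  k;  /* placed at an offset suitable for int */
};

/* Number of padding bytes the compiler inserted between a and k. */
size_t padding_after_a(void)
{
    return offsetof(struct sample, k) - sizeof(char);
}

/* Total size of the struct, including any padding. */
size_t sample_size(void)
{
    return sizeof(struct sample);
}
```

Both values are implementation-defined; printing them with printf("%zu") on the target machine shows what the compiler actually chose. On x86-64, for example, byte loads and stores are ordinary instructions, so reading or writing the char is not normally more expensive than accessing the int.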

Aug 14 '06
Flash Gordon wrote:
we******@gmail.com wrote:
Flash Gordon wrote:
we******@gmail.com wrote:
Mikhail Teterin wrote:
Stephen Sprunk wrote:
The implementation is likely to have a very, very clever strcmp() that
will perform at least as well as your code (possibly doing the same thing
internally, if it's known to be safe) and likely even better if the
compiler is reasonably modern due to special knowledge and treatment of
common functions/idioms.
Well, here are some benchmarks comparing the use of strcmp() to compare
short character strings (4 characters).
You shouldn't fall for the "strawman" argument, especially when it's
clearly wrong. Obviously int32_t comparison will annihilate strcmp()
in real world code regardless of how good the compiler is.
Wrong. I've changed the benchmark slightly and can show strcmp being no
slower.
I tried your code, and was not able to reproduce your results on any of
the following compilers: gcc 3.4.4, WATCOM C/C++ 11.0c, MSVC 7.x,
Borland C++, Intel C 4.0. This is all on an Athlon CPU. Your silly
theory that inlining would matter is irrelevant -- some of those
compilers successfully inlined strcmp, but it made no difference, the
integer equality was always faster.

It was not a silly theory; it was the result of profiling. It is not my
fault if none of the compilers you use are as good as the one I use.
I'm not sure you understand how unlikely that is.
As I said elsewhere, you assume that your experience applies to
everything else. It does not.
I'm waiting for a counter example -- this doesn't count as one, BTW.
The only interesting thing to note is that Microsoft's compiler was
able to hoist the integer equality test right out of the inner loop,
and thus compute it as a constant. So it was always basically
infinitely faster. The difference was closest with Borland C/C++ which
did a very bad integer test and was only 62% faster. Watcom, which did
not inline strcmp was 10 times faster with the integer comparison.
Intel, the fairest, inlined but didn't hoist the integer comparison and
yielded a 7x performance improvement.

So? I have a compiler which produces different results thus countering
your claim that the integer comparison is *always* faster. One only
needs one counter example to disprove a claim of always, and no number
of examples where integer comparison is faster will change that.
I only mean to say that "extraordinary claims demand extraordinary
evidence". I can only ask this if I establish that your claim is
extraordinary, which I think I did.
If you are seeing something different, you need to explain more about
your platform and environment, and show the results of the objdump -d
on the bench.o file. There does not appear to be any normal way that
strcmp can be anywhere close to the integer comparison.

I'm not claiming it is using a "normal" way, only that it finds *a* way
however devious. Quite possibly it is making use of the fact that it
knows the strings are exactly 4 bytes.
That itself would still be considered "normal". In my thinking on
this, I have incorporated a lot of consideration for what the compiler
can and cannot do. By not "normal" I am talking about tricks like some
compilers used to "detect SpecCPU" (a famous benchmark) and replace
code with hand tuned transformations that are not ordinarily within the
capabilities of any compiler.
It is a bog-standard Ubuntu install on a Fujitsu Siemens Amilo Pro 2040V
notebook.

markg@brenda:~/tmp$ gcc --version
gcc (GCC) 4.0.3 (Ubuntu 4.0.3-1ubuntu5)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I'm not prepared to post long assembler dumps to this group since this
is a C group. However I do have a web site so I've shoved the dump of
the executable (I did not bother producing a .o file) up at
http://home.flash-gordon.me.uk/asm.txt for now.
You don't get it, do you? From my previous post, I was just shy of
calling you a liar. With this post, I now fully confirm it.

That assembly code has no significant loops in it of any kind (it does
call gets for some reason). But you aren't just missing a .o file,
that main doesn't call out anywhere. So this *is* all the code, but it
does *NOT* run the benchmark in question.
[...] You are just lucky I did
it on an x86 based machine rather than a PPC based one!
It would slow me down a bit, but it would serve the exact same purpose
-- to prove that you are lying.
Please note that I don't want to get into discussing the intricacies of
x86 assembler or what gcc has done.
Deep analysis is not necessary. Look for the fragment of code under
<main>, look for any j?? instructions and notice that there aren't any
(so there are no loops in main). Then look at all the lines with call
hex-number <symboliclabel>. Each symboliclabel points into standard
library functions. (A quick check of these labels earlier in the code
shows they are all direct jumps into the standard library, so no tricky
shenanigans are happening here.)
[...] I'm providing this for your interest
and as a demonstration of how a compiler/optimiser can be devious
potentially meaning that you don't get the performance boost you expect
on all systems under all conditions.
This is "deviousness" of the original source construction and claims
being made, and nothing more. I.e., the assembly does not correspond
to the source you claim it does, Mr. Fraud Gordon.
[...] The fact that you don't see the
same effect just shows how unpredictable these things are.
You just don't have any idea. If forced to reveal the assembly, you
cannot trick me. You simply don't possess the skill. You *believed* I
would not check this assembly, or perhaps thought I was all talk and no
show. You don't even realize that I actually compiled your source, ran
all the benches *AND* disassembled each one to confirm my understanding
of what was going on on *5* compilers. If you understood this, perhaps
you would have not tried to perpetrate such a blatant attempt at a con.

Of course, nobody else has chimed in (all you have to do is compile and
run his source to see he's obviously wrong/lying). Perhaps this
explains your level of expectation. I don't know why you thought you
would be able to get away with this, but I'm the wrong person to try
to pull a trick like this on.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Aug 21 '06 #71
we******@gmail.com wrote:
Flash Gordon wrote:
>we******@gmail.com wrote:
>>Flash Gordon wrote:
we******@gmail.com wrote:
Mikhail Teterin wrote:
>Stephen Sprunk wrote:
>>The implementation is likely to have a very, very clever strcmp() that
>>will perform at least as well as your code (possibly doing the same thing
>>internally, if it's known to be safe) and likely even better if the
>>compiler is reasonably modern due to special knowledge and treatment of
>>common functions/idioms.
>Well, here are some benchmarks comparing the use of strcmp() to compare
>short character strings (4 characters).
You shouldn't fall for the "strawman" argument, especially when it's
clearly wrong. Obviously int32_t comparison will annihilate strcmp()
in real world code regardless of how good the compiler is.
Wrong. I've changed the benchmark slightly and can show strcmp being no
slower.
I tried your code, and was not able to reproduce your results on any of
the following compilers: gcc 3.4.4, WATCOM C/C++ 11.0c, MSVC 7.x,
Borland C++, Intel C 4.0. This is all on an Athlon CPU. Your silly
theory that inlining would matter is irrelevant -- some of those
compilers successfully inlined strcmp, but it made no difference, the
integer equality was always faster.
It was not a silly theory; it was the result of profiling. It is not my
fault if none of the compilers you use are as good as the one I use.

I'm not sure you understand how unlikely that is.
>As I said elsewhere, you assume that your experience applies to
everything else. It does not.

I'm waiting for a counter example -- this doesn't count as one, BTW.
I made a simple, honest mistake. I forgot that I had built another
sample of code from here in the same directory. Since I normally don't
bother specifying an output name it writes it as a.out, thus I put up
the wrong sample.
>>The only interesting thing to note is that Microsoft's compiler was
able to hoist the integer equality test right out of the inner loop,
and thus compute it as a constant. So it was always basically
infinitely faster. The difference was closest with Borland C/C++ which
did a very bad integer test and was only 62% faster. Watcom, which did
not inline strcmp was 10 times faster with the integer comparison.
Intel, the fairest, inlined but didn't hoist the integer comparison and
yielded a 7x performance improvement.
So? I have a compiler which produces different results thus countering
your claim that the integer comparison is *always* faster. One only
needs one counter example to disprove a claim of always, and no number
of examples where integer comparison is faster will change that.

I only mean to say that "extraordinary claims demand extraordinary
evidence". I can only ask this if I establish that your claim is
extraordinary, which I think I did.
I put up the wrong dump. That is now rectified.
>>If you are seeing something different, you need to explain more about
your platform and environment, and show the results of the objdump -d
on the bench.o file. There does not appear to be any normal way that
strcmp can be anywhere close to the integer comparison.
I'm not claiming it is using a "normal" way, only that it finds *a* way
however devious. Quite possibly it is making use of the fact that it
knows the strings are exactly 4 bytes.

That itself would still be considered "normal". In my thinking on
this, I have incorporated a lot of consideration for what the compiler
can and cannot do. By not "normal" I am talking about tricks like some
compilers used to "detect SpecCPU" (a famous benchmark) and replace
code with hand tuned transformations that are not ordinarily within the
capabilities of any compiler.
Well, whatever gcc is doing on my system it is within the bounds of an
"ordinary" compiler since I do not have a custom copy of gcc.
>It is a bog-standard Ubuntu install on a Fujitsu Siemens Amilo Pro 2040V
notebook.

markg@brenda:~/tmp$ gcc --version
gcc (GCC) 4.0.3 (Ubuntu 4.0.3-1ubuntu5)
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I'm not prepared to post long assembler dumps to this group since this
is a C group. However I do have a web site so I've shoved the dump of
the executable (I did not bother producing a .o file) up at
http://home.flash-gordon.me.uk/asm.txt for now.

You don't get it, do you? From my previous post, I was just shy of
calling you a liar. With this post, I now fully confirm it.
No, it shows that like anyone else I can make an honest mistake. Do you
never make mistakes then?
That assembly code has no significant loops in it of any kind (it does
call gets for some reason). But you aren't just missing a .o file,
that main doesn't call out anywhere. So this *is* all the code, but it
does *NOT* run the benchmark in question.
Agreed. I did the wrong code by mistake.
>[...] You are just lucky I did
it on an x86 based machine rather than a PPC based one!

It would slow me down a bit, but it would serve the exact same purpose
-- to prove that you are lying.
No. It was a simple mistake.

<snip>
>[...] The fact that you don't see the
same effect just shows how unpredictable these things are.

You just don't have any idea. If forced to reveal the assembly, you
cannot trick me. You simply don't possess the skill. You *believed* I
would not check this assembly, or perhaps thought I was all talk and no
show. You don't even realize that I actually compiled your source, ran
all the benches *AND* disassembled each one to confirm my understanding
of what was going on on *5* compilers. If you understood this, perhaps
you would have not tried to perpetrate such a blatant attempt at a con.
I have no intention of trying to trick you. I already know that you are
far better at x86 assembler than I am ever likely to be since I stopped
writing assembler a few years back and even then it was not x86.
Of course, nobody else has chimed in (all you have to do is compile and
run his source to see he's obviously wrong/lying). Perhaps this
explains your level of expectation. I don't know why you thought you
would be able to get away with this, but I'm the wrong person to try
to pull a trick like this on.
See above. I made a mistake and posted the wrong executable.

This time I've put the correct executable up as
http://home.flash-gordon.me.uk/a.out and the objdump output at
http://home.flash-gordon.me.uk/asm.txt

If you want you can look at the PPC equivalent generated on my Gentoo
box. On that I am certain it is cheating heavily since I get a reported
execution time of 0. Those are at
http://home.flash-gordon.me.uk/ppcasm.txt and
http://home.flash-gordon.me.uk/p.out

That machine (which is the web server) I could even give you a shell
account on and you could build the code yourself. You would just have to
give me a public key and I would create an account for you if you don't
believe what I'm telling you now that I've posted links to the correct code.

Sorry for the mistake, but it really was an honest one. Whatever I may
think of some of the things you post I do respect your assembler
knowledge and know that it is greater than mine.
--
Flash Gordon
Still sigless on this computer.
Aug 21 '06 #72
Flash Gordon wrote:
we******@gmail. com wrote:
Flash Gordon wrote:
As I said elsewhere, you assume that your experience applies to
everything else. It does not.
I'm waiting for a counter example -- this doesn't count as one, BTW.

I made a simple, honest mistake. I forgot that I had built another
sample of code from here in the same directory. Since I normally don't
bother specifying an output name it writes it as a.out, thus I put up
the wrong sample.
Ok fine. But this only shows the problem with your methodology.
Well, whatever gcc is doing on my system it is within the bounds of an
"ordinary" compiler since I do not have a custom copy of gcc.
Indeed, it's exploiting your bad benchmarking methodology.
That assembly code has no significant loops in it of any kind (it does
call gets for some reason). But you aren't just missing a .o file,
that main doesn't call out anywhere. So this *is* all the code, but it
does *NOT* run the benchmark in question.

Agreed. I did the wrong code by mistake.
You didn't check it either. There was a call to gets in there, and
that didn't seem to ring an alarm bell of any kind for you.
[...] If you want you can look at the PPC equivalent generated on my Gentoo
box. On that I am certain it is cheating heavily since I get a reported
execution time of 0.
And this doesn't make you stop and think even for one second?

Here are the relevant loops in the x86 build:

mov 0xffffffec(%ebp),%eax
cmp %eax,0xfffffff0(%ebp)
sete %al
movzbl %al,%eax
xor %edx,%edx

80485cf: add %eax,%ebx
add $0x1,%edx
cmp %edx,%esi
jne 80485cf

lea 0xffffffec(%ebp),%eax
lea 0xfffffff0(%ebp),%edx
mov %eax,0x4(%esp)
mov %edx,(%esp)
call 80483ac <strcmp@plt>
test %eax,%eax
sete %al
movzbl %al,%eax
xor %edx,%edx

8048694: add %eax,%edi
add $0x1,%edx
cmp %edx,%esi
jne 8048694

These are equivalent to:

int tmp = cur1.iCUR == cur2.iCUR;
for (i = 0; i < iterations; i++)
t2 += (long) tmp;

and

int tmp = strcmp(cur1.acCUR, cur2.acCUR) == 0;
for (i = 0; i < iterations; i++)
t1 += (long) tmp;

respectively. Now let's cogitate on these code fragments for a second.
Ok, so gcc 4.x has decided that higher level hoisting is a feature
worth implementing. No doubt this is why Mikhail Teterin made the
function calls external in the first place (so that gcc couldn't pull
that kind of trick.)

In any event, we can see clearly that this benchmark doesn't actually
compare the int== speed versus a call to strcmp(). (But you can
eyeball a 5-instructions versus 9-instructions plus a call to an
external function in the assembly above. If you managed to put those
into an inner loop you would get results more in line with what everyone
else has been getting.) So all you've proven is that if you change
what you are measuring, you can make the measurements whatever you
like.
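
The hoisting described above can be avoided without moving the comparisons into another translation unit. What follows is a minimal sketch, not the benchmark from the thread: the function names are invented, and the volatile qualifiers are an assumption about how to keep each comparison inside the loop. Reading the operands through volatile objects forces the integer test to be repeated, and calling strcmp through a volatile function pointer forces a real call per iteration (so it measures an out-of-line call, not an inlined strcmp). The 4-byte code is copied into an int32_t with memcpy to sidestep the type-punning question.

```c
#include <stdint.h>
#include <string.h>

/* Copy a 4-byte code (3 letters plus '\0') into an int32_t. */
int32_t pack_code(const char code[4])
{
    int32_t v;
    memcpy(&v, code, sizeof v);
    return v;
}

/* Integer comparison repeated `iterations` times; the volatile
   operands must be re-read on every pass, so the test stays in the loop. */
long count_int_equal(int32_t a, int32_t b, long iterations)
{
    volatile int32_t va = a, vb = b;
    long hits = 0;
    for (long i = 0; i < iterations; i++)
        hits += (va == vb);
    return hits;
}

/* strcmp repeated `iterations` times; the volatile function pointer
   must be loaded and called on every pass, so the call cannot be hoisted. */
long count_str_equal(const char *a, const char *b, long iterations)
{
    int (*volatile cmp)(const char *, const char *) = strcmp;
    long hits = 0;
    for (long i = 0; i < iterations; i++)
        hits += (cmp(a, b) == 0);
    return hits;
}
```

Timing each counting function with clock() from <time.h> over the same iteration count should then measure the comparisons themselves, although the numbers will still vary by compiler and CPU.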

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Aug 23 '06 #73

Stephen Sprunk wrote:
"Chris Torek" <no****@torek.net> wrote in message
news:ec*********@news4.newsguy.com...
>CBFalconer wrote:
Let's cut that down to the essentials, and make it compilable:

int ceq(const char *c1, const char *c2)
{
return (*c1 == *c2) & (c1[1] == c2[1]) & (c1[2] == c2[2]);
}

with the same comments about & as yours. This must perform at
least 4 indexing operations, 6 dereferences, 3 comparisons, and 2
bit ands.
In article <11**********************@i3g2000cwc.googlegroups.com>
<we******@gmail.com> wrote:
>A modern CPU will perform these operations substantially in
*PARALLEL*.
Indeed. However, I suspect most programmers would have written it
as:

return c1[0] == c2[0] && c1[1] == c2[1] && c1[2] == c2[2];

and the "&&"s would prohibit parallel evaluation (since
any of c1+1, c1+2, c2+1, or c2+2 could run off the end of a
page boundary and hence cause a page fault when followed). (If
the CPU has a speculative fetch where such a fault is not
taken until the value is used, we can go back to the parallel
loading method.)

True, most novices probably would have used && instead of &; heck,
Mikhail seems fairly bright but he managed to turn the &s I wrote into
&&s. I used & intentionally to prevent short-circuit evaluation, which
would actually be harmful in this scenario.

Most modern CPUs will do speculative loads for the six chars, but doing
the tests in series, with branches in between, will still kill
performance due to lower IPC and horrible branch misprediction penalties
(up to 20 cycles per branch on some CPUs). Not that this is strictly
on-topic anyways, but IMHO discussing how to make portable code as fast
as unportable code should be.
Undoubtedly true for some processors, but not for all.
My benchmarks showed the && version running slightly
faster than the & version in all cases (by 10%-20%
or so, depending on the case). FWIW.
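
For reference, here are the two variants under discussion side by side. This is a sketch with invented names (ceq_and, ceq_andand); both return the same result and differ only in whether all six characters are always read.

```c
/* Bitwise &: every comparison is evaluated, no branches are required. */
int ceq_and(const char *c1, const char *c2)
{
    return (c1[0] == c2[0]) & (c1[1] == c2[1]) & (c1[2] == c2[2]);
}

/* Logical &&: evaluation stops at the first mismatch. */
int ceq_andand(const char *c1, const char *c2)
{
    return c1[0] == c2[0] && c1[1] == c2[1] && c1[2] == c2[2];
}
```

Because ceq_and always reads three characters from each argument, both buffers must be at least three bytes long. Which version runs faster depends on the compiler and CPU, as the conflicting measurements in this thread show.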

Aug 24 '06 #74
On Fri, 18 Aug 2006 00:24:09 GMT, Keith Thompson <ks***@mib.org>
wrote:
Mikhail Teterin <us****@aldan.algebra.com> writes:
| So, comparing, say, 4-char arrays (like currency codes) can NOT be done in
| the following way?
|
| typedef union {
| char acCUR[4];
| int32_t iCUR;
| } xCUR;
<snip using iCUR>
In this case, it seems to me that there are solutions better than
either using strcmp() or pretending that a char[4] is an int32_t.
You can probably achieve this by storing and comparing the currency
codes *as integers*. One simple way to do this is to compute the
numeric codes arithmetically from the string values. I think you said
that all the currency codes are two characters; if so, it's as simple
He said they are 3 characters plus null, hence the int32=4*char test
would -- if not for that danged UB -- always give the same equality
result as strcmp, i.e. there will never be any trailing possibly
garbage byte(s) ignored by strcmp but included in int32.

Unsurprisingly, as the applicable official standard for interchange,
ISO 4217, defines codes of three characters = 2 characters country
code (mostly ISO 3166) + one letter allowing for multiple (i.e.
replacement) currencies, and IME is often used internally as well.
More relevant to the discussion here, ISO 4217 also provides numeric
codes which will fit easily in a (minimal standard) 16-bit short,
which are equally official but IME not as widely used.
as:

numeric_code = (string_code[0] << CHAR_BIT) + string_code[1];
<snip>
This avoids the need for strcmp() *and* it doesn't depend on type
punning.
But I suspect it isn't very likely to be more efficient than strcmp(),
which was the stated goal. (Setting aside the usual discussions,
already had, about whether/when/how to optimize.)
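
Extending the two-character formula quoted above to the three-letter ISO 4217 codes mentioned here, a sketch might look as follows. currency_key is a hypothetical helper, and the fixed 8-bit shifts assume CHAR_BIT == 8.

```c
#include <stdint.h>

/* Build an integer key from the first three characters of a code.
   Assumes CHAR_BIT == 8, so each character fits in one 8-bit field. */
uint32_t currency_key(const char code[3])
{
    return ((uint32_t)(unsigned char)code[0] << 16)
         | ((uint32_t)(unsigned char)code[1] << 8)
         |  (uint32_t)(unsigned char)code[2];
}
```

Keys built this way can be compared with == and do not depend on byte order or type punning, at the cost of computing the key once per code.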

- David.Thompson1 at worldnet.att.net
Aug 28 '06 #75
