Storage of char in 64 bit machine

Hi all,

I have a simple definition in a C file, something like this.

int main(void)
{
    char a;
    /* ... */
    int k;
    return 0;
}

Since a char is 8 bits, how is it stored on a 64-bit machine? If it is
word-aligned, what about the rest of the bytes? What about retrieval of
the char a: will it be expensive? Is it expensive w.r.t. read or
write?

Thanx and Regards,
Aruna

Aug 14 '06
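
To make the question concrete, here is a minimal sketch of what C itself guarantees and what is left to the implementation. The struct and function names are made up for illustration. A struct is used instead of the two locals because the compiler is free to reorder locals or keep them in registers, so their layout cannot be inspected reliably. sizeof(char) is always 1 and a char can live at any address; any unused bytes after it come from the alignment requirement of the next object, not from the char. On most mainstream 64-bit hardware single-byte loads and stores are supported directly, but that is a property of the machine, not of the language.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical struct mirroring the two locals in the question. */
struct sample {
    char a;
    int  k;
};

/* Bytes of padding the implementation inserts between a and k
   (typically 3 where int is 4 bytes with 4-byte alignment, but that
   is implementation-defined). */
static size_t padding_after_a(void)
{
    size_t gap = offsetof(struct sample, k) - offsetof(struct sample, a);
    assert(gap >= sizeof(char));
    return gap - sizeof(char);
}
```

Calling padding_after_a() from a test program and printing the result shows how many bytes are left unused after the char on a given platform.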
Mikhail Teterin wrote:
Stephen Sprunk wrote:
The implementation is likely to have a very, very clever strcmp() that
will perform at least as well as your code (possibly doing the same thing
internally, if it's known to be safe) and likely even better if the
compiler is reasonably modern due to special knowledge and treatment of
common functions/idioms.

Well, here are some benchmarks comparing the use of strcmp() to compare
short character strings (4 characters).
You shouldn't fall for the "strawman" argument, especially when it's
clearly wrong. Obviously int32_t comparison will annihilate strcmp()
in real world code regardless of how good the compiler is. The only
chance the compiler has is to support cross-file inlining, constant
propagation, and a kind of global string analysis that just isn't going
to yield fruitful optimizations in general. And that would be just to
try to pull even with integer comparison. If you used memcmp() instead
of strcmp(), things *might* be different.

Benching this is unnecessary. *Any* inner loop that contains a str*
function is always going to be way slower than any reasonably
reconsidered alternative. That's a rule of thumb people should just
always be aware of. In many cases, the std C library is actually
behind by actual orders of complexity. Here it's slower because of the
poor, not really parallelizable str* function design (it ends up
pointlessly checking each of the 4 characters for '\0' no matter what).
[...] It seems, that for the limited cases like this -- when the strings are of
the same length and fit nicely into an integer type -- treating them as such
is hugely beneficial. And, contrary to authoritative assertions posted in
this thread, the compiler is NOT able to detect such cases.
These "authorities" are overstepping their bounds. They have decided
that this is a newsgroup just about the C standard. Practical C
programming is way beyond both the scope of this newsgroup and those so
called authorities. Their brains have, over the years, become
hardwired in a very specific way -- if you ask the question "how do I
improve the performance of some C code" they will definitively answer
"you can't; C is not characterizable in terms of performance and
furthermore your efforts will always be made irrelevant because of how
good your compiler is, and besides you are premature in your attempts
to optimize". It doesn't matter *how* you ask the question, they will
always answer the question that way. So you should not take their
naysaying too seriously, they are just replaying a
keyboard/brain-macro.

The issue with your "trick" is that it's not portable. But that's
typical of anything useful that most people do in the C language.
Otherwise, yes of course it will lead to an enormous performance
improvement (after all you are removing a str* from an inner loop.)

To make it portable, you probably want to try to isolate the *system*
through ifdef's. The reason is that you need to isolate *systems* that
align things on at most 32-bit boundaries. The exceptions will be
characterized by system. Then there's the whole sizeof(int32_t) issue
-- you probably just want to deal with 3 cases of 1, 2, 4 and forget
the other ones. Then, since you know that these strings are exactly 3
characters, you should use *memcmp()*, not strcmp() (it's a little
faster) on the fallback cases.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Aug 17 '06 #41
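
The memcmp() fallback suggested in the post above can be sketched as follows. This is only an illustration: cur_equal and CUR_LEN are hypothetical names, and it assumes every code is stored in a 4-byte buffer (3 characters plus the terminating '\0'). Because the length is a compile-time constant, many compilers can expand such a memcmp() inline, though that is not guaranteed.

```c
#include <assert.h>
#include <string.h>   /* memcmp */

#define CUR_LEN 4     /* 3 characters plus the terminating '\0' (assumed) */

/* Returns nonzero when two fixed-size currency codes are identical.
   Unlike strcmp(), memcmp() compares a known number of bytes and
   never has to look for the terminator. */
static int cur_equal(const char a[CUR_LEN], const char b[CUR_LEN])
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, CUR_LEN) == 0;
}
```

For example, cur_equal("USD", "USD") is nonzero and cur_equal("USD", "EUR") is zero; both string literals occupy exactly 4 bytes including the terminator.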
Eric Sosman wrote:
we******@gmail.com wrote:
Mikhail Teterin wrote:
>So, comparing, say, 4-char arrays (like currency codes) can NOT be done in
the following way?

typedef union {
char acCUR[4];
int32_t iCUR;
} xCUR;
[...]

Having to call a strcmp() in such cases seems like a bad waste to me, but I
don't see, how the compiler could possibly optimize such a code without the
trick above...
The compiler cannot do that optimization because it's not correct.
strcmp() stops executing its inner loop once it reads a '\0'. I.e.,
it's possible for the strcmp()'s to be equal where the int32_t's are not
equal. Also the compiler is allowed to align struct entries as it
likes. So on a 64 bit big endian system, the int32_t might not
intersect with any of the 4 acCUR[] characters.

Agree with the first part but not with the second. Look
again: it's not a struct, but a union.
If the system is a 64 bit one, then it might force all entries to be
aligned to 64 bit boundaries. Then if it's big-endian, it might throw
the relevant bits of the int32_t entry into the high 4 bytes of the 8
byte word. I.e., the acCUR contents might just be overlapping the
*padding* that's put in front of the 32-bit entry. So what exactly do
I have to look again at?

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Aug 17 '06 #42
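
The first objection in the post above (strcmp() stops at the first '\0', while a comparison of all four bytes does not) can be demonstrated with a small sketch. The helper names and buffer contents are hypothetical; memcmp() over all four bytes stands in for the int32_t comparison, since it looks at exactly the same bytes.

```c
#include <assert.h>
#include <string.h>   /* strcmp, memcmp */

/* Two 4-byte buffers holding the same string "US" but different
   bytes after the terminator. */
static const char code_a[4] = { 'U', 'S', '\0', 'A' };
static const char code_b[4] = { 'U', 'S', '\0', 'B' };

/* String comparison: only looks at bytes up to the first '\0'. */
static int same_as_strings(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}

/* Whole-buffer comparison: looks at all 4 bytes, as an integer
   comparison of the same storage would. */
static int same_as_blocks(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, 4) == 0;
}
```

Here same_as_strings(code_a, code_b) is nonzero while same_as_blocks(code_a, code_b) is zero, so the two comparisons are only interchangeable if the bytes after the terminator are always set consistently, for example by zeroing every buffer before use.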
we******@gmail.com wrote:
Eric Sosman wrote:
>>we******@gmail.com wrote:
>>>Mikhail Teterin wrote:

So, comparing, say, 4-char arrays (like currency codes) can NOT be done in
the following way?

typedef union {
char acCUR[4];
int32_t iCUR;
} xCUR;
[...]

[...] Also the compiler is allowed to align struct entries as it
likes. So on a 64 bit big endian system, the int32_t might not
intersect with any of the 4 acCUR[] characters.

Agree with the first part but not with the second. Look
again: it's not a struct, but a union.

If the system is a 64 bit one, then it might force all entries to be
aligned to 64 bit boundaries. Then if it's big-endian, it might throw
the relevant bits of the int32_t entry into the high 4 bytes of the 8
byte word. I.e., the acCUR contents might just be overlapping the
*padding* that's put in front of the 32-bit entry. So what exactly do
I have to look again at?
You could look at 6.5.8/5:

"[...] All pointers to members of the same union
object compare equal."

... and at 6.7.2.1/14:

"[...] A pointer to a union object, suitably converted,
points to each of its members [...] and vice versa."

That is, there is no padding at the start of a union. The
four bytes (assuming CHAR_BIT==8) of the int32_t must occupy
the same four bytes as the char[4] array. If the union as a
whole is padded to eight bytes, the extra four are after the
int32_t and after the char[4], never before them.

Things would be different if he'd used int_least32_t or
int_fast32_t. With the exact-width int32_t he's on solid
ground (still assuming CHAR_BIT==8) as far as the overlap is
concerned, although he's on quicksand in other respects.

--
Eric Sosman
es*****@acm-dot-org.invalid
Aug 17 '06 #43
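
The layout guarantee quoted above can be checked directly. The following is a minimal sketch using the union from the thread; members_share_start is a hypothetical helper name, and the code assumes the implementation provides int32_t.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* int32_t */

/* The union from the thread. */
typedef union {
    char    acCUR[4];
    int32_t iCUR;
} xCUR;

/* Returns nonzero when both members start at the address of the
   union itself, as 6.7.2.1 requires. */
static int members_share_start(const xCUR *u)
{
    assert(u != NULL);
    return (const void *)u == (const void *)u->acCUR
        && (const void *)u == (const void *)&u->iCUR;
}
```

On a conforming implementation offsetof(xCUR, acCUR) and offsetof(xCUR, iCUR) are both 0, whatever sizeof(xCUR) turns out to be; any padding can only come after the members.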
Mikhail Teterin wrote:
Keith Thompson wrote:
>Didn't we just go over this? Is there some reason you couldn't
have posted the C code as part of your article?

We did. And I ended up convinced, that only inertia (and the desire
to force a newcomer to obey the rules of the club), are what makes
this an issue in the first place.

People with news-readers, that are not MIME-aware will just see
these textual attachments as part of the article.

MIME-aware news-readers will be able to handle them better this way...

Sorry, if it were one file, I would've inlined it, but three --
that's just too much trouble.
I am restraining myself. Don't you realize there are good reasons
for these rules? Many systems will automatically strip any
attachments from news articles, or even destroy the article
altogether.

No, we don't automatically get together and harass all newbies for
the pure unadulterated pleasure of plaguing them. Yes, you are
suffering from paranoia.

--
"The power of the Executive to cast a man into prison without
formulating any charge known to the law, and particularly to
deny him the judgement of his peers, is in the highest degree
odious and is the foundation of all totalitarian government
whether Nazi or Communist." -- W. Churchill, Nov 21, 1943
Aug 17 '06 #44
we******@gmail.com wrote:
>I know. Currencies, however, are all 3-character strings (plus the
terminating '\0'). Thus they are perfectly suited to be treated as
int32_t, when convenient.

If that is true, then in fact this is a useful performance boost, but
it's platform specific. Personally, I would just capture it like this:
*((int32_t *) &currency) rather than bothering with the union.
A plain string would not necessarily align the way an int32_t would
align, would it? This can cause nasty SIGBUS-es...
>That it is not 100% portable is already rammed into me by the friendly
folks on this board. I'd like to know an example of the actual
hardware/compiler combo, where it would not work, though...

I am pretty sure there are real 64 bit systems (though likely they are
marginal) that will fail to do your trick correctly. Not AMD64, but
some old Crays or Sparc64s might in fact fail (they need to be big
Endian, and align struct/union entries to 64 bits).
That's why I used the union -- doesn't it ensure correctness even on these
8-byte aligning platforms? I thought my only problem was with finding the
correct integer type to cover exactly the 4 characters.
And obviously those silly DSPs that don't support int32_t's would just
fail to compile your code.
Compile-time failure is acceptable -- it is the run-time, that worries me.

-mi

Aug 17 '06 #45
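
One common way around the alignment worry raised above, which is not something proposed in the thread, is to copy the four bytes into an integer with memcpy() instead of casting the pointer. The sketch below uses a hypothetical helper cur_key and assumes CHAR_BIT==8, so that uint32_t covers exactly four chars. The resulting value depends on byte order, so it is only meaningful for equality tests within one program.

```c
#include <assert.h>
#include <stdint.h>   /* uint32_t */
#include <string.h>   /* memcpy */

/* Builds a 32-bit key from a 4-byte currency buffer (3 characters
   plus '\0'). memcpy() works for any alignment of src, unlike
   *(int32_t *)src, which may trap on strict-alignment hardware. */
static uint32_t cur_key(const char *src)
{
    uint32_t key;
    assert(src != NULL);
    memcpy(&key, src, sizeof key);
    return key;
}
```

Two codes can then be compared with cur_key(a) == cur_key(b) even when a or b sits at an odd address inside a larger buffer.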
Eric Sosman wrote:
Elsethread you have posted an attempt to quantify HOW MUCH,
and your results suggest that the particular "cool" trick you
favor will save ...

0.00000000713037 seconds per comparison.
It is 4-6 times faster -- and it is the relative, rather than the absolute,
difference that counts.

A human being can not distinguish a millisecond from a microsecond.

Until the operation needs to be repeated a million times, that is. Because
a millisecond becomes 16 minutes, while a microsecond -- only a second.

An EOD (End-of-Day) process in a big bank/hedge fund makes (sometimes) many
millions of such currency comparisons as well as assignments (which don't
_really_ require a memcpy/strcpy), etc.

Using pessimal code for the sake of maintaining portability to Digital
Signal Processing boards is even more wasteful than owning an SUV for the
sake of an imaginary (but possible) once-in-a-lifetime off-road ride.
Mikhail, I can see your error and understand it and sympathize
Thank you, thank you. Agreeing to disagree.

-mi
Aug 17 '06 #46
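
The arithmetic behind both sides of this exchange is easy to check. A minimal sketch follows, with total_seconds as a hypothetical helper: a million operations at one millisecond each take 1000 seconds (about 16.7 minutes), a million at one microsecond each take 1 second, and a million comparisons that each save the 0.00000000713037 seconds quoted above save roughly 0.007 seconds in total.

```c
#include <assert.h>

/* Total time in seconds for n repetitions of an operation that
   takes per_op seconds each. */
static double total_seconds(double n, double per_op)
{
    assert(n >= 0.0 && per_op >= 0.0);
    return n * per_op;
}
```

For instance, total_seconds(1e6, 1e-3) gives the 1000-second figure and total_seconds(1e6, 7.13037e-9) gives the total saving for a million comparisons.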

Mikhail Teterin wrote:
Keith Thompson wrote:
Didn't we just go over this? Is there some reason you couldn't have
posted the C code as part of your article?

We did. And I ended up convinced, that only inertia (and the desire to force
a newcomer to obey the rules of the club), are what makes this an issue in
the first place.
What sort of response would have convinced you otherwise?

If the answer is, "There isn't one," then there doesn't
seem to be much point in talking to you, does there?

Aug 17 '06 #47
In article <16****************@aldan.algebra.com>,
Mikhail Teterin <us*********@aldan.algebra.com> wrote:
>An EOD (End-of-Day) process in a big bank/hedge fund makes (sometimes) many
millions of such currency comparisons as well as assignments (which don't
_really_ require a memcpy/strcpy), etc.
>Using pessimal code for the sake of maintaining portability to Digital
Signal Processing boards is even more wasteful than owning an SUV for the
sake of an imaginary (but possible) once-in-a-lifetime off-road ride.
In that End-of-Day processing, is the bank/fund doing nothing when
the currencies do not match, or is the bank/fund doing a currency
conversion? If it is doing a conversion, then eliminate the check
by simply setting the conversion rate for all currencies to themselves
to be 1, and then unconditionally using the conversion routine
with the conversion factor indexed by both currencies.

But the point is moot, because we know that in the OP's code, the
OP is not primarily concerned with speed: the primary concern for
the OP is ease of debugging, and the code is intended by the OP to
be as fast as practical provided the ease of debugging is preserved.
If speed were the primary concern, then the OP's code would have
converted to an arbitrary currency index and then would compare the
indices "millions of times".
--
"law -- it's a commodity"
-- Andrew Ryan (The Globe and Mail, 2005/11/26)
Aug 17 '06 #48
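
The two ideas in the post above, replacing currency strings with small integer indices and using a rate table whose diagonal is 1, can be sketched like this. Everything here is hypothetical: the enum, the table and the numbers are made up for illustration and are not real exchange rates.

```c
#include <assert.h>

/* Hypothetical currency indices; a real system would have many more. */
enum currency { CUR_USD, CUR_EUR, CUR_GBP, CUR_COUNT };

/* rate[from][to]: made-up illustrative numbers, not real exchange
   rates. The diagonal is 1.0, so converting a currency to itself
   needs no special case. */
static const double rate[CUR_COUNT][CUR_COUNT] = {
    /*            USD   EUR   GBP  */
    /* USD */ { 1.00, 0.80, 0.50  },
    /* EUR */ { 1.25, 1.00, 0.625 },
    /* GBP */ { 2.00, 1.60, 1.00  },
};

/* Converts unconditionally; no "same currency?" comparison needed. */
static double convert(double amount, enum currency from, enum currency to)
{
    assert(from < CUR_COUNT && to < CUR_COUNT);
    return amount * rate[from][to];
}
```

With this layout, comparing two currencies is a comparison of two small integers, and convert(x, c, c) simply returns x, so the inner loop never needs to look at the 3-character strings at all.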
Mikhail Teterin wrote:
Keith Thompson wrote:
I, for one, will ignore any attachments posted to this newsgroup.

I have not spent much time on this group, but, so far, I have not seen
anything that would make me truly saddened by your decision...
Then you are a fool.

Brian

Aug 17 '06 #49
We went from "don't do this, you idiot, the compiler is much better at
optimizing than you will ever be" to "yeah, it is 4-6 times faster to do
things your way, but it is still not worth the effort".

I think this is considerable progress for one thread and will shut up for
a while...

-mi
Aug 17 '06 #50
