Questions about "alignment" in memory

I posted a question some time back about accessing a char array as an
array of words. In order not to overrun the char array, I padded it
with enough 0x00 bytes to ensure that when accessed as words I
wouldn't overrun the array. I was told that this is dangerous and
that there could be alignment problems if, for example, I wanted to
access the char array elements from non-even multiples of sizeof(int).
For example, if I had the array:

char a[10];

and I wanted to access the 8 bytes (a[2], a[3],..., a[8], a[9]) as the
array:

int b[2];

where (b[0] contains the data in a[2] to a[5], and b[1] contains a[6]
to a[9])
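
As a rough illustration of why the access described above is risky, here is a minimal sketch, assuming a C++11 compiler. The helper names `is_aligned_for_int` and `load_ints` are invented for this example and do not come from the post. `std::memcpy` is one common way to sidestep the alignment question; it is not something proposed in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if p is suitably aligned to be read as an int.
bool is_aligned_for_int(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(int) == 0;
}

// Hypothetical helper: copy `count` ints' worth of bytes from a source of
// any alignment into a real int array. memcpy copies byte by byte, so the
// alignment of src does not matter.
void load_ints(int* dst, const char* src, std::size_t count) {
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, count * sizeof(int));
}
```

On a platform where `int` is 4 bytes with 4-byte alignment, and with the buffer declared as `alignas(int) char a[10]`, `is_aligned_for_int(a + 2)` would be false, while `load_ints(b, a + 2, 2)` still fills `b[0]` and `b[1]` from `a[2]` through `a[9]`.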

I understand the alignment issue in this example. My question
is...can I turn this problem on its head...for example, create an
empty array of ints, then access this memory space as a char?

Here's what I'm talking about:
unsigned int* a_words;
char* a_bytes;

fstream in("myfile.dat", ios::in | ios::binary | ios::ate);
int filesize_bytes = in.tellg();
// round up: add 1 word if there is a remainder
int filesize_words = filesize_bytes / sizeof(int) + ((filesize_bytes % sizeof(int)) > 0);

a_words = new unsigned int[filesize_words](); // () zero-fills, so the padding bytes are defined
a_bytes = reinterpret_cast<char*>(a_words);

in.seekg(2, ios::beg); // note: out of (word) alignment, starts on the 3rd byte
in.read(a_bytes, filesize_bytes - 2); // bytes remaining after the 2-byte offset
in.close();

at which point the file is in memory and can be accessed as bytes (by
indexing a_bytes[0] up to a_bytes[filesize_bytes - 1]) or as words (by
indexing a_words[0] up to a_words[filesize_words - 1]).

This seems to work fine. Additionally, it shouldn't suffer potential
alignment problems since the array is defined to align with words, and
word addresses should be accessible as byte addresses, even if the
converse of this is not true.
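
A minimal sketch of that idea, assuming a C++11 compiler. The helper names `words_for` and `as_bytes` are hypothetical, and `std::vector` is swapped in for the raw `new[]` in the post only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of whole words needed to hold n_bytes bytes
// (rounds up, like the "+ remainder > 0" expression in the post).
std::size_t words_for(std::size_t n_bytes) {
    return (n_bytes + sizeof(unsigned int) - 1) / sizeof(unsigned int);
}

// Hypothetical helper: view word storage as raw bytes. Going from the
// stricter alignment (int) to the weakest one (char) is always allowed;
// the reverse direction is the one that can fail.
unsigned char* as_bytes(std::vector<unsigned int>& words) {
    assert(!words.empty());
    return reinterpret_cast<unsigned char*>(words.data());
}
```

A call such as `std::vector<unsigned int> words(words_for(filesize_bytes));` gives zero-filled storage, so any padding bytes past the end of the file data have a defined value when the buffer is later read as words.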

I can see that there will be compatibility problems with this system
if ported to a system where CHAR_BIT != 8. However, I don't care
about these systems. If I'm only doing logical operators on the bits
in the file, I don't even see any endian issues with doing this.

Thanks for the slap-in-the-face I'm sure I'll get for performing such
blasphemous operations in C++. Seriously, does this treatment
circumvent potential alignment issues?
Jul 19 '05 #1
WW
J. Campbell wrote:
I posted a question some time back about accessing a char array as an
array of words. In order not to overrun the char array, I padded it
with enough 0x00 bytes to ensure that when accessed as words I
wouldn't overrun the array. I was told that this is dangerous and
that there could be alignment problems if, for example, I wanted to
access the char array elements from non-even multiples of sizeof(int).
For example, if I had the array:

char a[10];

and I wanted to access the 8 bytes (a[2], a[3],..., a[8], a[9]) as the
array:

int b[2];

where (b[0] contains the data in a[2] to a[5], and b[1] contains a[6]
to a[9])

I understand the alignment issue in this example. My question
is...can I turn this problem on its head...for example, create an
empty array of ints, then access this memory space as a char?


Yes you can, but only with char being the "other" thing. Another solution
is to define a union, with a char and an int array inside.

--
WW aka Attila
Jul 19 '05 #2
WW wrote:
Yes you can, but only with char being the "other" thing. Another solution
is to define a union, with a char and an int array inside.

This is not guaranteed. It is implementation-defined behavior if the
value of a member of a union object is used when the most recent store
to the object was to a different member, other than structs sharing a
common initial sequence.

Many implementations do allow it.

There are more portable ways, basically shifting and or-ing the bytes
onto an int.


Brian Rodenborn
Jul 19 '05 #3
WW
Default User wrote:
WW wrote:
Yes you can, but only with char being the "other" thing. Another
solution is to define a union, with a char and an int array inside.

This is not guaranteed. It is implementation-defined behavior if the
value of a member of a union object is used when the most recent store
to the object was to a different member, other than structs sharing a
common initial sequence.


yep. But we are talking about a char and an int array so far.

--
WW aka Attila
Jul 19 '05 #4
WW wrote:

Default User wrote:
WW wrote:
Yes you can, but only with char being the "other" thing. Another
solution is to define a union, with a char and an int array inside.

This is not guaranteed. It is implementation-defined behavior if the
value of a member of a union object is used when the most recent store
to the object was to a different member, other than structs sharing a
common initial sequence.


yep. But we are talking about a char and an int array so far.

Right, which don't come under the exemption. If I got the OP's problem
right, he had a buffer of char that he wanted to convert into a series
of ints. Using unions to do so would be implementation-defined behavior
(if I'm reading the standard correctly).

Here's a way from my personal library:

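unsigned int CreateDataWord (unsigned char data[4])
{
    unsigned int dataword = 0;

    for (int i = 0; i < 4; i++)
    {
        // widen before shifting so data[0] << 24 cannot overflow a signed int
        dataword |= static_cast<unsigned int>(data[i]) << ((3 - i) * 8);
    }
    return dataword;
}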
Note that this uses unsigned char for the buffer, which is guaranteed to
be safe, requires CHAR_BIT == 8, and is predicated on 32-bit int, so it
has its own nonportabilities.

Brian Rodenborn
Jul 19 '05 #5
"WW" <wo***@freemail.hu> wrote in:
yep. But we are talking about a char and an int array so far.


Thanks...<so far :-)>

Indeed, the real question is: is it SAFE to access a region of
memory, defined as other than char, as a char array...if you are aware
of the issues? Your answer indicates a cautious "yes" if you are
gentle, and make sure never to overstep the char array bounds...as
long as CHAR_BIT is the length expected. Is this interpretation
correct??

Thanks for the response...still trying to learn...6 mos into the
process...still love QB45...;-)
Jul 19 '05 #6
Default User <fi********@boeing.com.invalid> wrote in message news:<3F***************@boeing.com.invalid>...
WW wrote:
Yes you can, but only with char being the "other" thing. Another solution
is to define a union, with a char and an int array inside.

This is not guaranteed. It is implementation-defined behavior if the
value of a member of a union object is used when the most recent store
to the object was to a different member, other than structs sharing a
common initial sequence.

Many implementations do allow it.

There are more portable ways, basically shifting and or-ing the bytes
onto an int.

Brian Rodenborn


Brian,

So...you raise issue with the use of union...but what about my
original solution where I take a char array and put it into an int
array...which I then access as both an int and a char array. Are
there alignment problems with this, or are the problems more local???

I somehow get the feeling you are posting from Galveston...if this is
the case, then it explains the disarray. Cheers, ciao, and thanks in
advance for the C++ help.
Jul 19 '05 #7
"J. Campbell" <ma**********@yahoo.com> wrote in message
news:b9**************************@posting.google.com...
[...]
So...you raise issue with the use of union...but what about my
original solution where I take a char array and put it into an int
array...which I then access as both an int and a char array. Are
there alignment problems with this, or are the problems more
local???
[...]


You would need to do a reinterpret_cast, and that is not one of
the portable uses of it. So technically, no. Doing what you
suggest will result in an ill-formed program (or maybe the
behaviour is just implementation-defined). On the other hand,
it will probably work on 99% of the compilers and systems out
there. Since it would be costly to do it the "right" way, I
personally would just run with it. But that's just me, and this is
a C++ newsgroup, so if I were toeing the party line like a good
programmer, I would revile you for suggesting a program which
might possibly contravene the sacred text which is the C++
standard. Anyway, good luck.

Dave

Jul 19 '05 #8
"J. Campbell" wrote:
So...you raise issue with the use of union...but what about my
original solution where I take a char array and put it into an int
array...which I then access as both an int and a char array. Are
there alignment problems with this, or are the problems more local???

You can access any object as an array of unsigned char safely. That's
because unsigned char is guaranteed to have no trap representations. An
array of ints can be accessed as unsigned char. However, you then must
be cognizant of the endianness of the ints in the array. It's generally
kind of tricky; I've found it easier and more portable (no method is
completely portable) to use bitwise operators.
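
To see what "cognizant of endianness" means in practice, here is a small sketch. The helper names `bytes_of` and `is_little_endian` are made up for illustration, and it assumes a C++11 compiler on ordinary hardware that is either little-endian or big-endian.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: copy the object representation of `value` into
// `out`, which must have room for sizeof(unsigned int) bytes. Reading any
// object through unsigned char* is the access the paragraph above describes.
void bytes_of(unsigned int value, unsigned char* out) {
    assert(out != nullptr);
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    for (std::size_t i = 0; i < sizeof value; ++i)
        out[i] = p[i];
}

// Hypothetical helper: true when the least significant byte is stored at
// the lowest address.
bool is_little_endian() {
    unsigned char b[sizeof(unsigned int)];
    bytes_of(1u, b);
    return b[0] == 1;
}
```

The access itself is well defined on every platform; what changes from one machine to the next is which element of `out` holds which part of the value.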


Brian Rodenborn
Jul 19 '05 #9
"David B. Held" <dh***@codelogicconsulting.com> wrote in message
news:bm**********@news.astound.net...
[...]
On the other hand, it will probably work on 99% of the
compilers and systems out there.
[...]


After reading Default User's post, I realized I should have added
the caveat that it will probably work on 99% of the compilers
and systems out there *but in a generally non-portable way*.
That means that since you're reading raw bytes into an array
from a file, and assuming a certain byte order for int, the code
obviously won't work on a platform that has a different byte order.
But usually, people who do stuff like this aren't interested in
portability in the first place.

Dave

Jul 19 '05 #10
"David B. Held" wrote:
After reading Default User's post, I realized I should have added
the caveat that it will probably work on 99% of the compilers
and systems out there *but in a generally non-portable way*.
That means that since you're reading raw bytes into an array
from a file, and assuming a certain byte order for int, the code
obviously won't work on a platform that has a different byte order.
But usually, people who do stuff like this aren't interested in
portability in the first place.

Byte order is a big problem for me, because my code has to work on
Windows for desktop testing, then on the target hardware, which has a
different endianness. My methods (bitwise ops) were compatible with both
without change. You'll have an easier time finding platforms with
CHAR_BIT == 8 and 32-bit integral types.

Once you devise the packing and unpacking routines for the data words,
then all you need to deal with is the unsigned char array.
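
A matched pair of such routines might look like the sketch below. It assumes 8-bit bytes and a C++11 compiler that provides `std::uint32_t`; the names `pack_be32` and `unpack_be32` are hypothetical and not taken from anyone's library in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: pack 4 octets, most significant first, into a
// 32-bit word. Only shifts and ORs are used, so the result is the same
// on little-endian and big-endian hosts.
std::uint32_t pack_be32(const unsigned char in[4]) {
    assert(in != nullptr);
    std::uint32_t word = 0;
    for (int i = 0; i < 4; ++i)
        word = (word << 8) | (in[i] & 0xFFu);
    return word;
}

// Hypothetical helper: inverse of pack_be32. Writes the word out most
// significant octet first.
void unpack_be32(std::uint32_t word, unsigned char out[4]) {
    assert(out != nullptr);
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<unsigned char>((word >> (8 * (3 - i))) & 0xFFu);
}
```

Because neither routine ever reinterprets memory, the same source should behave identically on the desktop and on the target, which is the property described above.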


Brian Rodenborn
Jul 19 '05 #11
ma**********@yahoo.com (J. Campbell) wrote in message news:<b9**************************@posting.google.com>...
I understand the alignment issue in this example. My question
is...can I turn this problem on its head...for example, create an
empty array of ints, then access this memory space as a char?
Sure.
I can see that there will be compatibility problems with this system
if ported to a system where CHAR_BIT != 8. However, I don't care
about these systems. If I'm only doing logical operators on the bits
in the file, I don't even see any endian issues with doing this.


If you access the array as int, you will be endian-specific. Whether
you use arithmetic or logic operations makes no difference.
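
One way to reconcile the two positions, offered as my own reading rather than anything stated in the thread: a word-wide operation whose operands were both loaded from bytes in the same way gives the same bytes on any host, but as soon as a numeric constant enters the picture, the byte it touches depends on byte order. A sketch with hypothetical helper names, assuming a C++11 compiler:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: XOR one word's worth of bytes using native word
// arithmetic. Both operands are loaded the same way, so the byte order
// cancels out and the result equals a byte-by-byte XOR.
void xor_word(unsigned char* data, const unsigned char* key) {
    assert(data != nullptr && key != nullptr);
    unsigned int d, k;
    std::memcpy(&d, data, sizeof d);
    std::memcpy(&k, key, sizeof k);
    d ^= k;
    std::memcpy(data, &d, sizeof d);
}

// Hypothetical helper: index of the in-memory byte that the numeric mask
// 0xFF selects. This is 0 on little-endian hosts and
// sizeof(unsigned int) - 1 on big-endian hosts, so `word & 0xFF` picks a
// different file byte depending on the machine.
std::size_t byte_hit_by_low_mask() {
    unsigned char bytes[sizeof(unsigned int)] = {};
    unsigned int w = 0xFFu;
    std::memcpy(bytes, &w, sizeof w);
    for (std::size_t i = 0; i < sizeof w; ++i)
        if (bytes[i] != 0) return i;
    return sizeof w;  // not expected on ordinary hardware
}
```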

Sam
Jul 19 '05 #12
Default User wrote in message news:<3F***************@boeing.com.invalid>...
Here's a way from my personal library:

unsigned int CreateDataWord (unsigned char data[4])
{
unsigned int dataword = 0;

for (int i = 0; i < 4; i++)
{
dataword |= data[i] << (3-i) * 8;
}
return dataword;
}
Note that this uses unsigned char for the buffer, which is guaranteed to
be safe, requires CHAR_BIT == 8, and is predicated on 32-bit int, so it
has its own nonportabilities.

Brian Rodenborn


Thanks for the input, Brian. Regarding your function
CreateDataWord...I just want to point out that if you just want to
pack a char buffer into ints, you can do this portably while making
no assumptions about the system word size or the value of CHAR_BIT.
However, you actually need two functions...depending on how you want
to pack your word...the function you show packs the word Big
Endian (data[0] ends up in the most significant byte). Here is
compilable code that uses 2 portable versions of your function.

#include <climits>    // CHAR_BIT
#include <cstddef>    // size_t
#include <iostream>
#include <string>     // string, getline

using namespace std;

void wait();
unsigned int makeBE(unsigned char a[]);
unsigned int makeLE(unsigned char a[]);
bool endian_check();

int main(){
    const int ws = sizeof(int); // compile-time constant, so data[ws] is a valid array
    cout << "This is a " << ws * CHAR_BIT << "-bit system\n"
         << "Bytes are " << CHAR_BIT << "-bits\n"
         << "Words are " << ws << " bytes\n\n"
         << "Checking system endianness...System is ";

    if(endian_check()) cout << "Little Endian (Intel)\n\n";
    else cout << "Big Endian (Motorola)\n\n";

    unsigned char data[ws]; // Make a 1-word char array and fill it
    for(int i = 0; i < ws; ++i) data[i] = 0x41 + i;

    cout << "The " << ws << " byte sequence \"";
    for(int i = 0; i < ws; ++i) cout << data[i];
    cout << "\" (Ascii)\n"
         << "is translated to a " << ws
         << " byte integer word (hex) as:\n\n" << hex;
    cout << "Big Endian(Motorola): " << makeBE(data) << endl;
    cout << "Little Endian(Intel): " << makeLE(data) << endl << endl;
    wait();
    return 0;
}

// Big endian: data[0] becomes the most significant byte of the word.
unsigned int makeBE (unsigned char data[sizeof(int)]){
    unsigned int dataword = 0;

    // widen to unsigned int before shifting to avoid signed overflow
    for (size_t i = 0; i < sizeof(int); i++)
        dataword |= static_cast<unsigned int>(data[i])
                    << ((sizeof(int) - 1 - i) * CHAR_BIT);
    return dataword;
}

// Little endian: data[0] becomes the least significant byte of the word.
unsigned int makeLE (unsigned char data[sizeof(int)]){
    unsigned int dataword = 0;

    for (size_t i = 0; i < sizeof(int); i++)
        dataword |= static_cast<unsigned int>(data[i]) << (i * CHAR_BIT);
    return dataword;
}

bool endian_check(){
    unsigned int word = 0x1;
    unsigned char* byte = reinterpret_cast<unsigned char*>(&word);
    return byte[0] != 0; // true if LE, false if BE
}

void wait(){
    cout << "<Enter> to continue..";
    string z; getline(cin,z);
}
Jul 19 '05 #13
WW
Default User wrote:
WW wrote:

Default User wrote:
WW wrote:

Yes you can, but only with char being the "other" thing. Another
solution is to define a union, with a char and an int array inside.
This is not guaranteed. It is implementation-defined behavior if the
value of a member of a union object is used when the most recent
store to the object was to a different member, other than structs
sharing a common initial sequence.
yep. But we are talking about a char and an int array so far.

Right, which don't come under the exemption. If I got the OP's problem
right, he had a buffer of char that he wanted to convert into a series
of ints. Using unions to do so would be implementation-defined
behavior (if I'm reading the standard correctly).


Yeah, you do. Emerican Netiveness. :-)
Here's a way from my personal library:

unsigned int CreateDataWord (unsigned char data[4])
{
unsigned int dataword = 0;

for (int i = 0; i < 4; i++)
{
dataword |= data[i] << (3-i) * 8;
}
return dataword;
}

Note that this uses unsigned char for the buffer, which is guaranteed
to be safe, requires CHAR_BIT == 8, and is predicated on 32-bit int,
so it has its own nonportabilities.


Yepp... But if you did write long int, then it would be fully portable
IIRC.

--
WW aka Attila
Jul 19 '05 #14
WW wrote:
Note that this uses unsigned char for the buffer, which is guaranteed
to be safe, requires CHAR_BIT == 8, and is predicated on 32-bit int,
so it has its own nonportabilities.


Yepp... But if you did write long int, then it would be fully portable
IIRC.

Probably should have been long. My original code used our own local
guaranteed sized type, UINT_32, which is very nonportable.

Brian Rodenborn
Jul 19 '05 #15
