1. Will simply taking the address of a non-aligned variable (but not
dereferencing it) produce undefined behavior?
If it does, are there any known architectures on which this fails?
2. Is char[MB_CUR_MAX] large enough to hold any arbitrary multibyte char
beginning and ending in the initial shift state? I didn't quite understand
the standard (WG14/N869) 7.1.1 #7 (footnote 141).
3. Must the portable characters on the default "C" locale have the same
bit representation when used on an extended locale? IOW, must extended
locales be backwards compatible with the "C" locale?
4. Can the first byte of a shift-byte sequence in a multibyte string have
the same representation as any character in the portable character set
when in the initial shift state?
Thanks.
Thomas Matthews wrote:
Jin wrote:
1. Will simply taking the address of a non-aligned variable (but not dereferencing it) produce undefined behavior?
Not according to the _Standard_. Alignment restrictions are an implementation detail, not one of the standard.
No.
Code which relies on there being no alignment requirements
has undefined behavior.
When you have code that relies on
there being no alignment requirements,
all you know is that either it will work or it won't.
If it's going to fail from violating alignment requirements,
then you have no idea how it will fail: that's undefined behavior.
--
pete
In article <op**************@news.starhub.net.sg>, Jin <-> wrote:
1. Will simply taking the address of a non-aligned variable (but not dereferencing it) produce undefined behavior? If it does, are there any known architectures on which this fails?
How would you generate a non-aligned variable without invoking undefined
behavior?
(It's the implementation's responsibility to make sure any variables
it gives you are aligned appropriately, so if taking a pointer to them
gives you an unaligned pointer that would indicate a problem with the
implementation.)
dave
--
Dave Vandervies dj******@csclub.uwaterloo.ca
Note that printf() could be very useful on a microwave oven.
--Richard Heathfield in comp.lang.c
Dave Vandervies <dj******@csclub.uwaterloo.ca> scribbled the following:
In article <op**************@news.starhub.net.sg>, Jin <-> wrote:
1. Will simply taking the address of a non-aligned variable (but not dereferencing it) produce undefined behavior? If it does, are there any known architectures on which this fails?
How would you generate a non-aligned variable without invoking undefined behavior? (It's the implementation's responsibility to make sure any variables it gives you are aligned appropriately, so if taking a pointer to them gives you an unaligned pointer that would indicate a problem with the implementation.)
I infer from your message that this:
int i[2];
(int *)((char *)i+1);
causes undefined behaviour.
--
/-- Joona Palaste (pa*****@cc.helsinki.fi) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"There's no business like slow business."
- Tailgunner
In article <br**********@oravannahka.helsinki.fi>,
Joona I Palaste <pa*****@cc.helsinki.fi> wrote:
Dave Vandervies <dj******@csclub.uwaterloo.ca> scribbled the following:
In article <op**************@news.starhub.net.sg>, Jin <-> wrote:
1. Will simply taking the address of a non-aligned variable (but not dereferencing it) produce undefined behavior? If it does, are there any known architectures on which this fails?
How would you generate a non-aligned variable without invoking undefined behavior? (It's the implementation's responsibility to make sure any variables it gives you are aligned appropriately, so if taking a pointer to them gives you an unaligned pointer that would indicate a problem with the implementation.)
I infer from your message that this: int i[2]; (int *)((char *)i+1); causes undefined behaviour.
I'd have to grovel through the standard (well, my copy of N869) to
be sure, but I think "might cause undefined behavior by generating an
unaligned pointer" is more accurate.
But it's not a pointer to a non-aligned variable generated by the
implementation; it's a (probably non-aligned) pointer to (sizeof(int)-1)
bytes of one int and 1 byte of another int, within a properly aligned
variable (array of 2 ints) generated by the implementation.
A slightly more reasonable way might be:
--------
double foo=SOME_WELL_DEFINED_VALUE;
char *bar=malloc(sizeof foo + 1); /*would be checked in real code, of course*/
double *baz;
memcpy(bar+1,&foo,sizeof foo);
/*Is this valid?*/
baz=(double *)(bar+1);
--------
but here baz is still not pointing to a non-aligned variable, it's
pointing to a non-aligned copy of a variable.
But the OP didn't ask about generating a non-aligned pointer using silly
pointer tricks; the question (possibly just poorly worded; see "pedant")
was about taking a pointer to a non-aligned variable.
(And, since silly pointer tricks require knowing what you're getting
yourself into anyways, I would like to think that they can safely be
excluded from a general question like the OP's.)
dave
--
Dave Vandervies dj******@csclub.uwaterloo.ca
[W]hen I am among non-physician Ph.D.s who go by "Doctor" then I like
to be called "Bachelor Petrofsky" in honor of my B.S..
--Al Petrofsky in comp.lang.scheme
Jin <-> writes:
1. Will simply taking the address of a non-aligned variable (but not dereferencing it) produce undefined behavior? If it does, are there any known architectures on which this fails?
As far as I know, you can't get a non-aligned variable in the first
place without invoking undefined behavior.
--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://www.sdsc.edu/~kst>
Schroedinger does Shakespeare: "To be *and* not to be"
(Note new e-mail address)
Joona I Palaste <pa*****@cc.helsinki.fi> wrote:
Dave Vandervies <dj******@csclub.uwaterloo.ca> scribbled the following:
In article <op**************@news.starhub.net.sg>, Jin <-> wrote:
1. Will simply taking the address of a non-aligned variable (but not dereferencing it) produce undefined behavior? If it does, are there any known architectures on which this fails?
How would you generate a non-aligned variable without invoking undefined behavior? (It's the implementation's responsibility to make sure any variables it gives you are aligned appropriately, so if taking a pointer to them gives you an unaligned pointer that would indicate a problem with the implementation.)
I infer from your message that this: int i[2]; (int *)((char *)i+1); causes undefined behaviour.
You have created a pointer to a region of space that is probably
not properly aligned for an int. AFAIK, this is perfectly legal.
What is not legal is actually storing an int in this space.
E.g.
$ cat e.c
int main(void)
{
    int i[2];
    int *p = (int *)((char *)i+1);
    *p = 2;
    return 0;
}
$ ./a.out
Bus Error (core dumped)
Alex
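If the bytes of an int really do have to sit at an odd offset, one common way to avoid the misaligned store shown above is to copy the value with memcpy rather than dereference a cast pointer. The sketch below illustrates that idea; the helper names store_int_at and load_int_at are invented for this example and do not come from the thread.

```c
#include <string.h>

/* Hypothetical helpers: copy the object representation of an int to or
   from an arbitrary byte address. memcpy works on bytes, so the
   address does not have to be aligned for int. */
static void store_int_at(unsigned char *dst, int value)
{
    memcpy(dst, &value, sizeof value);
}

static int load_int_at(const unsigned char *src)
{
    int value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

Given `unsigned char buf[2 * sizeof(int)];`, calling `store_int_at(buf + 1, 2)` and then `load_int_at(buf + 1)` gives back 2 without ever forming a misaligned int pointer.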
On Tue, 16 Dec 2003 22:10:37 GMT, Keith Thompson <ks***@mib.org> wrote:
Jin <-> writes:
1. Will simply taking the address of a non-aligned variable (but not dereferencing it) produce undefined behavior? If it does, are there any known architectures on which this fails?
As far as I know, you can't get a non-aligned variable in the first place without invoking undefined behavior.
unsigned char a[16];
unsigned int *b = (unsigned int*)&a[1];
Not really a "variable" in the strict sense, but you get the idea.
Jin wrote:
On Tue, 16 Dec 2003 22:10:37 GMT, Keith Thompson <ks***@mib.org> wrote:
Jin <-> writes:
1. Will simply taking the address of a non-aligned variable (but not dereferencing it) produce undefined behavior? If it does, are there any known architectures on which this fails?
As far as I know, you can't get a non-aligned variable in the first place without invoking undefined behavior.
unsigned char a[16]; unsigned int *b = (unsigned int*)&a[1];
Not really a "variable" in the strict sense, but you get the idea.
That looks like an example of undefined behavior to me.
--
pete
On Tue, 16 Dec 2003 16:00:20 GMT, Thomas Matthews
<Th****************************@sbcglobal.net> wrote in comp.lang.c:
Jin wrote:
1. Will simply taking the address of a non-aligned variable (but not dereferencing it) produce undefined behavior?
Not according to the _Standard_. Alignment restrictions are an implementation detail, not one of the standard. My understanding is that accessing data that is not aligned will just take more fetches by the processor.
You are very, very wrong here. Some platforms, such as Intel x86,
will perform additional memory cycles, to be sure.
But there are at least two other types of responses in current
architectures, particularly RISC and/or DSP:
1. The low bits of the pointer are merely ignored. On this type of
architecture, an attempt to access a 32-bit value at any of the
addresses 0x1000, 0x1001, 0x1002, or 0x1003 picks up the four
bytes at 0x1000 through 0x1003 inclusive, even if you expected a read
of a 32-bit value through a pointer containing 0x1003 to pick up
that byte plus 0x1004, 0x1005, and 0x1006.
2. The processor platform performs automatic alignment checks and
generates a trap or exception if a multi-byte object is accessed at an
address with incorrect alignment. If there is an operating system
involved, it generally terminates the program that did this with
something called "sigbus" or some such.
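The two cases above can be modelled on plain integer addresses. This is only a sketch: is_aligned and masked_address are hypothetical names, the alignment is assumed to be a power of two, and how a real pointer maps to a uintptr_t value is implementation-defined.

```c
#include <stdint.h>

/* Nonzero if addr is a multiple of align.
   Assumption: align is a power of two. */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
    return (addr & (align - 1)) == 0;
}

/* Case 1: the address effectively used by hardware that ignores the
   low bits of a misaligned pointer. */
static uintptr_t masked_address(uintptr_t addr, uintptr_t align)
{
    return addr & ~(align - 1);
}
```

For a 4-byte alignment, `masked_address(0x1003, 4)` gives 0x1000, which matches the behaviour described in case 1, while `is_aligned(0x1003, 4)` is 0, the condition under which the hardware in case 2 would trap.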
--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
On Wed, 17 Dec 2003 02:53:20 GMT, pete <pf*****@mindspring.com> wrote
in comp.lang.c:
Jin wrote:
On Tue, 16 Dec 2003 22:10:37 GMT, Keith Thompson <ks***@mib.org> wrote:
Jin <-> writes:
1. Will simply taking the address of a non-aligned variable (but not dereferencing it) produce undefined behavior? If it does, are there any known architectures on which this fails?
As far as I know, you can't get a non-aligned variable in the first place without invoking undefined behavior.
unsigned char a[16]; unsigned int *b = (unsigned int*)&a[1];
Not really a "variable" in the strict sense, but you get the idea.
That looks like an example of undefined behavior to me.
Look again, it's perfectly well defined. Can't fail or trap.
Using the pointer to read or write a value of type int, on the other
hand, can fail or trap.
--
Jack Klein
Jack Klein <ja*******@spamcop.net> writes:
On Wed, 17 Dec 2003 02:53:20 GMT, pete <pf*****@mindspring.com> wrote in comp.lang.c:
Jin wrote:
On Tue, 16 Dec 2003 22:10:37 GMT, Keith Thompson <ks***@mib.org> wrote:
Jin <-> writes:
1. Will simply taking the address of a non-aligned variable (but not dereferencing it) produce undefined behavior? If it does, are there any known architectures on which this fails?
As far as I know, you can't get a non-aligned variable in the first place without invoking undefined behavior.
unsigned char a[16]; unsigned int *b = (unsigned int*)&a[1];
Not really a "variable" in the strict sense, but you get the idea.
That looks like an example of undefined behavior to me.
Look again, it's perfectly well defined. Can't fail or trap.
6.3.2.3#7: "A pointer to an object or incomplete type may be converted to
a pointer to a different object or incomplete type. If the resulting
pointer is not correctly aligned for the pointed-to type, the behavior is
undefined. [...]".
Martin
Jack Klein wrote: On Tue, 16 Dec 2003 16:00:20 GMT, Thomas Matthews <Th****************************@sbcglobal.net> wrote in comp.lang.c:
Jin wrote:
1. Will simply taking the address of a non-aligned variable (but not deferencing it) produce undefined behavior?
Not according to the _Standard_. Alignment restrictions are an implementation detail, not one of the standard. My understanding is that accessing data that is not aligned will just take more fetches by the processor.
You are very, very wrong here. Some platforms, such as Intel x86, will perform additional memory cycles, to be sure.
But there are at least two other types of responses in current architectures, particularly RISC and/or DSP:
1. The low bits of the pointer are merely ignored. On this type of architecture if you tried to access a 32 bit value at any of the addresses 0x1000, 0x1001, 0x1002, or 0x1003 would pick up the four bytes at 0x1000 through 0x1003 inclusive. Even if you expected trying to read a 32 bit value with a pointer containing 0x1003 to pick up that byte plus 0x1004, 0x1005, and 0x1006.
2. The processor platform performs automatic alignment checks and generates a trap or exception if a multi-byte object is accessed at an address with incorrect alignment. If there is an operating system involved, it generally terminates the program that did this with something called "sigbus" or some such.
#2 may also direct the trap or exception to a kernel handler that fixes
up the access with the expected results, using byte loads. Of course,
it's horrendously slow. It's sometimes used to help poorly written
(most often x86-centric) software limp along on things like ARM Linux
and such.
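What "fixing up the access using byte loads" amounts to can be illustrated in portable C. The function below is a sketch, not what any particular kernel does; the name load_u32_le is made up, and little-endian byte order and 8-bit bytes are assumptions chosen for the example.

```c
#include <stdint.h>

/* Assemble a 32-bit value from four single-byte loads.
   Byte accesses have no alignment requirement, so p may be any address.
   Assumptions: 8-bit bytes, little-endian order (lowest byte first). */
static uint32_t load_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Four loads plus shifts and ORs stand in for one word load, which gives a rough idea of why the fix-up path is so much slower than an aligned access.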
Mark F. Haigh mf*****@sbcglobal.net
Martin Dickopp wrote:
Jack Klein <ja*******@spamcop.net> writes:
On Wed, 17 Dec 2003 02:53:20 GMT, pete <pf*****@mindspring.com> wrote in comp.lang.c:
Jin wrote:
On Tue, 16 Dec 2003 22:10:37 GMT, Keith Thompson <ks***@mib.org> wrote:
Jin <-> writes:
1. Will simply taking the address of a non-aligned variable (but not dereferencing it) produce undefined behavior? If it does, are there any known architectures on which this fails?
As far as I know, you can't get a non-aligned variable in the first place without invoking undefined behavior.
unsigned char a[16]; unsigned int *b = (unsigned int*)&a[1];
Not really a "variable" in the strict sense, but you get the idea.
That looks like an example of undefined behavior to me.
Look again, it's perfectly well defined. Can't fail or trap.
6.3.2.3#7: "A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined. [...]".
Martin
AARGH!! C99 is truly a minefield.
My C89 draft says (haven't looked at my hardcopy C90 + TC1 yet, as it's
at work):
3.3.4, Semantics
[...] A pointer to an object or incomplete type may be converted to a
pointer to a different object type or a different incomplete type. The
resulting pointer might not be valid if it is improperly aligned for the
type pointed to. It is guaranteed, however, that a pointer to an object
of a given alignment may be converted to a pointer to an object of the
same alignment or a less strict alignment and back again; the result
shall compare equal to the original pointer. (An object that has
character type has the least strict alignment.)
This would appear to be directly in conflict with your quoted 6.3.2.3#7,
above. I see no reason why this change was made -- if anything, I
believe 6.3.2.3#7 should state:
"A pointer to an object or incomplete type may be converted to a pointer
to a different object or incomplete type. If the resulting pointer is
not correctly aligned (50) for the pointed-to type, the pointer is not
valid, and dereferencing the pointer is implementation-defined."
Perhaps someone with insider knowledge can comment?
Mark F. Haigh mf*****@sbcglobal.net
In article <3g******************@newssvr29.news.prodigy.com >,
"Mark F. Haigh" <mf*****@sbcglobal.ten> wrote: My C89 draft says (haven't looked at my hardcopy C90 + TC1 yet, as it's at work):
3.3.4, Semantics
[...] A pointer to an object or incomplete type may be converted to a pointer to a different object type or a different incomplete type. The resulting pointer might not be valid if it is improperly aligned for the type pointed to. It is guaranteed, however, that a pointer to an object of a given alignment may be converted to a pointer to an object of the same alignment or a less strict alignment and back again; the result shall compare equal to the original pointer. (An object that has character type has the least strict alignment.)
If you convert a correctly aligned (pointer to X) to a (pointer to Y)
and Y has less strict alignment, then the resulting pointer _will_ be
correctly aligned; that is what "less strict alignment" means. And you
can also convert it back: Not all correctly aligned (pointer to Y) are
correctly aligned (pointer to X), but those that were created by casting
a correctly aligned (pointer to X) will be correctly aligned.
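The guarantee described above can be written out as follows. This is a minimal sketch with a hypothetical helper name, round_trip; it only restates the round-trip rule and is not taken from the text of either standard.

```c
/* Convert an int pointer to the less strictly aligned char pointer
   and back again. */
static int *round_trip(int *p)
{
    char *c = (char *)p;  /* always fine: char has the least strict alignment */
    return (int *)c;      /* fine: c was produced from a correctly aligned int * */
}
```

For any `int x;`, `round_trip(&x) == &x` holds. The same cast applied to a char pointer that did not originate from an int pointer, such as `c + 1`, carries no such guarantee.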
This would appear to be directly in conflict with your quoted 6.3.2.3#7, above. I see no reason why this change was made -- if anything, I believe 6.3.2.3#7 should state:
"A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned (50) for the pointed-to type, the pointer is not valid, and dereferencing the pointer is implementation-defined."
That would make it illegal for the implementation to produce a valid,
correctly aligned, but different pointer. With "undefined behavior" this
would be ok.
Mark F. Haigh wrote:
(snip regarding unaligned access and pointers)
#2 may also direct the trap or exception to a kernel handler that fixes up the access with the expected results, using byte loads. Of course, it's horrendously slow. It's sometimes used to help poorly written (most often x86-centric) software limp along on things like ARM Linux and such.
Mostly because Fortran COMMON requires no padding bytes, so it is
easy to generate unaligned data.
(At least it used to. It may have changed by now.)
-- glen
Mark F. Haigh wrote:
(snip)
This would appear to be directly in conflict with your quoted 6.3.2.3#7, above. I see no reason why this change was made -- if anything, I believe 6.3.2.3#7 should state:
"A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned (50) for the pointed-to type, the pointer is not valid, and dereferencing the pointer is implementation-defined."
Perhaps someone with insider knowledge can comment?
I believe there are machines where char pointers and
int pointers have a different representation. On word
addressable machines that have the ability to address
parts of words, this would likely be true. The PDP-10
might be one example, though there aren't many C compilers
for it to test this on.
Still, it is likely that just creating the pointer doesn't
cause any problems, but the standard allows it to.
-- glen
1. Will simply taking the address of a non-aligned variable (but not dereferencing it) produce undefined behavior? If it does, are there any known architectures on which this fails?
unsigned char a[16]; unsigned int *b = (unsigned int*)&a[1];
Note that a[i] means *(a+i), so, strictly speaking, evaluating &a[1]
involves dereferencing that pointer (the Standard mentions this).
Also, b could point to an int boundary (there's no requirement that a[]
be aligned, and it could have started 1 byte before the int boundary).
Assuming the intent is to create a pointer that's definitely not on
an int boundary, try this:
int a;
char *a_ptr = (char *)&a;
int *b = (int *) (a_ptr+1);
On one platform I develop for (m68000 in an embedded device), this
causes a runtime panic (the device's equivalent of Windows BSOD).
(Luckily, the compiler issues a warning about "bad alignment,
potential run-time error" if you actually do this).
Despite what other posters have said, the Standard is very correct
and this is undefined behaviour, because of platforms where a hardware
exception occurs if you load a memory access register with an
inaccessible value.
As an aside, for the same reason, it's also undefined behaviour to
construct a pointer that points to something that might not be
accessible by your process, eg.
a_ptr--;
causes undefined behaviour.
Another platform I develop for has three-byte pointers. It has 16-bit int,
and a 64k code-page and a 64k data-page, so the third byte of the pointer
stores whether it's a code-page pointer or a data-page pointer.
(Obviously you can't do tricks like casting pointers to ints, on this
platform). It's not a far stretch of the imagination from this, to a
platform where char pointers and int pointers have different forms.
ol*****@inspire.net.nz (Old Wolf) writes:
unsigned char a[16]; unsigned int *b = (unsigned int*)&a[1];
Note that a[i] means *(a+i), so, strictly speaking, evaluating &a[1] involves dereferencing that pointer (the Standard mentions this).
I think you are misunderstanding 6.5.3.2#3:
| The unary & operator returns the address of its operand. If the operand
| has type ``type'', the result has type ``pointer to type''. If the
| operand is the result of a unary * operator, neither that operator nor
| the & operator is evaluated and the result is as if both were omitted,
| except that the constraints on the operators still apply and the result
| is not an lvalue. Similarly, if the operand is the result of a []
| operator, neither the & operator nor the unary * that is implied by the
| [] is evaluated and the result is as if the & operator were removed and
| the [] operator were changed to a + operator. Otherwise, the result is a
| pointer to the object or function designated by its operand.
That means the opposite of what you said: the expression `&a[1]' does
*not* behave as if `a' were dereferenced. Instead, it behaves like `a+1'.
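A small sketch of what that paragraph permits; the array size N and the helper element_address are assumptions made for illustration. Because &a[i] is treated as a + i, even the one-past-the-end address &a[N] can be formed without reading any element.

```c
#include <stddef.h>

#define N 16  /* hypothetical array size, matching the a[16] quoted above */

static unsigned char a[N];

/* Under 6.5.3.2#3, &a[i] is evaluated as a + i; neither the & nor the
   implied unary * is evaluated, so a[i] itself is never accessed. */
static unsigned char *element_address(size_t i)
{
    return &a[i];
}
```

Here `element_address(1)` compares equal to `a + 1`, and `element_address(N)` compares equal to `a + N`, the one-past-the-end pointer.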
Martin
On Wed, 17 Dec 2003 10:58:33 GMT, glen herrmannsfeldt
<ga*@ugcs.caltech.edu> wrote: Mark F. Haigh wrote:
(snip regarding unaligned access and pointers)
#2 may also direct the trap or exception to a kernel handler that fixes up the access with the expected results, using byte loads. Of course, it's horrendously slow. It's sometimes used to help poorly written (most often x86-centric) software limp along on things like ARM Linux and such.
Mostly because Fortran COMMON requires no padding bytes, so it is easy to generate unaligned data.
To be clear, the standard *requires* that there *not* be any padding.
(At least it used to. It may have changed by now.)
COMMON hasn't. But there is a new improved and often preferable
*alternative*, a MODULE containing data (and possibly routines), which
does allow alignment and also (standardly) provides better checking.
Much as even C99 still has K&R1 functions, but prototypes are better.
- David.Thompson1 at worldnet.att.net