Magic structs

I am trying to understand how ustr (http://www.and.org/ustr/design)
works. It's a string-handling library for C, and it stores strings in the
following struct, called a "magic struct":

struct Ustr
{
unsigned char data[1];
/* 0b_wxzy_nnoo =>
*
* w = allocated
* x = has size
* y = round up allocations (off == exact
* allocations)
* z = memory error
* nn = refn
* oo = lenn
*
* 0b_0000_0000 = "", const, no alloc fail, no mem
* err, refn=0, lenn=0 */
};

It seems to me that a struct Ustr will occupy only 1 byte of memory, so
how can it contain even a simple pointer to char? What's happening
here?
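
For readers following along, the comment in the struct reads as a bit layout for the first byte. Below is a minimal sketch of decoding it, assuming the letters in 0b_wxzy_nnoo run from the most significant bit down (so w is bit 7 and oo is bits 1..0). The macro and function names are made up for illustration and are not ustr's API, and the thread does not say how the two-bit nn/oo codes map to byte widths, so they are returned raw.

```c
#include <stdio.h>

/* Hypothetical masks for the 0b_wxzy_nnoo layout quoted above.
 * Assumption: the letters run from the most significant bit down,
 * in the order the layout string gives them (w, x, z, y). */
#define HDR_ALLOCATED 0x80u /* w = allocated */
#define HDR_HAS_SIZE  0x40u /* x = has size */
#define HDR_MEM_ERROR 0x20u /* z = memory error */
#define HDR_ROUND_UP  0x10u /* y = round up allocations */

/* Raw 2-bit codes; their meaning in bytes is not given in the thread. */
static unsigned hdr_refn_code(unsigned char h) { return (h >> 2) & 0x3u; }
static unsigned hdr_lenn_code(unsigned char h) { return h & 0x3u; }

/* Prints every field of one header byte on a single line. */
static void hdr_print(unsigned char h)
{
    printf("allocated=%d has_size=%d mem_error=%d round_up=%d refn=%u lenn=%u\n",
           (h & HDR_ALLOCATED) != 0, (h & HDR_HAS_SIZE) != 0,
           (h & HDR_MEM_ERROR) != 0, (h & HDR_ROUND_UP) != 0,
           hdr_refn_code(h), hdr_lenn_code(h));
}
```

Under that assumption, an all-zero byte decodes to every flag off and both codes zero, which matches the comment's description of the constant empty string.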

Aug 27 '07 #1
Name and address withheld wrote:
I am trying to understand how ustr (http://www.and.org/ustr/design)
works. Its a string-handling library for C, and it stores strings in the
following struct, called a "magic struct"

struct Ustr
{
unsigned char data[1];
/* 0b_wxzy_nnoo =>
*
* w = allocated
* x = has size
* y = round up allocations (off == exact
* allocations)
* z = memory error
* nn = refn
* oo = lenn
*
* 0b_0000_0000 = "", const, no alloc fail, no mem
* err, refn=0, lenn=0 */
};

It seems to me that a struct Ustr will only occupy 1-byte of memory, so
how can it contain even a simple pointer to char??? What's happening
here?
Search the Google Groups archive for this group on the topic 'struct hack'

Aug 27 '07 #2
Name and address withheld said:
I am trying to understand how ustr (http://www.and.org/ustr/design)
works. Its a string-handling library for C, and it stores strings in
the following struct, called a "magic struct"

struct Ustr
{
unsigned char data[1];
<comment snipped>
};

It seems to me that a struct Ustr will only occupy 1-byte of memory,
so how can it contain even a simple pointer to char???
It can't, unless pointers are only one byte wide (which is possible but
rare).
What's happening here?
Broken code is happening here.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 27 '07 #3

"Name and address withheld" <do**@spam.me> wrote in message
news:sl********************@nospam.invalid...
>I am trying to understand how ustr (http://www.and.org/ustr/design)
works. Its a string-handling library for C, and it stores strings in the
following struct, called a "magic struct"

struct Ustr
{
unsigned char data[1];
/* 0b_wxzy_nnoo =>
*
* w = allocated
* x = has size
* y = round up allocations (off == exact
* allocations)
* z = memory error
* nn = refn
* oo = lenn
*
* 0b_0000_0000 = "", const, no alloc fail, no mem
* err, refn=0, lenn=0 */
};

It seems to me that a struct Ustr will only occupy 1-byte of memory, so
how can it contain even a simple pointer to char??? What's happening
here?
struct Ustr x;

x.data is effectively a pointer to a single char. The struct doesn't need to
contain the address; the compiler can work out the address of the data array
from the address of the struct. Since data is the only member, its address
must in fact be the same as the address of the struct.
We can now play silly games with the implementation to get more than one
character in the string. For instance, if the struct sits at the start of an
uninitialised block of memory, the last member, if it is an array, can be
extended into the rest of the block. This is an unwarranted chumminess with
the implementation, and isn't to be recommended in your own code, but for
something as fundamental as a string library there is maybe a case for it.
Just maybe.
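
The address claim above can be checked directly. A small sketch (the helper names are invented for this example): C guarantees that a pointer to a struct, suitably converted, points to its first member, so the offset of data is zero and nothing needs to be stored to find it.

```c
#include <stddef.h>

struct Ustr { unsigned char data[1]; };

/* Returns 1 when the array member begins at the struct's own address.
 * C guarantees this for the first member of any struct. */
static int data_starts_at_struct(const struct Ustr *s)
{
    return (const void *)s == (const void *)s->data;
}

/* The offset of the only member is zero, so no pointer has to be stored. */
static size_t data_offset(void)
{
    return offsetof(struct Ustr, data);
}
```

Calling data_starts_at_struct(&x) on any struct Ustr x returns 1, and data_offset() returns 0.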

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Aug 27 '07 #4
Malcolm McLean said:
>
"Name and address withheld" <do**@spam.me> wrote in message
news:sl********************@nospam.invalid...
<snip>
>>
struct Ustr
{
unsigned char data[1];
<snip>
>};

It seems to me that a struct Ustr will only occupy 1-byte of memory,
so how can it contain even a simple pointer to char??? What's
happening here?
struct Ustr x;

x.data is effectively a pointer to a single char.
No, it isn't. It's an array of a single char.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 27 '07 #5

"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
news:g6******************************@bt.com...
Malcolm McLean said:
>>
x.data is effectively a pointer to a single char.

No, it isn't. It's an array of a single char.
Effectively. That means it isn't, but it can be thought of as if it is for
some purposes.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Aug 27 '07 #6
Richard Heathfield <rj*@see.sig.invalid> writes:
Malcolm McLean said:
>"Name and address withheld" <do**@spam.me> wrote in message
news:sl********************@nospam.invalid...
<snip>
>>>
struct Ustr
{
unsigned char data[1];
<snip>
>>};

It seems to me that a struct Ustr will only occupy 1-byte of memory,
so how can it contain even a simple pointer to char??? What's
happening here?
struct Ustr x;

x.data is effectively a pointer to a single char.

No, it isn't. It's an array of a single char.
Well, yes and no.

x.data as an object is a member of the structure x, which is of type
struct Ustr. x.data is of type unsigned char[1], which is of course a
one-byte array.

But x.data as an expression, unless it appears as the operand of a
unary 'sizeof' or '&' operator, is implicitly converted to a value of
type 'unsigned char*', pointing to the first element of the array. If
you're willing to be unwarrantedly chummy with the implementation, you
can use this pointer to access memory beyond the bounds of the array
itself, taking advantage of the fact that most implementations don't
do the bounds checking that they're permitted to do (and that a
compiler writer would almost certainly be unwilling to break the
"struct hack").
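
The distinction between the array object and the pointer it converts to can be sketched in a few lines. This is only an illustration with invented helper names; it stays inside the bounds of the one-element array, so nothing here relies on the struct hack itself.

```c
#include <stddef.h>

struct Ustr { unsigned char data[1]; };

/* As the operand of sizeof, x.data is the array itself: one byte. */
static size_t size_of_array_member(void)
{
    struct Ustr x = {{0}};
    return sizeof x.data;
}

/* In most other expressions the array converts to a pointer to element 0. */
static int decays_to_first_element(void)
{
    struct Ustr x = {{'a'}};
    unsigned char *p = x.data;   /* implicit array-to-pointer conversion */
    return p == &x.data[0] && *p == 'a';
}
```

size_of_array_member() returns 1, while decays_to_first_element() returns 1 because the converted pointer addresses the same byte as &x.data[0].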

Malcolm, IMHO, should have been much clearer on this point.

The comp.lang.c FAQ <http://c-faq.com/> discusses the struct hack, of
which this is a particularly odd example, in question 2.6. But I've
never seen a usage of the struct hack where the array is the only
declared member of the structure.

Code using 'struct Ustr' is likely to work, but that's certainly not
the way I would have implemented it. Instead, I'd probably just use
'unsigned char*' or perhaps 'void*'.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Aug 27 '07 #7
"Malcolm McLean" <re*******@btinternet.com> writes:
"Richard Heathfield" <rj*@see.sig.invalid> wrote in message
news:g6******************************@bt.com...
>Malcolm McLean said:
>>x.data is effectively a pointer to a single char.

No, it isn't. It's an array of a single char.
Effectively. That means it isn't, but it can be thought of as if it is for
some purposes.
Huh? I'm afraid I have no idea what that's supposed to mean.

See my other followup for what it *should* mean.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Aug 27 '07 #8

"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...
"Malcolm McLean" <re*******@btinternet.com> writes:
>Effectively. That means it isn't, but it can be thought of as if it is for
some purposes.

Huh? I'm afraid I have no idea what that's supposed to mean.
If we say "this free kick is from a position that makes it effectively a
corner" then we are saying it is not a corner (a kick awarded when the other
side puts the ball out of play behind their own line); it is a free kick (a
kick usually awarded for foul play). However, if the foul occurred in the
extreme corner of the pitch, then the kick will be from almost the same
spot, and so in terms of tactics it can be thought of as a corner. The words
"effectively a corner" mean "not a corner".
>
See my other followup for what it *should* mean.
I should maybe have been a bit clearer. Your post is better.

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Aug 27 '07 #9
Well as the author of Ustr, I should probably respond...

On Mon, 27 Aug 2007 14:02:24 -0700, Keith Thompson wrote:
Richard Heathfield <rj*@see.sig.invalid> writes:
>Malcolm McLean said:
>>"Name and address withheld" <do**@spam.me> wrote in message
news:sl********************@nospam.invalid...
<snip>
>>>>
struct Ustr
{
unsigned char data[1];
<snip>
>>>};

It seems to me that a struct Ustr will only occupy 1-byte of memory,
so how can it contain even a simple pointer to char??? What's
happening here?

struct Ustr x;

x.data is effectively a pointer to a single char.

No, it isn't. It's an array of a single char.

Well, yes and no.

x.data as an object is a member of the structure x, which is of type
struct Ustr. x.data is of type unsigned char[1], which is of course a
one-byte array.

But x.data as an expression, unless it appears as the operand of a unary
'sizeof' or '&' operator, is implicitly converted to a value of type
'unsigned char*', pointing to the first element of the array. If you're
willing to be unwarrantedly chummy with the implementation, you can use
this pointer to access memory beyond the bounds of the array itself,
taking advantage of the fact that most implementations don't do the
bounds checking that they're permitted to do (and that a compiler writer
would almost certainly be unwilling to break the "struct hack").
Right, the struct hack is _very_ well known IMNSHO ... I guess you could
argue that limiting the code to C99 and using:

struct Ustr
{
unsigned char info;
unsigned char data[];
};

...is "better" from a standards point of view, although it seemed more
natural to me to represent it as one unit.
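
For comparison, here is a rough sketch of how the C99 flexible array member form is usually allocated. The struct is renamed Ustr99 and the helper ustr99_dup is hypothetical; neither is part of ustr, and the flag byte is simply zeroed because its encoding is outside the scope of this example.

```c
#include <stdlib.h>
#include <string.h>

/* C99 flexible array member version of the layout discussed above. */
struct Ustr99 {
    unsigned char info;    /* flag byte */
    unsigned char data[];  /* storage follows the header */
};

/* Hypothetical helper: copies a C string into a freshly allocated object.
 * Returns NULL on allocation failure; the caller releases it with free(). */
static struct Ustr99 *ustr99_dup(const char *src)
{
    size_t len = strlen(src);
    struct Ustr99 *s = malloc(sizeof *s + len + 1);
    if (s == NULL)
        return NULL;
    s->info = 0;                    /* flag encoding left out of this sketch */
    memcpy(s->data, src, len + 1);  /* copies the terminating NUL as well */
    return s;
}
```

Here sizeof *s covers only the header, so the bytes for the string are added explicitly, and accessing s->data[i] within the allocated size is well defined in C99.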

Personally I'd say that "unwarrantedly chummy with the implementation"
is pretty harsh considering how likely the assumption is to fail.
Malcolm, IMHO, should have been much clearer on this point.

The comp.lang.c FAQ <http://c-faq.com/>, discusses the struct hack, of
which this is a particularly odd example in question 2.6. But I've
never seen a usage of the struct hack where the array is the only
declared member of the structure.
Right, most people when they want that just go for "char *" as the
representation ... the main reason I didn't is that I wanted to make
the compiler complain if you did:

Ustr *s1 = "abcd";

...if C had a "decent" form of typedef, I'd have just used that.
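
The type-safety point can be illustrated without provoking a compile error. A sketch, assuming a C11 compiler for _Generic; the macro and function names are invented here. Because struct Ustr is a distinct type, a string literal (which converts to char *) is not a struct Ustr *, which is why the initialisation above draws a diagnostic.

```c
struct Ustr { unsigned char data[1]; };

/* 1 if the expression has type `struct Ustr *`, 0 for anything else (C11). */
#define IS_USTR_PTR(p) _Generic((p), struct Ustr *: 1, default: 0)

/* A string literal converts to char *, which is a different type. */
static int literal_is_ustr(void)
{
    return IS_USTR_PTR("abcd");
}

/* The address of a real struct Ustr object has the expected type. */
static int object_is_ustr(void)
{
    struct Ustr u = {{0}};
    return IS_USTR_PTR(&u);
}
```

With a plain typedef of unsigned char * the two types would be the same, and the compiler would have less to complain about.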
Code using 'struct Ustr' is likely to work, but that's certainly not the
way I would have implemented it. Instead, I'd probably just use
'unsigned char*' or perhaps 'void*'.
Sure, you might want to _implement_ it that way ... but would you want
to _use_ a string API that took (unsigned char *) types? I certainly
wouldn't.

Atm. for the users of the library it looks like a "normal" string API
except that it's much more efficient than normal for small strings, and
you can easily create auto/const strings etc.

--
James Antill -- ja***@and.org
C String APIs use too much memory? ustr: length, ref count, size and
read-only/fixed. Ave. 44% overhead over strdup(), for 0-20B strings
http://www.and.org/ustr/
Aug 27 '07 #10
santosh wrote:
>
Name and address withheld wrote:
I am trying to understand how ustr (http://www.and.org/ustr/design)
works. Its a string-handling library for C,
and it stores strings in the
following struct, called a "magic struct"

struct Ustr
{
unsigned char data[1];
/* 0b_wxzy_nnoo =>
*
* w = allocated
* x = has size
* y = round up allocations (off == exact
* allocations)
* z = memory error
* nn = refn
* oo = lenn
*
* 0b_0000_0000 = "", const, no alloc fail, no mem
* err, refn=0, lenn=0 */
};

It seems to me that a struct Ustr will
only occupy 1-byte of memory, so
how can it contain even a simple pointer to char??? What's happening
here?

Search the Google Groups archive
for this group on the topic 'struct hack'
I've never been able to understand the struct hack.
It looks like when the array expression is converted to a pointer,
that pointer will have address of
the lowest addressable byte of the structure.

So why not just use the address operator on the structure instead?

--
pete
Aug 28 '07 #11
"pete" <pf*****@mindspring.com> wrote in message
news:46***********@mindspring.com...
santosh wrote:
[...]
I've never been able to understand the struct hack.
typedef struct foo_s {
char weird[];
} foo_t;
foo_t* const _this = malloc(foo_t * 10);

_this if not NULL is foo_t::weird[10]?
I am not sure about this crap either? Please correct me!

:^0

Aug 28 '07 #12
James Antill <ja***********@and.org> writes:
Well as the author of Ustr, I should probably respond...
On Mon, 27 Aug 2007 14:02:24 -0700, Keith Thompson wrote:
[...]
>But x.data as an expression, unless it appears as the operand of a unary
'sizeof' or '&' operator, is implicitly converted to a value of type
'unsigned char*', pointing to the first element of the array. If you're
willing to be unwarrantedly chummy with the implementation, you can use
this pointer to access memory beyond the bounds of the array itself,
taking advantage of the fact that most implementations don't do the
bounds checking that they're permitted to do (and that a compiler writer
would almost certainly be unwilling to break the "struct hack").

Right, the struct hack is _very_ well known IMNSHO ... I guess you could
argue that limiting the code to C99 and using:

struct Ustr
{
unsigned char info;
unsigned char data[];
};

...is "better" from a stds. POV, although it seemed more natural to me
to represent it as one unit.
I certainly wouldn't argue for using C99-specific features unless you
want to limit yourself to the few compilers that implement those
features.
Personally I'd say that "unwarrantedly chummy with the implementation"
is pretty harsh considering how likely the assumption is to fail.
Quoting question 2.6 of the comp.lang.c FAQ:

Despite its popularity, the technique is also somewhat notorious:
Dennis Ritchie has called it ``unwarranted chumminess with the C
implementation,'' and an official interpretation has deemed that
it is not strictly conforming with the C Standard, although it
does seem to work under all known implementations. (Compilers
which check array bounds carefully might issue warnings.)

So you can take it up with dmr. 8-)}

There's little doubt that attempting to access data beyond the
declared bounds of an array invokes undefined behavior. There's also
little doubt that most existing compilers will happily let you get
away with it -- thus the popularity of the struct hack.
>Malcolm, IMHO, should have been much clearer on this point.

The comp.lang.c FAQ <http://c-faq.com/>, discusses the struct hack, of
which this is a particularly odd example in question 2.6. But I've
never seen a usage of the struct hack where the array is the only
declared member of the structure.

Right, most people when they want that just go for "char *" as the
representation ... the main reason I didn't is that I wanted to make
the compiler complain if you did:

Ustr *s1 = "abcd";

...if C had a "decent" form of typedef, I'd have just used that.
And using void* would have the same problem, due to implicit
conversions.

[snip]

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Aug 28 '07 #13
On Mon, 27 Aug 2007 23:06:55 -0400, pete wrote:
santosh wrote:
>>
Name and address withheld wrote:
I am trying to understand how ustr (http://www.and.org/ustr/design)
works. Its a string-handling library for C, and it stores strings in
the
following struct, called a "magic struct"

struct Ustr
{
unsigned char data[1];
/* 0b_wxzy_nnoo =>
*
* w = allocated
* x = has size
* y = round up allocations (off == exact *
allocations)
* z = memory error
* nn = refn
* oo = lenn
*
* 0b_0000_0000 = "", const, no alloc fail, no mem * err, refn=0,
lenn=0 */
};

It seems to me that a struct Ustr will only occupy 1-byte of memory,
so
how can it contain even a simple pointer to char??? What's happening
here?

Search the Google Groups archive
for this group on the topic 'struct hack'

I've never been able to understand the struct hack. It looks like when
the array expression is converted to a pointer, that pointer will have
address of
the lowest addressable byte of the structure.

So why not just use the address operator on the structure instead?
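struct Foo1
{
int foo1;
int foo2;
Bar blah[0]; /* zero-length array: a GNU C extension; C99 spells it blah[] */
};

struct Foo2
{
int foo1;
int foo2;
};

/* option 1: the array member names the extra storage */
struct Foo1 *foo1 = malloc(sizeof(struct Foo1) + (sizeof(Bar) * 10));

foo1->blah[9];

/* option 2: no array member, so compute the address by hand;
 * arithmetic on a Bar * already scales by sizeof(Bar) */
struct Foo2 *foo2 = malloc(sizeof(struct Foo2) + (sizeof(Bar) * 10));

*((Bar *)((char *)foo2 + sizeof(struct Foo2)) + 9);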
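
The two options can be put side by side as small accessor functions. This is a sketch under stated assumptions: Bar is a stand-in typedef for int, the C99 flexible array member spelling is used instead of blah[0], and the helper names are invented for this example.

```c
#include <stdlib.h>

typedef int Bar;   /* stand-in element type for this sketch */

struct Foo1 { int foo1; int foo2; Bar blah[]; };  /* C99 flexible array member */
struct Foo2 { int foo1; int foo2; };

/* Option 1: the trailing array member names the extra storage. */
static Bar *foo1_elem(struct Foo1 *f, size_t i)
{
    return &f->blah[i];
}

/* Option 2: no array member, so the address is computed by hand.
 * Assumes sizeof(struct Foo2) is suitably aligned for Bar. */
static Bar *foo2_elem(struct Foo2 *f, size_t i)
{
    return (Bar *)((char *)f + sizeof(struct Foo2)) + i;
}
```

Both reach the same place in the block, but option 1 lets the compiler handle the offset and alignment, which is the practical answer to "why not just use the address of the structure".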

--
James Antill -- ja***@and.org
C String APIs use too much memory? ustr: length, ref count, size and
read-only/fixed. Ave. 44% overhead over strdup(), for 0-20B strings
http://www.and.org/ustr/
Aug 28 '07 #14
foo_t* const _this = malloc(foo_t * 10);

foo_t* const _this = malloc(foo_t * (sizeof(char) * 10));

?

Aug 28 '07 #15
"Chris Thomasson" <cr*****@comcast.net> wrote in message
news:aP******************************@comcast.com...
>foo_t* const _this = malloc(foo_t * 10);

foo_t* const _this = malloc(foo_t * (sizeof(char) * 10));
CRAP!

foo_t* const _this = malloc(foo_t + (sizeof(char) * 10));

? shitfire!

Aug 28 '07 #16
"Chris Thomasson" <cr*****@comcast.net> wrote in message
news:Kt******************************@comcast.com...
"Chris Thomasson" <cr*****@comcast.net> wrote in message
news:aP******************************@comcast.com...
>>foo_t* const _this = malloc(foo_t * 10);

foo_t* const _this = malloc(foo_t * (sizeof(char) * 10));

CRAP!

foo_t* const _this = malloc(foo_t + (sizeof(char) * 10));
:^0 holy crap.

foo_t* const _this = malloc(sizeof(*_thsi) + (sizeof(char) * 10));

WTF is going on with my crappy brain!

Aug 28 '07 #17
"Chris Thomasson" <cr*****@comcast.net> writes:
"Chris Thomasson" <cr*****@comcast.net> wrote in message
news:Kt******************************@comcast.com...
>"Chris Thomasson" <cr*****@comcast.net> wrote in message
news:aP******************************@comcast.com...
>>>foo_t* const _this = malloc(foo_t * 10);

foo_t* const _this = malloc(foo_t * (sizeof(char) * 10));

CRAP!

foo_t* const _this = malloc(foo_t + (sizeof(char) * 10));

:^0 holy crap.

foo_t* const _this = malloc(sizeof(*_thsi) + (sizeof(char) * 10));

WTF is going on with my crappy brain!
_thsi?

.....
Aug 28 '07 #18
In article <_P******************************@comcast.com>,
"Chris Thomasson" <cr*****@comcast.net> wrote:
>
foo_t* const _this = malloc(sizeof(*_thsi) + (sizeof(char) * 10));

sizeof(char) is always 1.
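
Pulling the corrections in this sub-thread together, the allocation might look like the sketch below. Two things are assumptions of this example and not something posted in the thread: a len member has been added, because C99 does not allow a flexible array member to be the only member of a struct, and foo_new is a hypothetical helper name.

```c
#include <stdlib.h>

typedef struct foo_s {
    size_t len;      /* added: a flexible array cannot be the only member */
    char weird[];
} foo_t;

/* Allocates a foo_t with room for n chars; returns NULL on failure. */
static foo_t *foo_new(size_t n)
{
    /* The operand of sizeof is not evaluated, so naming p here is fine;
     * sizeof(char) is 1 by definition, so n needs no scaling. */
    foo_t *p = malloc(sizeof *p + n);
    if (p != NULL)
        p->len = n;
    return p;
}
```

After foo_t *p = foo_new(10), the elements p->weird[0] through p->weird[9] are usable, and free(p) releases the header and the array together.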

--
Posted via a free Usenet account from http://www.teranews.com

Aug 28 '07 #19
"Richard" <rg****@gmail.com> wrote in message
news:9g************@homelinux.net...
[...]
>>"Chris Thomasson" <cr*****@comcast.net> wrote in message
news:aP******************************@comcast.com...
[...]
>WTF is going on with my crappy brain!
_thsi?
a crappy typo

Aug 28 '07 #20
