One for the language lawyers

Here is a commonly used technique, that will, of course, work fine on
any reasonably modern, normal hardware. But, does it pass the CLC test?

/* Assume well-formed input - of course, you can always break it by
* feeding it bad input */

struct foo { int field1, field2; char nl; } *bar;
char buffer[SOMENUMBERWHATEVERFLOATSYOURBOAT];

int main(void) {
bar = (struct foo *) buffer;
fgets(buffer,SOMENUMBERWHATEVERFLOATSYOURBOAT,stdin);
/* Now access the members of the struct (using, e.g., bar->field1).
* Note that no actual struct was ever declared - we are using
* buffer as if it were the struct */
}

Jun 27 '08 #1
On Mon, 09 Jun 2008 17:08:20 +0000, Kenny McCormack wrote:
> Here is a commonly used technique,

It is? Where have you seen it used?

> that will, of course, work fine on
> any reasonably modern, normal hardware. But, does it pass the CLC test?

No.

> /* Assume well-formed input - of course, you can always break it by
> * feeding it bad input */
>
> struct foo { int field1, field2; char nl; } *bar;

What's the nl member for?

> char buffer[SOMENUMBERWHATEVERFLOATSYOURBOAT];
>
> int main(void) {
> bar = (struct foo *) buffer;

This assumes that buffer is appropriately aligned for a struct foo. When
you access *bar, you also ignore C's aliasing rules. Both problems can be
avoided by using a union.

> fgets(buffer,SOMENUMBERWHATEVERFLOATSYOURBOAT,stdin);

Did you mean fread, or were you really asking about fgets? If you meant
fread, I don't see the point of a nl member at all. If you meant fgets, I
don't see the point of a nl member at the very end.

> /* Now access the members of the struct (using, e.g., bar->field1).
> * Note that no actual struct was ever declared - we are using
> * buffer as if it were the struct */
> }
Jun 27 '08 #2
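
The union suggested in reply #2 might look like the following minimal sketch. The names `record`, `BUFSIZE` and `read_record` are hypothetical, and the sketch assumes the input holds the raw bytes of a `struct foo` written by the same implementation. Because the union contains a `struct foo`, the whole object is aligned for it, and the struct is read through a union member instead of through a cast pointer.

```c
#include <assert.h>
#include <stdio.h>

#define BUFSIZE 64  /* hypothetical stand-in for SOMENUMBERWHATEVERFLOATSYOURBOAT */

struct foo { int field1, field2; char nl; };

/* A union is aligned for its strictest member, so `bytes` starts at an
 * address that is also suitable for a struct foo. */
union record {
    struct foo as_foo;
    char bytes[BUFSIZE];
};

/* Reads one raw record from fp into rec.
 * Returns 1 on success, 0 on a short read or error. */
int read_record(FILE *fp, union record *rec)
{
    /* The byte array must be able to hold a whole struct foo. */
    assert(sizeof rec->as_foo <= sizeof rec->bytes);
    return fread(rec->bytes, sizeof rec->as_foo, 1, fp) == 1;
}
```

After a successful call, the fields are available as `rec.as_foo.field1` and so on. This uses `fread` rather than `fgets`, on the assumption that the data is binary and the stream was opened in binary mode.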
Kenny McCormack <ga*****@xmission.xmission.com> wrote:
> Here is a commonly used technique, that will, of course, work fine on
> any reasonably modern, normal hardware. But, does it pass the CLC test?
> /* Assume well-formed input - of course, you can always break it by
> * feeding it bad input */
> struct foo { int field1, field2; char nl; } *bar;
> char buffer[SOMENUMBERWHATEVERFLOATSYOURBOAT];
> int main(void) {
> bar = (struct foo *) buffer;
> fgets(buffer,SOMENUMBERWHATEVERFLOATSYOURBOAT,stdin);
> /* Now access the members of the struct (using, e.g., bar->field1).
> * Note that no actual struct was ever declared - we are using
> * buffer as if it were the struct */
> }

As long as sizeof(struct foo) isn't smaller than
SOMENUMBERWHATEVERFLOATSYOURBOAT then there's no problem.
It's rather obfuscated and I dare to doubt that this is
a "commonly used technique", but 'buffer' is memory
you own so you can do with it whatever you want. Of
course, all hinges on your primary assumption that the
input is well-formed (it may be difficult to make it
non-well-formed for the types of members the structure
has on mainstream hardware, but there might be some
systems where certain bit-patterns don't represent ints
and thus you may run into danger of undefined behaviour).
So figuring out what's well-formed can be a bit of a
bother but as long as you do that there's no problem.

Regards, Jens
--
\ Jens Thoms Toerring ___ jt@toerring.de
\__________________________ http://toerring.de
Jun 27 '08 #3
Kenny McCormack writes:
> Here is a commonly used technique, (...)

I hope not.

> struct foo { int field1, field2; char nl; } *bar;
> char buffer[SOMENUMBERWHATEVERFLOATSYOURBOAT];
>
> int main(void) {
> bar = (struct foo *) buffer;
> fgets(buffer,SOMENUMBERWHATEVERFLOATSYOURBOAT,stdin);
> /* Now access the members of the struct (using, e.g., bar->field1).

This breaks e.g. if there is a 0x0A byte (newline) in the integer
representation of the would-be bar->field1 value. And as Harald
said, it breaks if buffer is not properly aligned for a struct foo.

Also when I see fgets() I suspect the file has been opened in text
instead of binary mode, which means there may be bugs from converting
between newline and the file system's representation of end-of-line.

--
Hallvard
Jun 27 '08 #4
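
The newline problem described in reply #4 can be demonstrated with a small sketch. The helper names `bytes_via_fgets` and `bytes_via_fread` are hypothetical. Each expects a binary stream positioned at the start of a record and reports how many bytes of that record it consumed. The sketch assumes an ASCII-based system, where newline is the byte 0x0A.

```c
#include <assert.h>
#include <stdio.h>

/* Reads a record of `len` bytes with fgets() and returns how many bytes
 * were actually consumed. fgets() stops right after the first '\n', so a
 * newline byte inside an integer's representation cuts the record short. */
size_t bytes_via_fgets(FILE *fp, size_t len)
{
    char buf[64];
    long start = ftell(fp);

    assert(len < sizeof buf && start >= 0);
    if (fgets(buf, (int)len + 1, fp) == NULL)
        return 0;
    return (size_t)(ftell(fp) - start);
}

/* Reads the same record with fread(), which treats every byte as data. */
size_t bytes_via_fread(FILE *fp, size_t len)
{
    char buf[64];

    assert(len <= sizeof buf);
    return fread(buf, 1, len, fp);
}
```

For an 8-byte record whose second byte happens to be 0x0A, `bytes_via_fgets` stops after 2 bytes, while `bytes_via_fread` returns all 8.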
> Kenny McCormack <ga*****@xmission.xmission.com> wrote:
>> Here is a commonly used technique, that will, of course, work fine on
>> any reasonably modern, normal hardware. But, does it pass the CLC test?
>> /* Assume well-formed input - of course, you can always break it by
>> * feeding it bad input */
>> struct foo { int field1, field2; char nl; } *bar;
>> char buffer[SOMENUMBERWHATEVERFLOATSYOURBOAT];
>> int main(void) {
>> bar = (struct foo *) buffer;
>> fgets(buffer,SOMENUMBERWHATEVERFLOATSYOURBOAT,stdin);
>> /* Now access the members of the struct (using, e.g., bar->field1).
>> * Note that no actual struct was ever declared - we are using
>> * buffer as if it were the struct */
>> }

In article <6b*************@mid.uni-berlin.de>,
Jens Thoms Toerring <jt@toerring.de> wrote:
> As long as sizeof(struct foo) isn't smaller than
> SOMENUMBERWHATEVERFLOATSYOURBOAT then there's no problem.

When I first built the 4.xBSD system for the SPARC, tftp broke,
precisely because it used this kind of trick. (In tftp's case,
it was a more complex variant of the "struct hack".)

> It's rather obfuscated and I dare to doubt that this is
> a "commonly used technique", but 'buffer' is memory
> you own so you can do with it whatever you want. Of
> course, all hinges on your primary assumption that the
> input is well-formed ...

More importantly, it depends on the variable "buffer" being
properly aligned for all member accesses.

This was not true on the SPARC, where the compiler put the
big buffer on an odd byte boundary.

As a quick fix, I wrapped the buffer up into a union, which
forced gcc to align the entire thing on an appropriate boundary.

The trick also works if you use malloc() to obtain the buffer.

In any case, it is not a very good idea to write the code this way,
because it places such strong constraints on what constitutes "well
formed" input. You need to make sure that these severe restrictions
on whatever uses the code are paid for by whatever benefit you are
getting from this "commonly used technique" (which, in my experience,
was used perhaps once in the entire 4.xBSD code base -- that seems
to argue against the claim that it is "commonly used").
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: gmail (figure it out) http://web.torek.net/torek/index.html
Jun 27 '08 #5
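
A different approach from the union fix described in reply #5, and not one proposed in this thread, is to leave the byte buffer alone and copy its contents into a real `struct foo` with `memcpy`. The sketch below uses the hypothetical name `foo_from_bytes`. It sidesteps the alignment problem because the destination is a properly declared object, but it still assumes the bytes were produced by the same implementation (same sizes, padding and byte order).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct foo { int field1, field2; char nl; };

/* Copies sizeof(struct foo) bytes from an arbitrarily aligned buffer into
 * a properly declared object.
 * Returns 1 on success, 0 if the buffer is too short. */
int foo_from_bytes(struct foo *out, const char *buf, size_t buflen)
{
    assert(out != NULL && buf != NULL);
    if (buflen < sizeof *out)
        return 0;
    memcpy(out, buf, sizeof *out);  /* no alignment requirement on buf */
    return 1;
}
```

The cost is one small copy per record, which compilers can often optimize well for fixed sizes.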
Chris Torek <no****@torek.net> wrote:
>> Kenny McCormack <ga*****@xmission.xmission.com> wrote:
>>> Here is a commonly used technique, that will, of course, work fine on
>>> any reasonably modern, normal hardware. But, does it pass the CLC test?
>>> /* Assume well-formed input - of course, you can always break it by
>>> * feeding it bad input */
>>> struct foo { int field1, field2; char nl; } *bar;
>>> char buffer[SOMENUMBERWHATEVERFLOATSYOURBOAT];
>>> int main(void) {
>>> bar = (struct foo *) buffer;
>>> fgets(buffer,SOMENUMBERWHATEVERFLOATSYOURBOAT,stdin);
>>> /* Now access the members of the struct (using, e.g., bar->field1).
>>> * Note that no actual struct was ever declared - we are using
>>> * buffer as if it were the struct */
>>> }
> In article <6b*************@mid.uni-berlin.de>,
> Jens Thoms Toerring <jt@toerring.de> wrote:
>> As long as sizeof(struct foo) isn't smaller than
>> SOMENUMBERWHATEVERFLOATSYOURBOAT then there's no problem.
> When I first built the 4.xBSD system for the SPARC, tftp broke,
> precisely because it used this kind of trick. (In tftp's case,
> it was a more complex variant of the "struct hack".)
>> It's rather obfuscated and I dare to doubt that this is
>> a "commonly used technique", but 'buffer' is memory
>> you own so you can do with it whatever you want. Of
>> course, all hinges on your primary assumption that the
>> input is well-formed ...
> More importantly, it depends on the variable "buffer" being
> properly aligned for all member accesses.
> This was not true on the SPARC, where the compiler put the
> big buffer on an odd byte boundary.

Yes, that's a point I forgot about. Should have known better,
having been bitten more than once by this issue when trying to port
(mostly other people's ;-) code to a different architecture. I
guess I am not too good a language lawyer ;-)

Best regards, Jens
--
\ Jens Thoms Toerring ___ jt@toerring.de
\__________________________ http://toerring.de
Jun 27 '08 #6
On Jun 10, 3:30 am, Chris Torek <nos...@torek.net> wrote:
> As a quick fix, I wrapped the buffer up into a union, which
> forced gcc to align the entire thing on an appropriate boundary.

A bit off the topic:

We can also use compiler-specific extensions to achieve the alignment
and padding requirements. In the case of gcc, __attribute__((packed))
eliminates padding in structures, and the aligned attribute can be
applied to the buffer to force its alignment.
Jun 27 '08 #7
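
The aligned attribute mentioned in reply #7 could be applied as in the following sketch. It relies on the gcc extensions `__attribute__((aligned(...)))` and `__alignof__` (also accepted by clang), so it is not portable C. `BUFSIZE`, `aligned_for_foo` and `check_buffer_alignment` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define BUFSIZE 64  /* hypothetical stand-in for SOMENUMBERWHATEVERFLOATSYOURBOAT */

struct foo { int field1, field2; char nl; };

/* gcc/clang extension: give the char array the alignment of struct foo. */
char buffer[BUFSIZE] __attribute__((aligned(__alignof__(struct foo))));

/* Returns nonzero if p is suitably aligned for a struct foo. */
int aligned_for_foo(const void *p)
{
    return (uintptr_t)p % __alignof__(struct foo) == 0;
}

/* Debug-build sanity check that the attribute took effect. */
void check_buffer_alignment(void)
{
    assert(aligned_for_foo(buffer));
}
```

This addresses only the alignment concern. Accessing the buffer through a `struct foo *` still runs into the aliasing rules raised in reply #2.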
On 9 Jun, 18:08, gaze...@xmission.xmission.com (Kenny McCormack) wrote:
> Here is a commonly used technique, that will, of course, work fine on
> any reasonably modern, normal hardware. But, does it pass the CLC test?
>
> /* Assume well-formed input - of course, you can always break it by
> * feeding it bad input */
>
> struct foo { int field1, field2; char nl; } *bar;
> char buffer[SOMENUMBERWHATEVERFLOATSYOURBOAT];
>
> int main(void) {
> bar = (struct foo *) buffer;
> fgets(buffer,SOMENUMBERWHATEVERFLOATSYOURBOAT,stdin);
> /* Now access the members of the struct (using, e.g., bar->field1).
> * Note that no actual struct was ever declared - we are using
> * buffer as if it were the struct */
> }

I used it on real systems. Now it makes me nervous.
I've seen a system break when an OS was upgraded
due to this.

To use this I'd want to be *very* sure there was an
identical system at both ends. And always would be.
--
Nick Keighley
Jun 27 '08 #8
On 10 Jun, 05:30, rahul <rahulsin...@gmail.com> wrote:
> On Jun 10, 3:30 am, Chris Torek <nos...@torek.net> wrote:
>> As a quick fix, I wrapped the buffer up into a union, which
>> forced gcc to align the entire thing on an appropriate boundary.
>
> A bit off the topic:
>
> We can also use compiler specific extensions to achieve the alignment
> and padding
> requirements. In case of gcc, __attribute__((packed)) for eliminating
> padding for structures.
> We can also use aligned attributes for buffer to coerce the alignment.

eek!!! These things are different on every compiler. And sometimes
don't exist. Some hardware cannot support it (or it becomes *very*
inefficient).

I worked on systems that turned it on and off for
each structure in a large header...

I've hunted bugs when different packed/not packed options
had been used in different object files. It *linked* fine.

--
Nick Keighley

"Almost every species in the universe has an irrational fear of
#pragma packed. But they're wrong"
Jun 27 '08 #9
Kenny the Troll wrote:
> Here is a commonly used technique, that will, of course, work fine on

How did you come to the conclusion that this technique is common?
Where did you see or hear about it?

> any reasonably modern, normal hardware. But, does it pass the CLC test?

It certainly won't work for the "unreasonably modern/antique"
"abnormal hardware/software".

> /* Assume well-formed input - of course, you can always break it by
> * feeding it bad input */

You *can't* always break it by feeding it bad input as long as it's
properly programmed.

> struct foo { int field1, field2; char nl; } *bar;
> char buffer[SOMENUMBERWHATEVERFLOATSYOURBOAT];
>
> int main(void) {
> bar = (struct foo *) buffer;
> fgets(buffer,SOMENUMBERWHATEVERFLOATSYOURBOAT,stdin);

You don't check the return value of fgets, nor do you include <stdio.h>
for it.

> /* Now access the members of the struct (using, e.g., bar->field1).

Where? I don't see the code accessing said members.

> * Note that no actual struct was ever declared - we are using

There was - struct foo { int field1, field2; char nl; }.

> * buffer as if it were the struct */

No you are not.

> }

You don't return a value from main().
Jun 27 '08 #10

"Nick Keighley" <ni******************@hotmail.com> wrote in message
news:68**********************************@l64g2000hse.googlegroups.com...
> On 10 Jun, 05:30, rahul <rahulsin...@gmail.com> wrote:
>> On Jun 10, 3:30 am, Chris Torek <nos...@torek.net> wrote:
>>> As a quick fix, I wrapped the buffer up into a union, which
>>> forced gcc to align the entire thing on an appropriate boundary.
>>
>> A bit off the topic:
>>
>> We can also use compiler specific extensions to achieve the alignment
>> and padding
>> requirements. In case of gcc, __attribute__((packed)) for eliminating
>> padding for structures.
>> We can also use aligned attributes for buffer to coerce the alignment.
>
> eek!!! These things are different on every compiler. And sometimes
> don't exist. Some hardware cannot support it (or it becomes *very*
> inefficient).

*very* inefficient is *very* relative. It all depends on the structure of
your code. So I would not worry about the efficiency aspect of unaligned
access, only about the correctness aspect :)

Jun 27 '08 #11
