How to retrieve the name of the file from a FILE *

jacob navia

Recently there was a discussion in this group about
how to retrieve the file name given a FILE *.

The question raised my curiosity, and after some
research I have come up with a good implementation.

The solution is in the tutorial for lcc-win32
(http://www.cs.virginia.edu/~lcc-win32) page
331.

jacob

Nov 14 '05 #1


Joona I Palaste

jacob navia <ja***@jacob.remcomp.fr> scribbled the following:

Recently there was a discussion in this group about
how to retrieve the file name given a FILE *.

The question raised my curiosity, and after some
research I have come up with a good implementation.

The solution is in the tutorial for lcc-win32
(http://www.cs.virginia.edu/~lcc-win32) page
331.

This solution is non-portable as it makes use of the Win32 system APIs.
As usual... *sigh*... Jacob Navia failed to mention this fact.

--
/-- Joona Palaste (pa*****@cc.helsinki.fi) ------------- Finland --------\
\-------------------------------------------------------- rules! --------/
"B-but Angus! You're a dragon!"
- Mickey Mouse

Nov 14 '05 #2

jacob navia

Joona I Palaste wrote:

jacob navia <ja***@jacob.remcomp.fr> scribbled the following:
Recently there was a discussion in this group about
how to retrieve the file name given a FILE *.

The question raised my curiosity, and after some
research I have come up with a good implementation.

The solution is in the tutorial for lcc-win32
(http://www.cs.virginia.edu/~lcc-win32) page
331.

This solution is non-portable as it makes use of the Win32 system APIs.
As usual... *sigh*... Jacob Navia failed to mention this fact.

Yes, it is based on the win32 API. That is why I did not publish the
code here, just giving a pointer to the code

Nov 14 '05 #3

Joona I Palaste

jacob navia <ja***@jacob.remcomp.fr> scribbled the following:

Joona I Palaste wrote:
jacob navia <ja***@jacob.remcomp.fr> scribbled the following:
Recently there was a discussion in this group about
how to retrieve the file name given a FILE *.

The question raised my curiosity, and after some
research I have come up with a good implementation.

The solution is in the tutorial for lcc-win32
(http://www.cs.virginia.edu/~lcc-win32) page
331.

This solution is non-portable as it makes use of the Win32 system APIs.
As usual... *sigh*... Jacob Navia failed to mention this fact.

Yes, it is based on the win32 API. That is why I did not publish the
code here, just giving a pointer to the code

Why did you even give a pointer here? This newsgroup is not concerned
about system-specific code no matter where that code is written. Also,
having to download a 5-megabyte PDF file just to see two pages of C
code doesn't exactly appeal to me, and I don't think it appeals to
many other people either.

--
/-- Joona Palaste (pa*****@cc.helsinki.fi) ------------- Finland --------\
\-------------------------------------------------------- rules! --------/
"Roses are red, violets are blue, I'm a schitzophrenic and so am I."
- Bob Wiley

Nov 14 '05 #4

Dan Pop

In <41**********************@news.wanadoo.fr> jacob navia <ja***@jacob.remcomp.fr> writes:

The question raised my curiosity, and after some
research I have come up with a good implementation.

If it's not portable, you're abusing this newsgroup by advertising it.
If it's portable, my congratulations!

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union

Nov 14 '05 #5

CBFalconer

Dan Pop wrote:

jacob navia <ja***@jacob.remcomp.fr> writes:
The question raised my curiosity, and after some
research I have come up with a good implementation.

If it's not portable, you're abusing this newsgroup by advertising
it. If it's portable, my congratulations!

I feel morally certain that M.Navias method is non-portable.
However, I can conceive of a portable method, if you are willing to
replace the fopen call with something else, say fnopen. Then it
only has to cache a copy of the file name in a hashtable, keyed by
the FILE* pointer returned, and retrieval is a simple O(1) lookup.
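
A minimal sketch of the caching idea, not the hashlib package itself. The names fnopen, fnclose and fname_of are hypothetical, and a singly linked list stands in for the hash table to keep the example short, so lookup here is O(n) rather than O(1). Only equality comparison of FILE * values is used.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One cache entry: a stream and a private copy of the name it was opened with. */
struct fname_entry {
    FILE *fp;
    char *name;
    struct fname_entry *next;
};

static struct fname_entry *fname_list = NULL;

/* Hypothetical replacement for fopen() that remembers the file name. */
FILE *fnopen(const char *name, const char *mode)
{
    struct fname_entry *e;
    FILE *fp;

    assert(name != NULL && mode != NULL);
    fp = fopen(name, mode);
    if (fp == NULL)
        return NULL;
    e = malloc(sizeof *e);
    if (e != NULL)
        e->name = malloc(strlen(name) + 1);
    if (e == NULL || e->name == NULL) {   /* out of memory: undo everything */
        free(e);
        fclose(fp);
        return NULL;
    }
    strcpy(e->name, name);
    e->fp = fp;
    e->next = fname_list;
    fname_list = e;
    return fp;
}

/* Returns the cached name, or NULL for streams not opened through fnopen(). */
const char *fname_of(FILE *fp)
{
    const struct fname_entry *e;

    for (e = fname_list; e != NULL; e = e->next)
        if (e->fp == fp)
            return e->name;
    return NULL;
}

/* Hypothetical replacement for fclose() that also forgets the name. */
int fnclose(FILE *fp)
{
    struct fname_entry **pp, *e;

    for (pp = &fname_list; (e = *pp) != NULL; pp = &e->next) {
        if (e->fp == fp) {
            *pp = e->next;
            free(e->name);
            free(e);
            break;
        }
    }
    return fclose(fp);
}
```

Streams that were not opened through the wrapper, including stdin, stdout and stderr, simply yield NULL from fname_of.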

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 14 '05 #6

Mark McIntyre

On Wed, 13 Oct 2004 15:18:20 +0200, in comp.lang.c , jacob navia
<ja***@jacob.remcomp.fr> wrote:

Recently there was a discussion in this group about
how to retrieve the file name given a FILE *.

The question raised my curiosity, and after some
research I have come up with a good implementation.

The solution is in the tutorial for lcc-win32
(http://www.cs.virginia.edu/~lcc-win32) page
331.

It doesn't work. Please try not to post broken code, even in links, under
the guise of "good implementations"

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>
----== Posted via Newsfeed.Com - Unlimited-Uncensored-Secure Usenet News==----
http://www.newsfeed.com The #1 Newsgroup Service in the World! >100,000 Newsgroups
---= 19 East/West-Coast Specialized Servers - Total Privacy via Encryption =---

Nov 14 '05 #7

jacob navia

Mark McIntyre wrote:

It doesn't work. Please try not to post broken code, even in links, under
the guise of "good implementations"

Obviously I have just written it, so it is not well tested. Can you give
any examples for your assertion? In which circumstances it doesn't work?

Nov 14 '05 #8

Keith Thompson

jacob navia <ja***@jacob.remcomp.fr> writes:

Mark McIntyre wrote:
It doesn't work. Please try not to post broken code, even in links,
under the guise of "good implementations"

Obviously I have just written it, so it is not well tested. Can you
give any examples for your assertion? In which circumstances it
doesn't work?

It doesn't work on any system other than MS Windows. I can't say
whether it does work on MS Windows; that's why we have
Windows-specific newsgroups, where people can comment intelligently on
Windows-specific code.

(I also took a brief look at the code, and I noticed that it returns
the strings "stdin", "stdout", and "stderr" for the corresponding
arguments. Those are C identifiers, not file names.)

There is no portable way to retrieve a file name from a FILE* value.
The closest you can come is to remember the name when you open the
file. If you don't have control over the code that opens the file,
you'll have to either give up on getting the name or resort to
non-portable code.
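
When you do control the code that opens the file, "remembering the name" can be as simple as carrying it next to the stream. A sketch in plain ISO C; struct named_file, named_open, named_close and the NAMED_FILE_MAX limit are all made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAMED_FILE_MAX 260   /* arbitrary limit chosen for this sketch */

/* A stream bundled with the name it was opened under. */
struct named_file {
    FILE *fp;
    char name[NAMED_FILE_MAX];
};

/* Opens `name` and records it in *nf. Returns 0 on success, -1 on failure. */
int named_open(struct named_file *nf, const char *name, const char *mode)
{
    assert(nf != NULL && name != NULL && mode != NULL);
    if (strlen(name) >= sizeof nf->name)
        return -1;                      /* name would not fit */
    nf->fp = fopen(name, mode);
    if (nf->fp == NULL)
        return -1;
    strcpy(nf->name, name);
    return 0;
}

/* Closes the stream and clears the record. Returns what fclose() returned. */
int named_close(struct named_file *nf)
{
    int rc;

    assert(nf != NULL && nf->fp != NULL);
    rc = fclose(nf->fp);
    nf->fp = NULL;
    nf->name[0] = '\0';
    return rc;
}
```

The recorded string is only the name that was passed to fopen; it says nothing about whether the file still has that name later.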

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #9

jacob navia

Keith Thompson wrote:

jacob navia <ja***@jacob.remcomp.fr> writes:
Mark McIntyre wrote:
It doesn't work. Please try not to post broken code, even in links,
under the guise of "good implementations"
Obviously I have just written it, so it is not well tested. Can you
give any examples for your assertion? In which circumstances it
doesn't work?

It doesn't work on any system other than MS Windows. I can't say
whether it does work on MS Windows; that's why we have
Windows-specific newsgroups, where people can comment intelligently on
Windows-specific code.

Under unix it is possible to do that too, I am sure. Maybe
using the inode of the file, you can get into the name,
or similar.

There is no portable directory handling in C. This hole in the
language is quite incredible, and there are a lot of proposals
on how to fill it. The standards committee has blocked all those
proposals, some of them quite advanced.

I can't change this. I did not want to discuss system-specific
code in this group, so I gave just a pointer to the code. I
do not understand why all this fuss.

(I also took a brief look at the code, and I noticed that it returns
the strings "stdin", "stdout", and "stderr" for the corresponding
arguments. Those are C identifiers, not file names.)

Yes, but since there isn't any file associated with the terminal,
I thought it could work like that... Maybe I should return
CONIN$, CONOUT$ or similar. You have a point here.

There is no portable way to retrieve a file name from a FILE* value.
The closest you can come is to remember the name when you open the
file. If you don't have control over the code that opens the file,
you'll have to either give up on getting the name or resort to
non-portable code.

Yes, I do not dispute that. I proposed a system-specific
way of doing this under a popular platform, that's all.

jacob

Nov 14 '05 #10

Keith Thompson

jacob navia <ja***@jacob.remcomp.fr> writes:

Keith Thompson wrote:

[...]

Under unix it is possible to do that too, I am sure. Maybe
using the inode of the file, you can get into the name,
or similar.

I'm not convinced that it's possible in any reasonable fashion, but I
won't get into the details.

[snip]

There is no portable way to retrieve a file name from a FILE* value.
The closest you can come is to remember the name when you open the
file. If you don't have control over the code that opens the file,
you'll have to either give up on getting the name or resort to
non-portable code.

Yes, I do not dispute that. I proposed a system-specific
way of doing this under a popular platform, that's all.

But you didn't mention that it's system specific. Here's what
you wrote:

] Recently there was a discussion in this group about
] how to retrieve the file name given a FILE *.
]
] The question raised my curiosity, and after some
] research I have come up with a good implementation.
]
] The solution is in the tutorial for lcc-win32
] (http://www.cs.virginia.edu/~lcc-win32) page

The phrase "good implementation", in the context of comp.lang.c, can
usually be assumed to refer to portable code.

A quick pointer to system-specific code is probably ok. If you had
told us that it's specific to MS Windows, we could have avoided this
branch of the thread and saved a lot of time.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #11

S.Tobias

jacob navia <ja***@jacob.remcomp.fr> wrote:

Under unix it is possible to do that too, I am sure. Maybe
using the inode of the file, you can get into the name,
or similar.

FYI, in unix-like systems usually a file in an inode can have more
than one name, or none at all. Or to be more precise: files have
no names at all - file names are just identifiers of entries in
a directory, which point to inodes.
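
To make "more than one name, or none at all" concrete, here is a small POSIX (not ISO C) sketch that asks how many directory entries currently refer to the file behind a stream. The function name stream_link_count is invented for this example; fileno() and fstat() are the standard POSIX calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>   /* fstat(): POSIX, not ISO C */

/* Returns the number of directory entries (hard links) that refer to the
   file behind `fp`, or -1 if fstat() fails. A result of 0 means the file
   is still open but currently has no name at all. */
long stream_link_count(FILE *fp)
{
    struct stat st;

    assert(fp != NULL);
    if (fstat(fileno(fp), &st) != 0)
        return -1;
    return (long)st.st_nlink;
}
```

On a typical local file system a freshly created file reports 1, and the same open stream reports 0 once the file has been removed. Any count other than 1 means "the" file name is not a well-defined question.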

--
Stan Tobias
sed 's/[A-Z]//g' to email

Nov 14 '05 #12

Dik T. Winter

In article <41**********************@news.wanadoo.fr> jacob navia <ja***@jacob.remcomp.fr> writes:

Keith Thompson wrote:

....

It doesn't work on any system other than MS Windows. I can't say
whether it does work on MS Windows; that's why we have
Windows-specific newsgroups, where people can comment intelligently on
Windows-specific code.

Under unix it is possible to do that too, I am sure. Maybe
using the inode of the file, you can get into the name,
or similar.

You can be sure, you are also wrong. To get the file name, the file
*must* have a name in the first place. And a file that has been
removed after the file was opened by the program does not have a
name (but it exists until the program closes it). But even if the
file has a name it might be very difficult to get the filename.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/

Nov 14 '05 #13

Richard Bos

Keith Thompson <ks***@mib.org> wrote:

jacob navia <ja***@jacob.remcomp.fr> writes:
Obviously I have just written it, so it is not well tested. Can you
give any examples for your assertion? In which circumstances it
doesn't work?

It doesn't work on any system other than MS Windows. I can't say
whether it does work on MS Windows; that's why we have
Windows-specific newsgroups, where people can comment intelligently on
Windows-specific code.

(I also took a brief look at the code, and I noticed that it returns
the strings "stdin", "stdout", and "stderr" for the corresponding
arguments. Those are C identifiers, not file names.)

Which means, in fact, that this code is far from a "good
implementation", since it is obviously incorrect. I can make a file
called "stdout", and open it using fopen(). Pretending that stdout is
connected to this file called "stdout" is, plain and simply, wrong.

Richard

Nov 14 '05 #14

Jarno A Wuolijoki

On Wed, 13 Oct 2004, CBFalconer wrote:

If it's not portable, you're abusing this newsgroup by advertising
it. If it's portable, my congratulations!

I feel morally certain that M.Navias method is non-portable.
However, I can conceive of a portable method, if you are willing to
replace the fopen call with something else, say fnopen. Then it
only has to cache a copy of the file name in a hashtable, keyed by
the FILE* pointer returned, and retrieval is a simple O(1) lookup.

How do you hash pointers portably?

Nov 14 '05 #15

Kenneth Brody

Keith Thompson wrote:

jacob navia <ja***@jacob.remcomp.fr> writes:
Keith Thompson wrote:

[...]
Under unix it is possible to do that too, I am sure. Maybe
using the inode of the file, you can get into the name,
or similar.

I'm not convinced that it's possible in any reasonable fashion, but I
won't get into the details.

Remember, under *nix, there might not even be a filename associated with
the FILE*. It may be a pipe, or the file may have been removed since it
was opened.

Or, there may be multiple filenames associated with the inode.

[...]

--
+-------------------------+--------------------+-----------------------------+
| Kenneth J. Brody | www.hvcomputer.com | |
| kenbrody/at\spamcop.net | www.fptech.com | #include <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------------+

Nov 14 '05 #16

Dan Pop

In <41**********************@news.wanadoo.fr> jacob navia <ja***@jacob.remcomp.fr> writes:

Keith Thompson wrote:
jacob navia <ja***@jacob.remcomp.fr> writes:
Mark McIntyre wrote:

It doesn't work. Please try not to post broken code, even in links,
under the guise of "good implementations"

Obviously I have just written it, so it is not well tested. Can you
give any examples for your assertion? In which circumstances it
doesn't work?

It doesn't work on any system other than MS Windows. I can't say
whether it does work on MS Windows; that's why we have
Windows-specific newsgroups, where people can comment intelligently on
Windows-specific code.

Under unix it is possible to do that too, I am sure. Maybe
using the inode of the file, you can get into the name,
or similar.

How do you get the inode of the file from the corresponding FILE
structure? All the Unix implementation cares about is the file
descriptor returned by the open() call.

But even if you get somehow the inode number, you can have as many file
names connected to the same inode as you want under Unix. They are
called hard links:

fangorn:/bin 23> ls -l -i gunzip gzip zcat
240249 -rwxr-xr-x 3 root root 53076 2003-03-14 01:16 gunzip
240249 -rwxr-xr-x 3 root root 53076 2003-03-14 01:16 gzip
240249 -rwxr-xr-x 3 root root 53076 2003-03-14 01:16 zcat

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union

Nov 14 '05 #17

Dan Pop

In <41***************@yahoo.com> CBFalconer <cb********@yahoo.com> writes:

However, I can conceive of a portable method, if you are willing to
replace the fopen call with something else, say fnopen. Then it
only has to cache a copy of the file name in a hashtable, keyed by
the FILE* pointer returned, and retrieval is a simple O(1) lookup.

Your method doesn't cover the predefined streams, which are the
interesting cases. If I open a file myself, I already know its name,
but it is sometimes helpful to know where your stdin data comes from.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union

Nov 14 '05 #18

Richard Tobin

In article <ck**********@sunnews.cern.ch>, Dan Pop <Da*****@cern.ch> wrote:

How do you get the inode of the file from the corresponding FILE
structure? All the Unix implementation cares about is the file
descriptor returned by the open() call.

You call the POSIX stat() function on that file descriptor.

But even if you get somehow the inode number, you can have as many file
names connected to the same inode as you want under Unix.

Including, of course, zero. Not to mention the case where the file
descriptor is a pipe or socket or some other thing that doesn't have a
real inode.
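
A POSIX-only sketch of how a program might find out which of these cases it is dealing with before even thinking about names. The enum and the function stream_kind are hypothetical names; the S_IS* macros are the standard ones from <sys/stat.h>.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>   /* fstat() and the S_IS* macros: POSIX, not ISO C */

enum stream_kind {
    KIND_UNKNOWN,   /* fstat() failed */
    KIND_REGULAR,   /* ordinary disk file */
    KIND_PIPE,      /* pipe or FIFO */
    KIND_SOCKET,
    KIND_CHARDEV,   /* terminals, /dev/null, ... */
    KIND_OTHER
};

/* Classifies the object a stream's file descriptor is attached to. */
enum stream_kind stream_kind(FILE *fp)
{
    struct stat st;

    assert(fp != NULL);
    if (fstat(fileno(fp), &st) != 0)
        return KIND_UNKNOWN;
    if (S_ISREG(st.st_mode))
        return KIND_REGULAR;
    if (S_ISFIFO(st.st_mode))
        return KIND_PIPE;
    if (S_ISSOCK(st.st_mode))
        return KIND_SOCKET;
    if (S_ISCHR(st.st_mode))
        return KIND_CHARDEV;
    return KIND_OTHER;
}
```

Only KIND_REGULAR streams have any chance of a meaningful file name, and even then the link count may be zero or greater than one.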

-- Richard

Nov 14 '05 #19

Dan Pop

In <Pi*******************************@sbz-31.cs.Helsinki.FI> Jarno A Wuolijoki <jw******@cs.Helsinki.FI> writes:

On Wed, 13 Oct 2004, CBFalconer wrote:
> If it's not portable, you're abusing this newsgroup by advertising
> it. If it's portable, my congratulations!

I feel morally certain that M.Navias method is non-portable.
However, I can conceive of a portable method, if you are willing to
replace the fopen call with something else, say fnopen. Then it
only has to cache a copy of the file name in a hashtable, keyed by
the FILE* pointer returned, and retrieval is a simple O(1) lookup.

How do you hash pointers portably?

You alias them with an array of unsigned char of size sizeof(FILE *).
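
A sketch of that approach, assuming (as the rest of the thread debates) that the same FILE * value always has the same object representation. The function name and the choice of FNV-1a as the byte hash are mine, not part of the suggestion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Maps a FILE * to a bucket index by hashing the bytes of the pointer
   object itself with 32-bit FNV-1a. */
size_t file_ptr_bucket_bytes(FILE *fp, size_t nbuckets)
{
    /* Inspecting any object through unsigned char is always permitted. */
    const unsigned char *bytes = (const unsigned char *)&fp;
    unsigned long h = 2166136261UL;      /* FNV-1a 32-bit offset basis */
    size_t i;

    assert(nbuckets > 0);
    for (i = 0; i < sizeof fp; i++) {
        h ^= bytes[i];
        h *= 16777619UL;                 /* FNV-1a 32-bit prime */
        h &= 0xFFFFFFFFUL;               /* stay in 32 bits if long is wider */
    }
    return (size_t)(h % nbuckets);
}
```

The hash only picks the bucket; entries inside a bucket should still be compared with ==, so two equal pointers can at worst land in different buckets, never be confused with each other.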

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Currently looking for a job in the European Union

Nov 14 '05 #20

CBFalconer

Jarno A Wuolijoki wrote:

On Wed, 13 Oct 2004, CBFalconer wrote:

.... snip ...

I feel morally certain that M.Navias method is non-portable.
However, I can conceive of a portable method, if you are willing to
replace the fopen call with something else, say fnopen. Then it
only has to cache a copy of the file name in a hashtable, keyed by
the FILE* pointer returned, and retrieval is a simple O(1) lookup.

How do you hash pointers portably?

You cast them into integers (allowable, but not the inverse) and
use hashing methods for integers (see the references in my hashlib
package and its tests). The operation of the hash table only
requires equal/non-equal comparisons of pointers, which is always
allowable.
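
Roughly what the cast variant might look like. This sketch assumes a C99 implementation that provides the optional uintptr_t type; the result of the conversion is implementation-defined, and the function name and the bit-folding step are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>   /* uintptr_t: optional, even in C99 */
#include <stdio.h>

/* Maps a FILE * to a bucket index by converting it to an integer first. */
size_t file_ptr_bucket_cast(FILE *fp, size_t nbuckets)
{
    uintptr_t v = (uintptr_t)(void *)fp;

    assert(nbuckets > 0);
    v ^= v >> 4;      /* fold in higher bits; low bits are often zero
                         because of alignment */
    return (size_t)(v % nbuckets);
}
```

As with any hash table, the bucket index only narrows the search; the stored keys are still compared as pointers with ==.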

Please refrain from snipping attributes for material you quote.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 14 '05 #21

CBFalconer

Dan Pop wrote:

CBFalconer <cb********@yahoo.com> writes:
However, I can conceive of a portable method, if you are willing to
replace the fopen call with something else, say fnopen. Then it
only has to cache a copy of the file name in a hashtable, keyed by
the FILE* pointer returned, and retrieval is a simple O(1) lookup.

Your method doesn't cover the predefined streams, which are the
interesting cases. If I open a file myself, I already know its name,
but it is sometimes helpful to know where your stdin data comes from.

Actually it is worse than that, because any stream may be connected
to one (or more) i/o devices or disk files, which also have names
or designators in most systems. Then that device may be connected
to something else, which in turn has its own cascade of names.

So it is better to simply cut the Gordian knot, and say that the
names are unknown to the program.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 14 '05 #22

jacob navia

Under Linux, running

ls -l /proc/self/fd

prints a table with each integer file descriptor linked
to the file it refers to (/dev/pts/1 for a console,
or /home/jacob/getfilename.c for a regular file).

So, that's how you do it under linux.
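
From C the same information can be read with readlink(). A Linux-specific sketch, assuming /proc is mounted; stream_proc_path is a made-up name, while fileno() and readlink() are ordinary POSIX calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>   /* readlink(): POSIX, not ISO C */

/* Linux-specific: asks /proc which path the stream's descriptor refers to.
   Returns 0 and fills buf on success, -1 on failure or possible truncation. */
int stream_proc_path(FILE *fp, char *buf, size_t size)
{
    char link[64];
    ssize_t n;

    assert(fp != NULL && buf != NULL && size > 1);
    snprintf(link, sizeof link, "/proc/self/fd/%d", fileno(fp));
    n = readlink(link, buf, size - 1);   /* readlink() does not add '\0' */
    if (n < 0 || (size_t)n >= size - 1)
        return -1;                       /* error, or result may be cut off */
    buf[n] = '\0';
    return 0;
}
```

The string is whatever the kernel reports: for a pipe it looks like "pipe:[12345]" rather than a path, and for a file that was removed after opening it carries a " (deleted)" suffix, so the caveats raised earlier in the thread still apply.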

Maybe other OSes will follow.

Nov 14 '05 #23

Jonathan Adams

In article <41**********************@news.wanadoo.fr>,
jacob navia <ja***@jacob.remcomp.fr> wrote:

Under Linux, running

ls -l /proc/self/fd

prints a table with each integer file descriptor linked
to the file it refers to (/dev/pts/1 for a console,
or /home/jacob/getfilename.c for a regular file).

So, that's how you do it under linux.

Maybe other OSes will follow.

<OT>
Solaris 10 has similar functionality:

% ls -l /proc/self/path/[0-9]*
</OT>

Cheers,
- jonathan

Nov 14 '05 #24

Jarno A Wuolijoki

On Thu, 14 Oct 2004, CBFalconer wrote:

Jarno A Wuolijoki wrote:

How do you hash pointers portably?

You cast them into integers (allowable, but not the inverse) and
use hashing methods for integers (see the references in my hashlib
package and its tests). The operation of the hash table only
requires equal/non-equal comparisons of pointers, which is always
allowable.

Does the standard guarantee that pointers that compare equal convert to
integers that do so as well?
(think of x86 real mode, b000:8000 vs b800:0000)
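
The arithmetic behind that example, as a portable simulation rather than real far-pointer code. The struct and both function names are invented for the illustration; the segment * 16 + offset rule is how real-mode x86 forms an address (wrap-around at 1 MB is ignored here).

```c
#include <assert.h>

/* A real-mode x86 far pointer: 16-bit segment and 16-bit offset. */
struct far_ptr {
    unsigned int seg;   /* only the low 16 bits are meaningful */
    unsigned int off;   /* only the low 16 bits are meaningful */
};

/* The physical address the CPU forms: segment * 16 + offset. */
unsigned long far_linear(struct far_ptr p)
{
    assert(p.seg <= 0xFFFFu && p.off <= 0xFFFFu);
    return (unsigned long)p.seg * 16UL + p.off;
}

/* What a bit-for-bit conversion to a 32-bit integer would produce. */
unsigned long far_raw_bits(struct far_ptr p)
{
    assert(p.seg <= 0xFFFFu && p.off <= 0xFFFFu);
    return ((unsigned long)p.seg << 16) | p.off;
}
```

For b000:8000 and b800:0000, far_linear gives 0xB8000 in both cases, while far_raw_bits gives 0xB0008000 and 0xB8000000. A hash computed from the raw bits would therefore put two pointers to the same byte in different places.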

Please refrain from snipping attributes for material you quote.

Oops. I accidentally followed my own queer 'netiquette' instead of ng's.

(That is, I tend to think that only first level attributions are really
relevant in the context of my reply. I learned this ugly habit in
BBS's where it was typical to nest much farther than here)

Nov 14 '05 #25

Keith Thompson

Da*****@cern.ch (Dan Pop) writes:

In <Pi*******************************@sbz-31.cs.Helsinki.FI> Jarno A
Wuolijoki <jw******@cs.Helsinki.FI> writes:
On Wed, 13 Oct 2004, CBFalconer wrote:
> If it's not portable, you're abusing this newsgroup by advertising
> it. If it's portable, my congratulations!

I feel morally certain that M.Navias method is non-portable.
However, I can conceive of a portable method, if you are willing to
replace the fopen call with something else, say fnopen. Then it
only has to cache a copy of the file name in a hashtable, keyed by
the FILE* pointer returned, and retrieval is a simple O(1) lookup.

How do you hash pointers portably?

You alias them with an array of unsigned char of size sizeof(FILE *).

That was my thought as well, but I can imagine an implementation in
which two FILE* values have the same value (as pointers) but different
representations (as arrays of unsigned char). Realistically, an
implementation is unlikely to generate two such distinct
representations for the same value, but I think a conforming
implementation could do so.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #26

Keith Thompson

ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:

In article <ck**********@sunnews.cern.ch>, Dan Pop <Da*****@cern.ch> wrote:
How do you get the inode of the file from the corresponding FILE
structure? All the Unix implementation cares about is the file
descriptor returned by the open() call.

You call the POSIX stat() function on that file descriptor.

<OT><QUIBBLE>fstat()</QUIBBLE></OT>

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #27

CBFalconer

Jarno A Wuolijoki wrote:

On Thu, 14 Oct 2004, CBFalconer wrote:
Jarno A Wuolijoki wrote:

How do you hash pointers portably?

You cast them into integers (allowable, but not the inverse) and
use hashing methods for integers (see the references in my hashlib
package and its tests). The operation of the hash table only
requires equal/non-equal comparisons of pointers, which is always
allowable.

Does the standard guarantee that pointers that compare equal convert
to integers that do so as well?
(think of x86 real mode, b000:8000 vs b800:0000)

It doesn't matter. You are working only with the value that was
returned from fopen.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 14 '05 #28

Keith Thompson

Da*****@cern.ch (Dan Pop) writes:

In <ln************@nuthaus.mib.org> Keith Thompson <ks***@mib.org> writes:
Da*****@cern.ch (Dan Pop) writes:
In <Pi*******************************@sbz-31.cs.Helsinki.FI> Jarno A
Wuolijoki <jw******@cs.Helsinki.FI> writes:
On Wed, 13 Oct 2004, CBFalconer wrote:

> > If it's not portable, you're abusing this newsgroup by advertising
> > it. If it's portable, my congratulations!
>
> I feel morally certain that M.Navias method is non-portable.
> However, I can conceive of a portable method, if you are willing to
> replace the fopen call with something else, say fnopen. Then it
> only has to cache a copy of the file name in a hashtable, keyed by
> the FILE* pointer returned, and retrieval is a simple O(1) lookup.

How do you hash pointers portably?

You alias them with an array of unsigned char of size sizeof(FILE *).

That was my thought as well, but I can imagine an implementation in
which two FILE* values have the same value (as pointers) but different
representations (as arrays of unsigned char). Realistically, an
implementation is unlikely to generate two such distinct
representations for the same value, but I think a conforming
implementation could do so.

It doesn't matter: you get only one representation from fopen() and you
keep using it. There is no way for that representation to metamorphose
into the other.

You may be right, but I'm still not quite sure of that. Could a
pointer assignment change its representation? Similarly, can a
floating-point assignment change the representation (without changing
the represented value)? I *think* it can; for example, loading a
floating-point value into a register might automatically normalize it.
The same thing could happen with an address register. As long as the
before and after values compare equal, I don't see a problem.

Realistically, though, if automatic pointer normalization happens so
easily, it's unlikely that a non-normalized pointer could survive long
enough to be returned from fopen().

If my guess is right, hashing pointers by converting them to arrays of
unsigned char will probably work reliably on every system other than
the DS9000.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #29

CBFalconer

Dan Pop wrote:

CBFalconer <cb********@yahoo.com> writes:
Jarno A Wuolijoki wrote:
On Wed, 13 Oct 2004, CBFalconer wrote:

... snip ...

I feel morally certain that M.Navias method is non-portable.
However, I can conceive of a portable method, if you are willing to
replace the fopen call with something else, say fnopen. Then it
only has to cache a copy of the file name in a hashtable, keyed by
the FILE* pointer returned, and retrieval is a simple O(1) lookup.

How do you hash pointers portably?

You cast them into integers (allowable, but not the inverse)

Wrong. The cast is allowed in both directions, but the results are
not guaranteed to be meaningful in any direction.

For maximal portability, you have to use the unsigned char array
approach. Even on C99, [u]intptr_t is an optional typedef.

I have my doubts. Consider that the representation of a pointer
may contain trap bits, which are accessed by the unsigned char
attack. There is no guarantee that those trap bits do not change
with time and/or actual storage location (of the pointer). The
cast technique eliminates those trap bits. If it doesn't convert
back to the pointer, so what, it is just one phase of the hashing
mechanism.

So I claim that the cast makes the pointer to hash function single
valued, while the unsigned char approach does not. I would be hard
put to find a system where the unsigned char method would not work,
but it is not guaranteed.

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 14 '05 #30

Keith Thompson

CBFalconer <cb********@yahoo.com> writes:

Dan Pop wrote:

[...]

For maximal portability, you have to use the unsigned char array
approach. Even on C99, [u]intptr_t is an optional typedef.

I have my doubts. Consider that the representation of a pointer
may contain trap bits, which are accessed by the unsigned char
attack. There is no guarantee that those trap bits do not change
with time and/or actual storage location (of the pointer). The
cast technique eliminates those trap bits. If it doesn't convert
back to the pointer, so what, it is just one phase of the hashing
mechanism.

Did you mean padding bits rather than trap bits? A type can have trap
*representations*, but a valid pointer value (of the kind that we're
interested in hashing) won't be one of them.

I don't believe that the cast necessarily eliminates padding bits.

Assume the following:

void *p1 = foo();
void *p2 = bar();
uintptr_t u1 = uintptr_t(p1);
uintptr_t u2 = uintptr_t(p1);

Assume that p1 == p2 (they point to the same address), but that they
have different internal representations (perhaps one is normalized and
the other is not).

We know from C99 7.18.1.4 that (void*)u1 == (void*)u2, but we don't
know that u1 == u2. For example, if the cast simply copies the bits,
the values of u1 and u2 would reflect the difference in
representations of the two pointer values; converting back to void*
yields two pointers that have different representations, but compare
equal to each other.

The cast *might* normalize the representation, but it doesn't have to.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #31

Chris Torek

[regarding hashing pointers by first converting them to uintptr_t]

In article <news:ln************@nuthaus.mib.org>
Keith Thompson <ks***@mib.org> wrote:

Assume the following:

void *p1 = foo();
void *p2 = bar();
uintptr_t u1 = uintptr_t(p1);
uintptr_t u2 = uintptr_t(p1);

Minor nit: this is (old) C++ syntax; you mean:

uintptr_t u1 = (uintptr_t)p1;

and so on.

Assume that p1 == p2 (they point to the same address), but that they
have different internal representations (perhaps one is normalized and
the other is not).

We know from C99 7.18.1.4 that (void*)u1 == (void*)u2, but we don't
know that u1 == u2. For example, if the cast simply copies the bits,
the values of u1 and u2 would reflect the difference in
representations of the two pointer values; converting back to void*
yields two pointers that have different representations, but compare
equal to each other.

The cast *might* normalize the representation, but it doesn't have to.

Indeed, consider the historical implementations that are the very
reason the C standards are full of this kind of weirdness with
pointer arithmetic. In other words, think back to the 1980s and
C compilers for the IBM PC that ran under MS-DOS with its various
"extender" schemes to access more than 64K and 640K of memory.

One of the models under which code ran had 20-bit pointers, so that
uintptr_t would have to be defined as "unsigned long" ("int" being
only 16 bits on these compilers). If functions foo() and bar()
returned "un-normalized" pointers, and you assigned these to u1 and
u2 via casts, you get -- unnormalized integers. The "normalization"
operation was done by the "==" operators (only). Relational
comparisons ("<" and ">", and their "<=" and ">=" variants) compared
only offsets. This led to the peculiar case that:

printf("p1 is %sequal to p2\n", p1 == p2 ? "" : "not ");
printf("p1 is %sless than p2\n", p1 < p2 ? "" : "not ");

would sometimes print:

p1 is equal to p2
p1 is less than p2

In other words, p1 < p2 && p1 == p2, both at the same time.

(The only things that behave this way on modern CPUs are floating
point numbers. :-) If x is set to NaN, a surprising number of
comparisons all produce "false" as their result.)
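
The NaN remark can be checked with a few lines of C99. This sketch assumes IEEE 754 floating point and a compiler that is not run in a "fast math" mode; the function names are invented for the example.

```c
#include <assert.h>
#include <math.h>   /* NAN: C99, defined where quiet NaNs are supported */

/* Counts how many of the six comparisons between x and y yield true. */
int count_true_comparisons(double x, double y)
{
    return (x < y) + (x <= y) + (x > y) + (x >= y) + (x == y) + (x != y);
}

/* Self-check of the behaviour described above; aborts if the platform's
   floating point does not follow IEEE 754 here. */
void check_nan_comparisons(void)
{
    double x = NAN;

    assert(count_true_comparisons(x, x) == 1);       /* only != holds */
    assert(count_true_comparisons(1.0, 2.0) == 3);   /* <, <=, != */
    assert(count_true_comparisons(1.0, 1.0) == 3);   /* <=, >=, == */
}
```

For ordinary numbers exactly three of the six comparisons hold; with a NaN operand only != does.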
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Nov 14 '05 #32

Keith Thompson

Chris Torek <no****@torek.net> writes:

[regarding hashing pointers by first converting them to uintptr_t]

In article <news:ln************@nuthaus.mib.org>
Keith Thompson <ks***@mib.org> wrote:
Assume the following:

void *p1 = foo();
void *p2 = bar();
uintptr_t u1 = uintptr_t(p1);
uintptr_t u2 = uintptr_t(p1);

Minor nit: this is (old) C++ syntax; you mean:

uintptr_t u1 = (uintptr_t)p1;

and so on.

D'oh! (It wasn't (deliberately) C++ syntax, it was just a mistake;
I'm not going to admit to the thought process that led to it.) And I
used the wrong variable on the last line. What I meant, of course,
was:

void *p1 = foo();
void *p2 = bar();
uintptr_t u1 = (uintptr_t)p1;
uintptr_t u2 = (uintptr_t)p2;

[snip]

Thanks for confirming (somewhat to my surprise) that there are
real-world examples of what I was talking about.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Nov 14 '05 #33

Mark McIntyre

On Thu, 14 Oct 2004 09:12:59 +0200, in comp.lang.c , jacob navia
<ja***@jacob.remcomp.fr> wrote:

Mark McIntyre wrote:

It doesn't work. Please try not to post broken code, even in links, under
the guise of "good implementations"

Obviously I have just written it, so it is not well tested. Can you give
any examples for your assertion? In which circumstances it doesn't work?

When I run it through my C interpreter on my Palmpilot, or compile it on my
Vax 8800, and on my IBM S/360. And it also fails on my spare PC, on my Mac,
on my Atari, on my Symbian phone, etc etc....

But I think you probably knew that !
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>

Nov 14 '05 #34
