
strong/weak typing and pointers

P: n/a

Is it correct to say that strong/weak typing does not make a difference
if one does not use any pointers (or address-taking operator)?

More concretely, I am thinking particularly of Python vs C++.
So, are there any examples (without pointers, references, or address-taking),
which would have a different result in Python and in C++?

I would appreciate all insights or pointers to literature.

TIA,
gabriel.

--
/-------------------------------------------------------------------------\
| We act as though comfort and luxury |
| were the chief requirements of life, |
| when all that we need to make us happy |
| is something to be enthusiastic about. (Einstein) |
+-------------------------------------------------------------------------+
| za**@cs.uni-bonn.de __@/' www.gabrielzachmann.org |
\-------------------------------------------------------------------------/
Jul 18 '05 #1
94 Replies


P: n/a
> Is it correct to say that strong/weak typing does not make a difference
> if one does not use any pointers (or address-taking operator)?

It seems that you are confusing strong/weak typing with static/dynamic typing - a
completely different beast.

Python is in fact strongly typed - in contrast to PHP or Perl or even C,
this won't work:

    a = "1" + 2

as "1" is a string and 2 an integer. And even though C is statically typed,
it won't complain - you just end up with an unexpected result.
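
A minimal sketch of the Python side of this claim, assuming Python 3 syntax (the thread predates it). The helper name try_add is made up for illustration and is not part of any library:

```python
def try_add(left, right):
    """Return left + right, or the name of the exception Python raises."""
    try:
        return left + right
    except TypeError as exc:
        return type(exc).__name__

mixed = try_add("1", 2)          # str + int is refused, there is no silent coercion
explicit = try_add(int("1"), 2)  # the conversion has to be spelled out
print(mixed, explicit)
```

The point of the sketch is only that the mixed-type addition fails loudly instead of producing a surprising value.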

And pointers are not evil in themselves - they are a necessity to create
recursive structures. But deliberately casting pointers can be very harmful
- a reason why it's forbidden in languages like Java and AFAIK Ada.

> More concretely, I am thinking particularly of Python vs C++.
> So, are there any examples (without pointers, references, or
> address-taking), which would have a different result in Python and in C++?


I have difficulty understanding what you want here. Please elaborate.

--
Regards,

Diez B. Roggisch
Jul 18 '05 #2

P: n/a
JCM
Gabriel Zachmann <za**@cs.uni-bonn.de> wrote:
> Is it correct to say that strong/weak typing does not make a difference
> if one does not use any pointers (or address-taking operator)?

You'll find a lack of consensus here on what's meant by "strong/weak
typing". In Python there's no way to re-interpret the bits of a value
as if they were a different type. For example, code like this is
impossible in Python:

    float x = 2.5;
    printf("%d\n", *(int*)&x);

> More concretely, I am thinking particularly of Python vs C++.
> So, are there any examples (without pointers, references, or address-taking),
> which would have a different result in Python and in C++?


If I understand your question, I believe not; because Python doesn't
provide the low-level operators that would be necessary for it.
Jul 18 '05 #3

P: n/a
On Thu, 28 Oct 2004 18:34:12 +0200, "Diez B. Roggisch"
<de*********@web.de> wrote:
>> Is it correct to say that strong/weak typing does not make a difference
>> if one does not use any pointers (or address-taking operator)?
>
> It seems that you mistake strong/weak typing with static/dynamic typing - a
> completely different beast.
>
> Python is in fact strongly typed - in opposition to php or perl or even C,
> this won't work:
>
> a = "1" + 2
>
> as "1" is a string and 2 an integer. And even though C is statically typed,
> it won't complain - you just end up with an unexpected result.


You didn't mention C++. Try this ...

std::string s = "Wow";
s += 3.141592654; // Perfectly valid
s = 3.141592654; // Also valid

Andrea
Jul 18 '05 #4

P: n/a
On 2004-10-28, Diez B. Roggisch <de*********@web.de> wrote:
> It seems that you mistake strong/weak typing with
> static/dynamic typing - a completely different beast.
>
> Python is in fact strongly typed - in opposition to php or perl or even C,
> this won't work:
>
> a = "1" + 2
>
> as "1" is a string and 2 an integer.

"1" is a pointer to a char.

> And even though C is statically typed, it won't complain

That's because <pointer> + <integer> has a well-defined meaning
in C -- just like <float> + <integer> does in Python (and in C).

> - you just end up with an unexpected result.

Only people who don't know how C pointer arithmetic works will
get unexpected results. [That's probably a shockingly high
percentage of C programmers.]

--
Grant Edwards grante Yow! I brought my BOWLING
at BALL -- and some DRUGS!!
visi.com
Jul 18 '05 #5

P: n/a
JCM <jo******************@myway.com> wrote:
Gabriel Zachmann <za**@cs.uni-bonn.de> wrote:
>> Is it correct to say that strong/weak typing does not make a difference
>> if one does not use any pointers (or address-taking operator)?
>
> You'll find a lack of consensus here on what's meant by "strong/weak
> typing". In Python there's no way to re-interpret the bits of a value
> as if they were a different type. For example, code like this is
> impossible in Python:
>
>     float x = 2.5;
>     printf("%d\n", *(int*)&x);


True, but module struct lets you get the same effect, though the 4 bytes
get copied, not 'reinterpreted in place'.
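
A small sketch of what that looks like with the standard struct module. The little-endian 4-byte layout ("<f" and "<i") is an assumption chosen to match a typical 32-bit float/int pair; the variable names are only illustrative:

```python
import struct

x = 2.5
# Pack the value as a 4-byte little-endian IEEE 754 float, then read the
# same 4 bytes back as a signed 32-bit integer. The bytes are copied into
# a new bytes object; nothing is reinterpreted in place.
raw = struct.pack("<f", x)
as_int = struct.unpack("<i", raw)[0]
print(hex(as_int))
```

Unlike the C cast, the reinterpretation is an explicit, named operation on a byte string rather than on the float object itself.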

>> More concretely, I am thinking particularly of Python vs C++.
>> So, are there any examples (without pointers, references, or address-taking),
>> which would have a different result in Python and in C++?
>
> If I understand your question, I believe not; because Python doesn't
> provide the low-level operators that would be necessary for it.


Well... what about something like:

    std::vector<int> a, b;   // std::vector, since std::list has no operator[]

    ....

    a[2] = 45;
    b = a;
    b[2] = 23;

In C++, a[2] is still 45, because 'b = a;' COPIED the whole container over.

A similar case in Python would make no implicit copies, just give an
additional name 'b' to the same object which 'a' names, so assigning to
b[2] would also change a[2]. Nothing to do with dynamic vs static
typing, of course, because e.g. Java would work like Python here.
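
A short Python sketch of the aliasing described above. The list contents and the extra name c are assumptions added for illustration:

```python
a = [0, 0, 45]
b = a            # no copy: 'b' is a second name for the same list object
b[2] = 23
aliased = a[2]   # the change is visible through 'a' as well

c = list(a)      # an explicit shallow copy, closer to what C++ assignment does
c[2] = 99
after_copy = a[2]
print(aliased, after_copy, a is b, a is c)
```

To get the C++ behaviour in Python, the copy has to be requested explicitly, as with list(a) here.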

I've found this one tidbit to be the single biggest stumbling block for
experienced C or C++ programmers learning Java or Python. "without
references" isn't really true, because (in Python and Java) a and b
_are_ 'references' (aka names) to the same object -- but then, neither
in Java nor Python can you say that a name _isn't_ ``a reference''...
names always reference objects... ((Java makes exceptions to this rule
for some lowlevel types such as ints, Python doesn't)).

Templates may be another case in which C++ might act one way, and Java
and Python the other way, and may be more relevant to type issues.

E.g.,

    template<typename T>
    T foo(const T& bar)
    {
        static T baz;
        T temp = baz;
        baz = bar;
        return temp;
    }

now, if you make a series of calls such as foo(1), foo(1.2), foo(2),
foo(3.4), you should get as results 0, 0.0, 1, and 1.2 -- there are two
'foo's, one instantiated for T being int, another one for T being
double, so the 'delay register' baz also exists in two incarnations.

In the Python rough equivalent:

    def foo(bar, _baz=[None]):
        temp = _baz.pop()
        _baz.append(bar)
        return temp

(and the Java equivalent, too, with everything declared as Object to be
"generic"), the same calls would give None, 1, 1.2, 2 -- there is a
single 'incarnation' of foo, a single 'delay register' _baz. (Not sure
which way Java 1.5's generics go wrt statics; I'd expect the C++ way).
Not sure I've gotten the gist of what the OP was asking about, though.

Alex
Jul 18 '05 #6

P: n/a
On Thu, 28 Oct 2004 18:34:12 +0200, Diez B. Roggisch <de*********@web.de> wrote:
....
> Python is in fact strongly typed - in opposition to php or perl or even C,
> this won't work:
>
> a = "1" + 2
>
> as "1" is a string and 2 an integer. And even though C is statically typed,
> it won't complain - you just end up with an unexpected result.

[slightly offtopic defense of C and C++]

That's true only if a is a 'char *' of course (and if you didn't expect
this unexpected result ;-).

In C++ 'char *' would have been invalid, but not 'const char *' or
(and this is worse) 'std::string'.

> And pointers are not evil in themselves - they are a necessity to create
> recursive structures.

I'd say they are necessary, period. But note that I count what Java and
Python call "references" as pointers ...

> But deliberately casting pointers can be very harmful
> - a reason why it's forbidden in languages like java and AFAIK ada.

Yes; lots of casts in C code (or worse, in C++ code) are a very, very bad
sign. Note though, that in the absence of casts, C and in particular C++ are
pretty strongly typed for pointers. Strongly enough to keep me happy, at
least.

>> More concretely, I am thinking particularly of Python vs C++.
>> So, are there any examples (without pointers, references, or
>> address-taking), which would have a different result in Python and in C++?
>
> I have difficulties to understand what you want here. Please elaborate.


I think he means the static/dynamic typing, and if it makes a difference in
a simple C and a simple Python program, if we pretend that all names are
just "variables". Hard to come up with a meaningful answer, but how about:

    a = 2                           const int a = 2;
    if something_rare_happens:      if(something_rare_happens) {
        return b                        return b;
    a = 'hugo'                      }
                                    std::string a("hugo");
    bar(a)                          bar(a);

In the Python program, we might clobber 'a' by accidentally reusing its name
for something of a different type. C++ is stricter about this (doesn't
allow the construct above, in fact) and you can look at the code
(statically) to see which names are in scope and which are not.

In Python, you can forget to give 'b' a value, and not notice until that
code executes. You can in C++ too, and the runtime effects will be more
subtle but worse. The compiler is more likely to catch pure typos, though.
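
A runnable sketch of the Python half of that comparison. The function name process and its flag argument are made up for illustration; 'b' is deliberately never assigned anywhere:

```python
def process(something_rare_happens):
    a = 2
    if something_rare_happens:
        return b        # 'b' was never assigned; only this branch notices
    a = "hugo"          # rebinding 'a' to a str is legal in Python
    return type(a).__name__

common = process(False)  # the usual path runs without complaint
try:
    process(True)
    rare = "no error"
except NameError:
    rare = "NameError"
print(common, rare)
```

The mistake stays hidden until the rare branch actually executes, which is the point being made about run-time versus compile-time detection.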

In Python, it is often hard to look at a function such as 'bar' and say you
know it is always called with an integer argument, or a string, or a 'Foo'
object. It's not even enough to look at all places where 'bar' is called,
because the type of b may depend on the phase of the moon or other dynamic
things. In C++ the compiler makes the guarantees, unless someone has
willfully bypassed the type system.

Is all this caused by the static/dynamic typing difference? No, but it
certainly has to do with it. Both languages have made a decision here, and
that of course works together with the rest of the language design. Python
doesn't /have/ to declare variables and parameters to give them a type, so
Guido said you don't have to, and let functions take all kinds of flexible
arguments. C++ had to have declarations/definitions, so Bjarne used them to
add the 'const' keyword, to give values a scope, and to make it well-defined
when objects are destroyed. And so on.

For what it's worth, I think both kinds of typing are interesting and useful
tools. Neither of them is obsolete or inferior; neither of them will
disappear in the next ten years.

/Jorgen

--
// Jorgen Grahn <jgrahn@ Ph'nglui mglw'nafh Cthulhu
\X/ algonet.se> R'lyeh wgah'nagl fhtagn!
Jul 18 '05 #7

P: n/a
In article <sl*****************@fuji.informatik.uni-bonn.de>,
Gabriel Zachmann <za**@cs.uni-bonn.de> wrote:

> Is it correct to say that strong/weak typing does not make a difference
> if one does not use any pointers (or address-taking operator)?


One effect of weak typing is to put more reliance on
operators. In Perl, for instance the string operator `lt`
does a string compare to find that "10" is less than 2
(lexically) and the numeric operator `<` finds that "10" is
not less than 2 (numerically). Nothing to do with pointers
at all.
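
For contrast, a sketch of how the same two comparisons look in Python, assuming Python 3 (where comparing a str with an int is an error). Here the operand types, not the operator, select the kind of comparison:

```python
lexical = "10" < "2"        # strings compare character by character
numeric = int("10") < 2     # an explicit conversion gives the numeric answer

try:
    "10" < 2                # Python 3 refuses to pick an interpretation
    mixed = "allowed"
except TypeError:
    mixed = "TypeError"
print(lexical, numeric, mixed)
```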

Regards. Mel.
Jul 18 '05 #8

P: n/a
Gabriel Zachmann wrote:
> Is it correct to say that strong/weak typing does not make a
> difference if one does not use any pointers (or address-taking
> operator)?
>
> More concretely, I am thinking particularly of Python vs C++.
> So, are there any examples (without pointers, references, or
> address-taking), which would have a different result in Python and in
> C++?


Here's a trivial example that is almost identical in Python and C/C++ but
gives totally different results. In a weakly typed language such as C or
C++:

    #include <stdio.h>

    int main(int argc, char**argv)
    {
        float f = 3;
        /* deliberately wrong: %d with a floating-point argument is undefined behaviour */
        printf("value is %d", f);
    }

I get the output (you may get different results):

value is 0

In a fairly strongly typed language such as Python:

    f = 3.0
    print("value is %d" % f)

    value is 3

In a really strongly typed language I would expect an exception to
be thrown.
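
A sketch of where Python draws that line, assuming Python 3. A float is converted for %d, but a str is not parsed as a number:

```python
f = 3.0
formatted = "value is %d" % f       # the float is converted, not reinterpreted

try:
    "value is %d" % "3"             # a str is not silently turned into a number
    from_str = "allowed"
except TypeError:
    from_str = "TypeError"
print(formatted, from_str)
```

So Python sits between the two extremes described above: it converts between numeric types for %d, and raises an exception for anything that is not a number.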
Jul 18 '05 #9

P: n/a
Diez B. Roggisch <de*********@web.de> wrote:
> And pointers are not evil in themselves - they are a necessity to create
> recursive structures. But deliberately casting pointers can be very harmful
> - a reason why it's forbidden in languages like java and AFAIK ada.


I agree. A language worth mentioning in this context might
be Cyclone. It's derived from C (and still has very much in
common with it, so it's easy to port C programs to Cyclone).
The difference is that "safe" features have been added to the
language. For example, you can't do arbitrary type casts on
pointers anymore, and you can't access strings (or other
allocated memory) beyond their end. The ultimate goal of
Cyclone is to make it impossible for programs to crash or
have security holes caused by buffer overflows or similar.

Furthermore, Cyclone provides interesting features, such as
tagged unions, parametric polymorphism, pattern matching,
exceptions, even a somewhat limited implementation of type
inference.

http://www.research.att.com/projects/cyclone/

Best regards
Oliver

--
Oliver Fromme, Konrad-Celtis-Str. 72, 81369 Munich, Germany

``All that we see or seem is just a dream within a dream.''
(E. A. Poe)
Jul 18 '05 #10

P: n/a
> It seems that you mistake strong/weak typing with static/dynamic typing - a
> completely different beast.

sorry, i don't think so.

> Python is in fact strongly typed - in opposition to php or perl or even C,
> this won't work:
>
> a = "1" + 2

haven't tried that yet, but i guess it would at least evoke a warning in
ANSI C++.

> as "1" is a string and 2 an integer. And even though C is statically typed,
> it won't complain - you just end up with an unexpected result.

In C++, "1" is a 'const char[2]', which decays to a 'char const *'.

> And pointers are not evil in themselves - they are a necessity to create
> recursive structures.

i didn't say that.
In fact, they are everywhere, even in Python and Java, except that you
don't get to see them.

> But deliberately casting pointers can be very harmful
> - a reason why it's forbidden in languages like java and AFAIK ada.

i agree.

>> More concretely, I am thinking particularly of Python vs C++.
>> So, are there any examples (without pointers, references, or
>> address-taking), which would have a different result in Python and in C++?
>
> I have difficulties to understand what you want here. Please elaborate.


i am just trying to come up with the best possible definition of "weak and
strong typing" ( "best" in the sense of completeness and objectiveness).
I've read up quite a bit about strong/weak typing, and static and dynamic
typing, and it seems to me that, while static/dynamic typing is a pretty
well-defined concept, the definition of strong/weak typing is not so
clear-cut.

Cheers,
Gab.

--
/-------------------------------------------------------------------------\
| There are works which wait, |
| and which one does not understand for a long time; [...] |
| for the question often arrives a terribly long time after the answer. |
| (Oscar Wilde) |
+-------------------------------------------------------------------------+
| za**@cs.uni-bonn.de __@/' www.gabrielzachmann.org |
\-------------------------------------------------------------------------/
Jul 18 '05 #11

P: n/a
> You didn't mention C++. Try this ...

std::string s = "Wow";
s += 3.141592654; // Perfectly valid
s = 3.141592654; // Also valid


ah, good example.

So, would it be valid to say:
the more coercion (or automatic conversion) rules a language has, the
weaker the typing?

Best regards,
Gabriel.

Jul 18 '05 #12

P: n/a
> printf("value is %d", f);

This seems a very good example to me.

Note that this is also an example showing that C++ does contain a little
bit of dynamic typing, too, doesn't it?

Cheers,
gab.

Jul 18 '05 #13

P: n/a
Gabriel Zachmann wrote:
>> printf("value is %d", f);
>
> This seems a very good example to me.
>
> Note that this is also an example showing that C++ does contain a little
> bit of dynamic typing, too, isn't it?

Where do you get that idea from? Modern compilers are aware of printf and
have special type-checking rules built in (GCC's -Wformat warnings, for
example) that verify that you pass the right number and types of arguments
for the format string. And that's totally static, as it is done at compile time!

--
Regards,

Diez B. Roggisch
Jul 18 '05 #14

P: n/a
> Note that this is also an example showing that C++ does contain a little
> bit of dynamic typing, too, isn't it?

On a related note: C++ _can_ have some dynamic type information - when not
disabled with -fno-rtti (gcc) you get "run-time type information" (RTTI). That
allows for (guess why they're called that way...) dynamic casts.

--
Regards,

Diez B. Roggisch
Jul 18 '05 #15

P: n/a
>> It seems that you mistake strong/weak typing with static/dynamic typing
>> - a
>
> sorry, i don't think so.

I still do - your other post regarding dynamic aspects of printf makes me
think so.

Dynamic typing means that at runtime, type information is associated with
data and used e.g. to determine which implementation of a certain operation
is to be used. Static typing means that type information is used at
compile time. Python is purely dynamic, Java has both aspects and C++ is
mainly static - albeit RTTI (and virtual methods, in a way) are dynamic. So
one could say it's mixed.
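
A sketch of what "type information is associated with data at runtime" means in Python. The classes Dog and Robot and the helper describe are hypothetical names used only for this illustration:

```python
class Dog:
    def speak(self):
        return "woof"

class Robot:
    def speak(self):
        return "beep"

def describe(value):
    # The type travels with the value and is inspected at run time.
    return type(value).__name__

things = [1, "1", 1.0, Dog(), Robot()]
names = [describe(t) for t in things]
# Method lookup also happens at run time, based on each object's class.
sounds = [t.speak() for t in things if hasattr(t, "speak")]
print(names, sounds)
```

No declaration anywhere says what 'things' may contain; each operation consults the type carried by the object it is applied to.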

Strong/weak typing is about how much you care about types at all - in PHP, it's
perfectly legal to add strings to numbers - the string simply gets
converted to a number beforehand, that conversion yielding 0 when nothing
useful and number-like can be extracted from the string. So typing
is weak, as it doesn't constrain the possible operations on variables with
certain values.

> haven't tried that yet, but i guess it would at least evoke a warning in
> ANSI C++.

No, as others pointed out, it's a legal pointer operation.

> i am just trying to come up with the best possible definition of "weak and
> strong typing" ( "best" in the sense of completeness and objectiveness).
> I've read up quite a bit about strong/weak typing, and static and dynamic
> typing, and it seems to me that, while static/dynamic typing is a pretty
> well-defined concept, the definition of strong/weak typing is not so
> clear-cut.

That might be - but you put Python and C++ in the same sentence as weak and
strong typing - and that looked as if you think of Python being weakly
typed, which it is not.
--
Regards,

Diez B. Roggisch
Jul 18 '05 #16

P: n/a
> So you would say that in PHP the + operator cannot be viewed as overloaded
> with the signature (string, int) -> string? I don't know PHP, so could you
> maybe give an example of why you think this is so?

If you do this:

    "a" + 10

you end up with 10 - if the string doesn't contain something interpretable
as a number, the coercion results in null.

Sure, that behaviour can be seen as overloaded, too. But overloaded
functions usually make some sort of sense, whereas this technique masks
errors by _always_ trying to interpret values as useful to every operation.

>> The question remains if permanent coercions as php (and afaik perl) do
>> can also be considered weak typing, as you won't end up with an error for
>> more or less anything you do.
>
> Sorry, I don't know what "permanent coercions" means. Could you explain?
> "Permanent coercions" makes me expect something like:

I'm no native speaker, so I may have confused the meaning of permanent. In
German, permanent not only means that things are enduring, but also that
things are done on a regular basis: "He permanently changed his cloth."

What I wanted to say is that PHP uses coercions or overloaded operators for
nearly everything. Sometimes this is totally silent, sometimes nothing
happens and a warning is issued - which might be configured to be treated
as an actual error, I'm not sure about that.

So while there might in fact be type information present, it's rarely used -
which I consider as being weak.

Does that make more sense?
--
Regards,

Diez B. Roggisch
Jul 18 '05 #17

P: n/a
> wall of abstraction". A Smalltalk programmer would say that
Python is more weakly typed than Smalltalk for user-defined types.


which brings me to another related question.

I understand strong/weak typing is more like a continuum --
is there a kind of programming language hierarchy sorting various
languages according to "type-strongness" that is generally agreed upon?

I'd be interested in a hierarchy containing some of the following
languages: ANSI-C++, C, Perl, Python, Pascal, ML.
(because these happen to be some of the languages i know a bit ;-) )

Would the following be justifiable?

Perl < C < C++ < Pascal < Python < ML ?

Or does anyone have a pointer?

Best regards,
gab.
Jul 18 '05 #18

P: n/a
Gabriel Zachmann <za**@cs.uni-bonn.de> wrote:
> I've read up quite a bit about strong/weak typing, and static and dynamic
> typing, and it seems to me that, while static/dynamic typing is a pretty
> well-defined concept, the definition of strong/weak typing is not so
> clear-cut.


"four legs good two legs bad". In the end, it's what works best...
Alex
Jul 18 '05 #19

P: n/a
Diez B. Roggisch <deetsNOSPAM <at> web.de> writes:

> If you do this:
>
> "a" + 10
>
> you end with 10 - if the string doesn't contain something as number
> interpretable, the coercion results in null.
>
> Sure, that behaviour can be seen as overloaded, too. But overloaded
> functions usually make some sort of sense, where this technique masks
> errors by _always_ trying to interpret values as useful to every operation.

Ahh, I understand now. I would still call this coercion with an overloaded
operator. A horrible language decision, certainly, but not a mark of weak
typing -- note that Python can give you exactly the same behavior if you want it:

    >>> class phpint(int):
    ...     def __add__(self, other):
    ...         try:
    ...             other = int(other)
    ...         except ValueError:
    ...             other = 0
    ...         return super(phpint, self).__add__(other)
    ...     __radd__ = __add__
    ...
    >>> i = phpint(10)
    >>> 5 + i
    15
    >>> "5" + i
    15
    >>> "a" + i
    10

This doesn't mean that Python has suddenly become a weakly-typed language. It
just means that I've implemented some poor coercion choices in the language. =)

> I'm no native speaker, so I maybe confused the meaning of permanent. In
> german, permanent not only means that things are enduring, but also that
> things are done on regular bases: "He permanently changed his cloth."

Ahh. Gotcha. I would probably say "He (regularly changed/repeatedly
changed/used to change) his clothes." Thanks for the clarification.

> What I wanted to say is that php uses coercions or overloaded operators for
> nearly everything. Sometimes this is totally silent, sometimes nothing
> happens and a warning is issued - which might be configured to be treated
> as an actual error, I'm not sure about that.

Totally clear now, thanks. Basically you would say that the more implicit
coercions a language performs, the more weakly typed it is. This diverges from
the common use of the terms strong and weak typing in the PL literature, which
is why I was confused.

> So while there might be in fact type information present, it's rarely used -
> which I consider as being weak.

Well, the type information is probably used all the time (I wouldn't be surprised
if somewhere in the PHP internals something like my __add__ method above was
defined), but it's used implicitly, so the programmer might never see it.

Steve
Jul 18 '05 #20

P: n/a
Gabriel Zachmann writes:

> I understand strong/weak typing is more like a continuum --
> is there kind of a programming languages hierarchy sorting various
> languages according to "type-strongness" that is generally agreed upon?


Well, as you can probably see from the discussion, the definition of strong/weak
typing isn't even agreed upon, so I'd be wary of giving a hierarchy. ;) The two
main interpretations that I've seen in this thread:

(1) Weakly-typed languages allow you to take a block of memory that was
originally defined as one type and reinterpret the bits of this block as another
type[1]. (This is the definition usually used in Programming Languages
literature.)

(2) Weakly-typed languages have more implicit coercions than strongly-typed
languages. (This seems to be the favored definition on this newsgroup.)

The answer to your question depends on which one of these definitions you're
interested in. Definition (1) will have a much flatter hierarchy than
definition (2). Which definition are you interested in?

Steve

[1] see my example at
http://mail.python.org/pipermail/pyt...er/248983.html

Jul 18 '05 #21

P: n/a
> (1) Weakly-typed languages allow you to take a block of memory that was
> originally defined as one type and reinterpret the bits of this block as another
> type[1]. (This is the definition usually used in Programming Languages
> literature.)
>
> (2) Weakly-typed languages have more implicit coercions than strongly-typed
> languages. (This seems to be the favored definition on this newsgroup.)

Is either of them a subset of the other, generally speaking?

> The answer to your question depends on which one of these definitions you're
> interested in. Definition (1) will have a much flatter hierarchy than
> definition (2). Which definition are you interested in?


both, if you don't mind ;-)

cheers,
gab.
Jul 18 '05 #22

P: n/a
> (1) Weakly-typed languages allow you to take a block of memory that was
> originally defined as one type and reinterpret the bits of this block as another
> type[1]. (This is the definition usually used in Programming Languages
> literature.)


So, according to that, Perl is strongly typed?

Thanks a lot in advance,
Gabriel.

Jul 18 '05 #23

P: n/a
Gabriel Zachmann <zach <at> cs.uni-bonn.de> writes:
>> (1) Weakly-typed languages allow you to take a block of memory that was
>> originally defined as one type and reinterpret the bits of this block as
>> another type[1]. (This is the definition usually used in Programming
>> Languages literature.)
>>
>> (2) Weakly-typed languages have more implicit coercions than strongly-typed
>> languages. (This seems to be the favored definition on this newsgroup.)
>
> Is either of them a subset of the other, generally speaking?


Not really -- they're pretty much orthogonal.

For example, C is weakly-typed (PL theory definition) but has only a few
implicit coercions (e.g. int->float with the + operator). Of course, C only has
a very few basic types so it doesn't have as many chances to support implicit
coercions. For example, there is no string type in C (only arrays of
characters), so it wouldn't really make sense to talk about some sort of
string->int coercion.

Python is strongly-typed (PL theory definition) and also has only a few implicit
coercions (e.g. int->float with the + operator).
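
A quick sketch of the implicit numeric coercions Python does perform. The variable names are illustrative, and the set of cases shown is not meant to be exhaustive:

```python
total = 1 + 1.0          # the int operand is widened to float
flag_sum = True + 1      # bool is a subclass of int, so this is plain int addition
mixed = 1 + 2j           # int is widened to complex

kinds = [type(v).__name__ for v in (total, flag_sum, mixed)]
print(total, flag_sum, mixed, kinds)
```

All of these stay within the numeric types; nothing comparable happens between, say, strings and numbers.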

ML is strongly-typed (PL theory definition) and has *very* few (perhaps none?)
implicit coercions, e.g.:

# 1 + 1.0;;

Characters 4-7:
1 + 1.0;;
^^^
This expression has type float but is here used with type int

not even the int->float conversion common to C, Java, Python, etc. is supported.

I don't know enough about Pascal, Perl or PHP to tell whether they are
weakly-typed or strongly-typed (PL theory definition). Taking advantage of weak
typing isn't something I do much, so even in languages that I have passing
familiarity with, I've generally used only the strongly typed features. To
someone who knows about Pascal, Perl or PHP: Can you reinterpret an object's
memory block as a different object like you can in C?

I've never had to code in Pascal, but my understanding was that there weren't
too many implicit coercions... Please correct me on this one if anyone knows
better! Examples in this thread suggest that both Perl and PHP have large
numbers of implicit coercions.

> both, if you don't mind

I'm not willing to commit to anything for languages that I'm not really familiar
with, but here's what I'd say:

From weakly-typed to strongly-typed (PL theory definition):

    C < Java, Python, ML

Basically, there's no hierarchy, just weakly-typed and strongly-typed. You
might get a hierarchy if you had some languages that allowed only some (but not
all) objects to be reinterpreted.

From many implicit coercions to few implicit coercions:

    C, Java, Python < ML

Of course, this isn't really helpful because C, Java and Python all contain only
a few implicit coercions (e.g. int->float with the + operator). You'll have to
get a PHP/Perl/Pascal expert in to make any claims about them.

Steve

Jul 18 '05 #24

P: n/a
Steven Bethard wrote:
> Totally clear now, thanks. Basically you would say that the more implicit
> coercions a language performs, the more weakly typed it is. This diverges
> from the common use of the terms strong and weak typing in the PL
> literature, which is why I was confused.

One could say that the common definition of weakly typed languages cannot
apply to a dynamically typed language. Let's face it, once you've
implemented a dynamically typed language, it seems very hard to use one
piece of data as the wrong type. What we can do is extend the definition of
weakly typed language to something like: "A weakly typed language is a
language that often uses some piece of data incorrectly by applying
the wrong type to it". Such a definition would include any language that is too
liberal with type coercion, like PHP.

Jul 18 '05 #25

P: n/a
Gabriel Zachmann wrote:
wall of abstraction". A Smalltalk programmer would say that
Python is more weakly typed than Smalltalk for user-defined types.


which brings me to another related question.

I understand strong/weak typing is more like a continuum --
is there a kind of programming language hierarchy sorting various
languages according to "type-strongness" that is generally agreed upon?

I'd be interested in a hierarchy containing some of the following
languages: ANSI-C++, C, Perl, Python, Pascal, ML.
(because these happen to be some of the languages i know a bit ;-) )

Would the following be justifiable?

Perl < C < C++ < Pascal < Python < ML ?

Or does anyone have a pointer?


If by ML you mean OCaml, you should try again. OCaml isn't type safe
because of a few modules of the standard library. The marshaling module
comes to mind. Using it you can "typecast" a pointer to an integer as a
pointer to a string and segfault in the process, because the unmarshal
function trusts the caller to cast the result to the right type. In fact, it
seems impossible to write a correct typesafe marshaling module in OCaml
since there is no RTTI in the language for an anonymous piece of data.

Jul 18 '05 #26

P: n/a
Christophe Cavalaria <chris.cavalaria <at> free.fr> writes:

One could say that the common definition of weakly typed languages cannot
apply to a dynamically typed language. Let's face it, once you've
implemented a dynamically typed language, it seems very hard to use one
piece of data as the wrong type.
While I agree that it would be kind of foolish to implement a dynamically typed
language that wasn't strongly-typed (PL theory definition), there's no reason
you *couldn't* -- your language would just need to provide some sort of 'cast'
function that did the conversion at runtime.

For example, say the memory block for an object instance was allocated like:

[0] pointer to class object
[1] pointer to dictionary object (holding variables)

Then maybe you could do something like:
int1, int2 = cast(object(), list(int))
Given, I can't see any use for such behavior, but you *could* do it if you had a
weakly-typed dynamic language and you knew how it allocated objects. At some
level, you always have bits -- whether it makes any sense to reinterpret them
depends on exactly what the bits originally meant.
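
As an aside, something close to this hypothetical 'cast' can be sketched with ctypes, a foreign-function library shipped with CPython. It is a library escape hatch that operates on C-style values rather than on ordinary Python objects, so it says nothing about the language definition itself; the variable names below are just for illustration.

```python
import ctypes

# A C float (IEEE 754 single precision) holding 1.0.
value = ctypes.c_float(1.0)

# View the same four bytes as an unsigned 32-bit integer; nothing is copied.
as_int = ctypes.cast(ctypes.pointer(value), ctypes.POINTER(ctypes.c_uint32))

bits = as_int[0]
print(hex(bits))     # 0x3f800000

# Writing through the integer view changes what the float "is".
as_int[0] = 0x40000000
print(value.value)   # 2.0
```

Whether the reinterpreted bits mean anything depends entirely on what they originally were, which is the point made above.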
What we can do is extend the definition of weakly typed language to something
like : "A weakly typed language is a language that often uses incorrectly some
piece of data by applying to it the wrong type" Such a definition would include
any language that is too liberal with type coercion, like PHP.


Don't get me wrong -- I do understand your point. In every case I can think of,
there is no reason to want weak-typing (PL theory definition) in a
dynamically-typed language. On the other hand, I haven't really seen any good
cases for wanting weak-typing in a statically-typed language either.

Note that PHP doesn't fit your definition above anyway. When PHP allows:

"a" + 10 == 10

it's not incorrectly using "some piece of data by applying to it the wrong
type". It's doing exactly what it tells you it'll do. This is the *correct*
response given the PHP language definition.

In addition, you *can* create a statically-typed language that is strongly typed
(PL theory definition) but also very liberal with type coercion. What would you
call such a language? Since being liberal with type coercion and allowing bit
reinterpretation are orthogonal, why not keep the two separate terms?

My issue here is that I don't think we should confuse an already easily confused
term by giving it a second meaning. If there aren't any dynamically typed
languages that are also weakly-typed, that's ok -- it doesn't mean we should
change the meaning of "weakly-typed" for these languages.

Steve

Jul 18 '05 #27

P: n/a
Christophe Cavalaria <chris.cavalaria <at> free.fr> writes:

If by ML you mean OCaml, you should try again. OCaml isn't type safe
because of a few modules of the standard library. The marshaling module
comes to mind. Using it you can "typecast" a pointer to an integer as a
pointer to a string and segfault in the process, because the unmarshal
function trusts the caller to cast the result to the right type.


Thanks, I knew I'd read something like that somewhere. Totally surprised me too
'cause I figured that, of all people, ML folks would be the most afraid of a
module like this. =)

Steve

Jul 18 '05 #28

P: n/a
Steven Bethard wrote:
Christophe Cavalaria <chris.cavalaria <at> free.fr> writes:

One could say that the common definition of weakly typed languages
cannot apply to a dynamically typed language. Let's face it, once you've
implemented a dynamically typed language, it seems very hard to use one
piece of data as the wrong type.
While I agree that it would be kind of foolish to implement a dynamically
typed language that wasn't strongly-typed (PL theory definition), there's
no reason you *couldn't* -- your language would just need to provide some
sort of 'cast' function that did the conversion at runtime.

For example, say the memory block for an object instance was allocated
like:

[0] pointer to class object
[1] pointer to dictionary object (holding variables)

Then maybe you could do something like:
int1, int2 = cast(object(), list(int))
Given, I can't see any use for such behavior, but you *could* do it if you
had a
weakly-typed dynamic language and you knew how it allocated objects. At
some level, you always have bits -- whether it makes any sense to
reinterpret them depends on exactly what the bits originally meant.


In any good dynamically typed language, the object must know what it is, so
there is no way to do a reinterpret_cast like in C or C++. It is
meaningless. Doing it anyway is insane, as you have pointed out. Its only
purpose would be to add the flaws of weakly, statically typed languages to a
contrived example.
What we can do is extend the definition of weakly typed language to
something like : "A weakly typed language is a language that often uses
incorrectly some piece of data by applying to it the wrong type" Such a
definition would include any language that is too liberal with type
coercion, like PHP.


Don't get me wrong -- I do understand your point. In every case I can
think of, there is no reason to want weak-typing (PL theory definition) in
a
dynamically-typed language. On the other hand, I haven't really seen any
good cases for wanting weak-typing in a statically-typed language either.

Note that PHP doesn't fit your definition above anyway. When PHP allows:

"a" + 10 == 10

it's not incorrectly using "some piece of data by applying to it the wrong
type". It's doing exactly what it tells you it'll do. This is the *correct*
response given the PHP language definition.


Incorrectly is a view of the mind in that case. From my point of view it is
incorrect. And you could argue that taking a float * and casting it into a
int * gives us a predictable behaviour. It'll still be wrong to do it
unless it was exactly what you wanted to do. And in that case you can
create explicit language constructs to do the trick.
In addition, you *can* create a statically-typed language that is strongly
typed
(PL theory definition) but also very liberal with type coercion. What
would you
call such a language? Since being liberal with type coercion and allowing
bit reinterpretation are orthogonal, why not keep the two separate terms?

My issue here is that I don't think we should confuse an already easily
confused
term by giving it a second meaning.
I see your point, but I must add that I wasn't giving that term a second
meaning. Just like in mathematics when you take a theory and you create a
new theory that encompasses the old one, I was giving a new definition for
the term.
If there aren't any dynamically typed
languages that are also weakly-typed, that's ok -- it doesn't mean we
should change the meaning of "weakly-typed" for these languages.

Steve


Jul 18 '05 #29

P: n/a
Steven Bethard wrote:
Christophe Cavalaria <chris.cavalaria <at> free.fr> writes:

If by ML you mean OCaml, you should try again. OCaml isn't type safe
because of a few modules of the standard library. The marshaling module
comes to mind. Using it you can "typecast" a pointer to an integer as a
pointer to a string and segfault in the process, because the unmarshal
function trusts the caller to cast the result to the right type.


Thanks, I knew I'd read something like that somewhere. Totally surprised
me too 'cause I figured that, of all people, ML folks would be the most
afraid of a module like this. =)

Steve


Marshaling (Python calls it pickling ;) ) is something needed in the
standard library of any good language. Too bad for them that the OCaml
language makes it impossible to implement.
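
As a rough illustration of the Python side of that comparison, here is a minimal pickle round trip. It is only a sketch with made-up sample data; the point is that the byte string carries enough type information to rebuild the objects without the caller stating their types. (pickle should not be used on untrusted input.)

```python
import pickle

original = {"count": 3, "ratio": 0.5, "name": "spam"}

# Serialize to bytes, then rebuild; no cast is needed on the way back.
data = pickle.dumps(original)
restored = pickle.loads(data)

print(type(data).__name__)               # bytes
print(restored == original)              # True
print(type(restored["ratio"]).__name__)  # float
```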
Jul 18 '05 #30

P: n/a
Christophe Cavalaria <chris.cavalaria <at> free.fr> writes:
Note that PHP doesn't fit your definition above anyway. When PHP allows:

"a" + 10 == 10

it's not incorrectly using "some piece of data by applying to it the wrong
type". It's doing exactly what it tells you it'll do. This is the *correct*
response given the PHP language definition.


Incorrectly is a view of the mind in that case. From my point of view it is
incorrect. And you could argue that taking a float * and casting it into a
int * gives us a predictable behaviour. It'll still be wrong to do it
unless it was exactly what you wanted to do. And in that case you can
create explicit language constructs to do the trick.


The problem is, "exactly what you wanted to do" varies from programmer to
programmer. Some programmers may actually want "a" + 10 == 10. I can imagine
code that takes a string as input and adds its integer value to an int counter.
Invalid input should not increment the counter and may be silently ignored. I
would not write code this way, but some people would, and would want the code to
work the way PHP does.
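
For comparison, the counter scenario just described can be written in Python with the conversion spelled out. This is only a sketch; add_lenient is a hypothetical helper name, not something from any library.

```python
def add_lenient(counter, text):
    """Add the integer value of text to counter, ignoring invalid input."""
    try:
        return counter + int(text)
    except ValueError:
        # Not a number: leave the counter alone, which mirrors the
        # "a" + 10 == 10 behaviour described above.
        return counter

counter = 10
counter = add_lenient(counter, "5")   # valid input is added
counter = add_lenient(counter, "a")   # invalid input is silently ignored
print(counter)  # 15
```

The behaviour is the same, but the decision to ignore bad input is visible in the code instead of being part of the "+" operator.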

So yes, "incorrectly is a view of the mind", but since we can't know what every
programmer is thinking, and it's extremely unlikely that every programmer will
agree on what's correct or incorrect for every example, we have to take the
language definition as the measuring stick for correct or incorrect.

Given that, your definition of weakly-typed:
"A weakly typed language is a language that often uses incorrectly some piece
of data by applying to it the wrong type"
either would not call PHP a weakly-typed language, because the data is not used
incorrectly according to the language definition, or would not know what to call
PHP, because incorrectly cannot be defined in a way that applies correctly to
all programmers.

I prefer not extending "weakly-typed" in this way because it makes the term less
well-defined.

Steve

Jul 18 '05 #31

P: n/a
Diez B. Roggisch wrote:
I'm no native speaker, so I may have confused the meaning of permanent. In
German, permanent not only means that things are enduring, but also that
things are done on a regular basis: "He permanently changed his clothes."


We would say "regularly", "frequently", "habitually", or
something like that. In English, "permanently" means
"once and for all".

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg

Jul 18 '05 #32

P: n/a
Steven Bethard <st************@gmail.com> writes:
Don't get me wrong -- I do understand your point. In every case I can think of,
there is no reason to want weak-typing (PL theory definition) in a
dynamically-typed language. On the other hand, I haven't really seen any good
cases for wanting weak-typing in a statically-typed language either.


First of all, let's mention a truly weakly-typed language. BCPL, one
of C's predecessors. Variables don't have types, they just hold
words. Operations treat those variables as having a different type:
adding them as ints, or adding them as floats. Dereferencing them to
arrive at another word. Subroutines were just variables that held a
pointer to code, and people would actually write code that looked like
(pseudo-c):

func() {
init_func ;
func = foo ;
foo: func_code ;
}

To only run the initialization code the first time the function was
invoked, but not any other times. Of course, there was an external
program (written in BCPL) that did type inferencing, and would warn
you when you used something as other than what it really was.

And yes, BCPL saw real use. I've used a desktop DOS that was written
in BCPL. The rest of the system was written in C, which made life
*very* interesting.

Now, as to why one would *want* languages that let you treat things as
other than what they were.

It's much easier to write functions that convert 16, 32 and 64 bit
quantities from network order to host order (and vice versa) if you
can treat them as an array of bytes, even though you'll want to treat
them as longer hunks while dealing with them. When talking to
hardware, you can get some really *strange* things. You may have a
location that is an address most of the time, but part of the time is
a control register full of bits to toggle. When doing cryptography,
you very often want to treat the string of characters you're
encrypting as a string of words of some length, because that's the
size chunk that the algorithm encrypts. Marshalling has already been
mentioned on this thread. You may well want to marshal ints and floats
in binary form, meaning you'll need to treat that array of bytes as
being of that type.

You can also look through the python library for places where struct
is used - most of those will involve doing something where you want to
treat a string of bytes as something else.
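
A minimal sketch of that kind of struct use, converting between network byte order and Python values. The sample bytes are invented for the example.

```python
import struct

# Four bytes as they might arrive from the network (big-endian order).
wire = bytes([0x00, 0x00, 0x01, 0x02])

# "!I" means: network byte order, one unsigned 32-bit integer.
(host_value,) = struct.unpack("!I", wire)
print(host_value)  # 258

# Going the other way: a float marshalled to 4 bytes and back.
packed = struct.pack("!f", 1.5)
(restored,) = struct.unpack("!f", packed)
print(len(packed), restored)  # 4 1.5
```

Note that struct always copies: it builds new Python objects from the bytes rather than reinterpreting them in place.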

Finally, I don't see that there's that much difference between the two
different definitions of 'weakly typed'. Both can be described as
treating an object as if it were of some type other than what it
really is. In one case, you abuse the raw bits, and in the other you
coerce the object to a different type. Both amount to the same thing:

a = "10"
b = 5
c = a + b

In a strongly typed language, I get an error. In a weakly typed
language, I get something else. Either a pointer beyond the end of the
string a, or 15, depending on exactly how the a object is abused.
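
For reference, this is how the example above plays out in Python, which takes the "error" branch and makes the programmer choose a meaning explicitly. Just a sketch for illustration.

```python
a = "10"
b = 5

try:
    c = a + b
except TypeError as exc:
    print("error:", exc)

# The programmer has to say which meaning is wanted.
print(int(a) + b)   # 15
print(a + str(b))   # 105
```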

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Jul 18 '05 #33

P: n/a
Mike Meyer <mwm <at> mired.org> writes:

First of all, let's mention a truly weakly-typed language. BCPL, one
of C's predecessors. Variables don't have types, they just hold
words.
So BCPL had no compile time checking? If this is true, BCPL is a good example
of a dynamically- and weakly-typed (PL theory definition) language...
Finally, I don't see that there's that much difference between the two
different definitions of 'weakly typed'. Both can be described as
treating an object as if it were of some type other than what it
really is. In one case, you abuse the raw bits, and in the other you
coerce the object to a different type.


Would you then classify BCPL as weakly- or strongly-typed? It seems like you
might call it "strongly-typed" since every variable just holds words, so every
use of a variable is thus just the use of a word, thus you would never be
"treating an object as if it were of some type other than what it really is".

Steve

Jul 18 '05 #34

P: n/a
Mike Meyer <mwm <at> mired.org> writes:

Finally, I don't see that there's that much difference between the two
different definitions of 'weakly typed'. Both can be described as
treating an object as if it were of some type other than what it
really is. In one case, you abuse the raw bits, and in the other you
coerce the object to a different type.


One other thing: If you lump coercions with weak-typing, you allow the code
written in a language to adjust the degree of "weakness" of a language. Any
language (like Python) that allows you to override operators allows you to
create new coercions[1]. So if I don't like the strong/weak classification of
my language, I can always make it more "weakly-typed" by just adding more
nonstandard coercions.

IMHO, classification of a language as strongly- or weakly-typed should be
invariant of the code written in a language -- it should be something associated
with the language definition itself. Lumping weak-typing with coercion makes
this impossible.
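
To make the operator-overriding point concrete, here is a small sketch of user code adding a nonstandard coercion in Python. LooseInt is a hypothetical class invented for this example.

```python
class LooseInt(int):
    """Hypothetical int subclass that accepts numeric strings in "+"."""

    def __add__(self, other):
        if isinstance(other, str):
            other = int(other)      # the coercion is spelled out here
        return LooseInt(int(self) + other)

    # Also handle the case where the string is on the left.
    __radd__ = __add__

print(LooseInt(5) + "10")   # 15
print("10" + LooseInt(5))   # 15
```

Nothing in the language definition changed, yet code using LooseInt now sees str + int "work".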

Steve

[1]http://mail.python.org/pipermail/python-list/2004-November/249023.html

Jul 18 '05 #35

P: n/a
Steven Bethard <st************@gmail.com> writes:
Mike Meyer <mwm <at> mired.org> writes:
First of all, let's mention a truly weakly-typed language. BCPL, one
of C's predecessors. Variables don't have types, they just hold
words.

So BCPL had no compile time checking? If this is true, BCPL is a good example
of a dynamically- and weakly-typed (PL theory definition) language...


I wouldn't call BCPL dynamically typed. BCPL has no run-time type
checking either. That seems to be the defining feature of dynamically
typed languages.
Finally, I don't see that there's that much difference between the two
different definitions of 'weakly typed'. Both can be described as
treating an object as if it were of some type other than what it
really is. In one case, you abuse the raw bits, and in the other you
coerce the object to a different type.

Would you then classify BCPL as weakly- or strongly-typed? It seems like you
might call it "strongly-typed" since every variable just holds words, so every
use of a variable is thus just the use of a word, thus you would never be
"treating an object as if it were of some type other than what it really is".


A word is just a unit of storage, not a type. Words hold values with
types - integer, float, pointer, code, chars. Nothing in BCPL prevents
you from treating a word as any type at all. You can call a pointer to
string, or do an integer add of a pair of floats. So it's weakly
typed.
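
The "integer add of a pair of floats" can be imitated in Python by going through struct. This is only a sketch that assumes 32-bit words holding IEEE 754 single-precision floats; it is not BCPL, and the helper names are made up.

```python
import struct

def float_bits(x):
    """Return the 32-bit pattern of x stored as an IEEE 754 single."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_float(n):
    """Interpret a 32-bit pattern as an IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<I", n))[0]

# "Integer add" of two words that really hold the float 1.0.
word_sum = float_bits(1.0) + float_bits(1.0)
print(hex(word_sum))                       # 0x7f000000
print(bits_float(word_sum) == 2.0 ** 127)  # True, nowhere near 2.0
```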

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Jul 18 '05 #36

P: n/a
JCM
Gabriel Zachmann <za**@cs.uni-bonn.de> wrote:
You didn't mention C++. Try this ...

std::string s = "Wow";
s += 3.141592654; // Perfectly valid
s = 3.141592654; // Also valid
ah, good example. So, would it be valid to say:
the more coercion (or automatic conversion) rules a language has, the
weaker the typing?


If that's what your definition of weak typing is. The OP seemed to be
asking about re-interpreting the representation of a value of one type
as a different type.
Jul 18 '05 #37

P: n/a

Just a little question:

would sort of a summary of this thread be of any help?

Regards,
gab.

--
/-------------------------------------------------------------------------\
| There are works which wait, |
| and which one does not understand for a long time; [...] |
| for the question often arrives a terribly long time after the answer. |
| (Oscar Wilde) |
+-------------------------------------------------------------------------+
| za**@cs.uni-bonn.de __@/' www.gabrielzachmann.org |
\-------------------------------------------------------------------------/
Jul 18 '05 #38

P: n/a
JCM
JCM <jo******************@myway.com> wrote:
Gabriel Zachmann <za**@cs.uni-bonn.de> wrote:
You didn't mention C++. Try this ...

std::string s = "Wow";
s += 3.141592654; // Perfectly valid
s = 3.141592654; // Also valid
ah, good example. So, would it be valid to say:
the more coercion (or automatic conversion) rules a language has, the
weaker the typing?
If that's what your definition of weak typing is. The OP seemed to be
asking about re-interpreting the representation of a value of one type
as a different type.


Oops--I guess you are the OP.

When people talk about "weak typing" they generally mean either
implicit conversions (or operations on values of different types),
or reinterpreting representations of values as a different type.
The former, in my opinion, is not about weak typing.
Jul 18 '05 #39

P: n/a
Gabriel Zachmann writes:

would sort of a summary of this thread be of any help?


Here's a first stab at one:
In summary, there are basically three interpretations of "weak-typing" discussed
in this thread:

(1) A language is "weakly-typed" if it allows code to take a block of memory
that was originally defined as one type and reinterpret the bits of this block
as another type.

(2) A language is "weakly-typed" if it has a large number of implicit coercions.

(3) A language is "weakly-typed" if it often treats objects of one type as other
types.
Some points and problems addressed with each of these definitions:

Definition 1 is the definition most commonly used in Programming Languages
literature, and allows a language to be called "weakly-typed" based only on the
language definition. However, for all intents and purposes, it is only
applicable to statically typed languages; no one on the list could come up with
a dynamically typed language that allowed bit-reinterpretation.

Definition 2 seemed to be the definition most commonly used on the list, most
likely because it is actually applicable to a dynamically typed language like
Python. It has the problem that in a language that supports operator
overloading (like Python), programmers can make their language more
"weakly-typed" by simply providing additional coercions, thus whether or not a
language is called "weakly-typed" depends both on the language definition and
any code written in the language.

Definition 3 was an attempt to unify the first two definitions into a single
definition by describing both coercion and bit-reinterpretation as treating
"objects of one type as other types". This definition has the advantage of
better coverage, but has all the disadvantages of Definition 2. It is also
unclear as to how weak a "weakly-typed" language is if it both allows
bit-reinterpretation and has a large number of implicit coercions. (For
example, is a language that allows bit-reinterpretation and only a few implicit
coercions more or less "weakly-typed" than a language that doesn't allow
bit-reinterpretation, but has a large number of implicit coercions?)
I'll leave it to others to classify the various languages by these definitions.
;)

Steve
Jul 18 '05 #40

P: n/a
JCM <jo******************@myway.com> wrote:
When people talk about "weak typing" they generally mean either
implicit conversions (or operations on values of different types),
or reinterpreting representations of values as a different type.


I've seen people complain about "weak typing" mostly to mean an entirely
different issue: types being attached to objects and NOT to names.
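
A two-line Python illustration of what that means, for anyone who hasn't run into it:

```python
x = 1
print(type(x).__name__)   # int

x = "one"                 # same name, now bound to a str object
print(type(x).__name__)   # str
```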
Alex
Jul 18 '05 #41

P: n/a
Steven Bethard <st************@gmail.com> wrote:
On the other hand, I haven't really seen any good
cases for wanting weak-typing in a statically-typed language either.


How would an operating system's filesystems store arbitrary sequences of
bytes (which might be floats, int, whatever -- only the application
knows) into disk pages (blocks of, say , 4096 bytes each) otherwise? Or
are you saying that operating systems' kernels should all be implemented
in dynamically-typed languages, or that the structureless filesystem
concept that was the fortune of Unix (and is common today to other OSs,
too), is not "good"?

Even if you design a new OS based on a filesystem whose files are all
"strongly typed" (EEK, but that's another issue), how do you have
interoperate with other boxes, with the whole internet, without the
ability to type-pun ("weak-typing") when necessary...?
Alex
Jul 18 '05 #42

P: n/a
Steven Bethard wrote:
Gabriel Zachmann writes:

would sort of a summary of this thread be of any help?


Here's a first stab at one:

[...]
(2) A language is "weakly-typed" if it has a large number of implicit coercions.

[...]
Definition 2 seemed to be the definition most commonly used on the list, most
likely because it is actually applicable to a dynamically typed language like
Python. It has the problem that in a language that supports operator
overloading (like Python), programmers can make their language more
"weakly-typed" by simply providing additional coercions, thus whether or not a
language is called "weakly-typed" depends both on the language definition and
any code written in the language.


A case could be made that this "problem" isn't really valid if you look
at "implicit coercions" in the right way.

I'd argue that a programmer-overloaded operation providing coercion is
not _implicit_ in the same sense that language-default coercion is.
Admittedly, the coercion may not be immediately evident at the point of
use, but one can still find the explicitly-coercing code somewhere
inside the application (and/or included libraries). In contrast, the
coercions that happen in Perl, PHP, etc., are not explicitly stated
*anywhere* in the application. The difference between these two
scenarios is, at least in my mind, very distinct and (at least as far as
language philosophy) very profound -- it's one of *permitting*
semi-implicit coercions (if the programmer *really* wants them) versus
one of *mandating* implicit coercions whether the programmer wants them
or not.

In other words, definition 2 should read that a language can be
considered "weakly typed" if the *language definition* specifies a large
number of implicit coercions.

Jeff Shannon
Technician/Programmer
Credit International

Jul 18 '05 #43

P: n/a
Steven Bethard <st************@gmail.com> wrote:
Some programmers may actually want "a" + 10 == 10.


Some people may actually want to drink poisoned kool-aid and join the
great wise extraterrestrials on their comet in the skies. That doesn't
mean I will look with favour upon those who aid and abet such goals.
Alex
Jul 18 '05 #44

P: n/a
JCM
Alex Martelli <al*****@yahoo.com> wrote:
JCM <jo******************@myway.com> wrote:
When people talk about "weak typing" they generally mean either
implicit conversions (or operations on values of different types),
or reinterpreting representations of values as a different type.

I've seen people complain about "weak typing" mostly to mean an entirely
different issue: types being attached to objects and NOT to names.


Ah yep, there's that one too. But I hope most people call that static
typing. At least I think I do. c.l.py is bad for my mental dictionary.
Jul 18 '05 #45

P: n/a
JCM
Steven Bethard <st************@gmail.com> wrote:
(1) A language is "weakly-typed" if it allows code to take a block of memory .... Definition 1 is the definition most commonly used in Programming
Languages literature, and allows a language to be called
"weakly-typed" based only on the language definition. However, for
all intents and purposes, it is only applicable to statically typed
languages; no one on the list could come up with a dynamically typed
language that allowed bit-reinterpretation.


Assembly language. The types of values are implied by what
instructions you use.
Jul 18 '05 #46

P: n/a
Alex Martelli <aleaxit <at> yahoo.com> writes:

Steven Bethard <steven.bethard <at> gmail.com> wrote:
On the other hand, I haven't really seen any good
cases for wanting weak-typing in a statically-typed language either.
How would an operating system's filesystems store arbitrary sequences of
bytes (which might be floats, int, whatever -- only the application
knows) into disk pages (blocks of, say , 4096 bytes each) otherwise?


Valid point of course. But the OS doesn't really take advantage of weak-typing
here if it takes an arbitrary sequence of bytes and stores an arbitrary sequence
of bytes. I haven't written much OS code (just a prototype system back in
undergrad), but I never cast one type of struct to another -- to and from void*,
but never between types.

Of course, I'm sure there are a number of good reasons to do so -- my claim
was only that I hadn't seen them. I'd be grateful if you could point me to an
example. =)
Even if you design a new OS based on a filesystem whose files are all
"strongly typed"


You really do think I'm satan, don't you? ;)

Steve
Jul 18 '05 #47

P: n/a
JCM <joshway_without_spam <at> myway.com> writes:
Definition 1 is the definition most commonly used in Programming
Languages literature.... However, for
all intents and purposes, it is only applicable to statically typed
languages; no one on the list could come up with a dynamically typed
language that allowed bit-reinterpretation.


Assembly language. The types of values are implied by what
instructions you use.


I'm sure some people would argue that assembly language is untyped (not
statically or dynamically typed) and that the operations are defined on bits,
but this is definitely the best example I've seen. Thanks!

Steve

Jul 18 '05 #48

P: n/a
Steven Bethard <st************@gmail.com> wrote:
Alex Martelli <aleaxit <at> yahoo.com> writes:

Steven Bethard <steven.bethard <at> gmail.com> wrote:
On the other hand, I haven't really seen any good
cases for wanting weak-typing in a statically-typed language either.
How would an operating system's filesystems store arbitrary sequences of
bytes (which might be floats, int, whatever -- only the application
knows) into disk pages (blocks of, say , 4096 bytes each) otherwise?


Valid point of course. But the OS doesn't really take advantage of
weak-typing here if it takes an arbitrary sequence of bytes and stores an
arbitrary sequence of bytes. I haven't written much OS code (just a
prototype system back in undergrad), but I never cast one type of struct
to another -- to and from void*, but never between types.


Is the OS going to be able to read something from disk and *USE* it?
Of course, I'm sure there're are a number of good reasons to do so -- my
claim was only that I hadn't seen them. I'd be grateful if you could
point me to an example. =)


Suppose for example that you would like your OS to be able to load
executable code from disk into memory and execute it. Suppose you would
like it to be able to read some configuration parameters from disk and
set its own internal data structures accordingly. At one level, as it
goes to disk or comes back from there, you have arrays of bytes. But in
memory, you want functions that can be called appropriately (a device
driver residing in a module) or data structures which, differently from
an array of byte, DO have structure -- for example, a partition table
for a disk, with information to specific filesystem drivers as to what
partition is to be treated in what way.

I don't understand how you can have failed to see a zillion more
examples of operating systems actively _using_ data read from disk,
since it's such a widespread phenomenon nowadays.

Even if you design a new OS based on a filesystem whose files are all
"strongly typed"


You really do think I'm satan, don't you? ;)


I'm old enough to have fought my way through filesystems more strongly
typed than plain streams of bytes, sure -- Unix was already around but
not all-pervasive yet. I remember peripherals which wanted streams, but
streams of _SIX_-bit "bytes" -- so you had to have somewhere a pack-and-
unpack routine that could take (e.g.) a block of 48 8-bit bytes and
reinterpret it as a block of 64 6-bit bytes, or vice versa. I hope there
aren't any more of _those_ around -- but in exchange we have, for
example, pervasive issues of unicode vs byte streams and encodings.
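
A pack-and-unpack routine of that sort might look roughly like this in Python. It is a sketch only: the function names are invented, and it assumes the 6-bit units are laid out most significant bits first, which a real peripheral may or may not have done.

```python
def bytes_to_sixbit(data):
    """Split a byte string into 6-bit values, most significant bits first."""
    if (len(data) * 8) % 6:
        raise ValueError("length in bits must be a multiple of 6")
    bits = int.from_bytes(data, "big")
    count = len(data) * 8 // 6
    return [(bits >> (6 * (count - 1 - i))) & 0x3F for i in range(count)]

def sixbit_to_bytes(values):
    """Inverse of bytes_to_sixbit."""
    bits = 0
    for v in values:
        bits = (bits << 6) | (v & 0x3F)
    return bits.to_bytes(len(values) * 6 // 8, "big")

block = bytes(range(48))                # 48 8-bit bytes
units = bytes_to_sixbit(block)
print(len(units))                       # 64
print(sixbit_to_bytes(units) == block)  # True
```

Note that this version copies the data, which is exactly the cost a low-level routine would try to avoid.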

It's not just files, either. We have memory sliced up into pages, and a
page is, say, a well defined object of 4096 bytes. But we want to be
able to store all different kinds of stuff into those bytes - we HAVE
to, in fact, because that is all the memory we have... all pages...

If your point is that all you need is, not to "overlay" different
structures onto the same address, but "just" to overlay "look at this as
raw bytes" upon any structure and vice versa -- can't you see you're
doing exactly the same thing with just one conceptual extra step?
Instead of
struct foo* p = ...;
struct bar* q = (struct bar*) p;
you're thinking
struct foo* p = ...;
void *v = (void *) p;
struct bar* q = (struct bar*) v;
but it's just the same thing, and v can be optimized away.

Above the lowest levels, you can get away with (at least conceptually)
copying stuff in order to be able to reinterpret bits, as you can do in
Python with x = struct.unpack(f1, struct.pack(f2, y)) -- you can do
plenty of bit-level reinterpretation but not *in-place*, only via
copying. But you can't generally afford that luxury when you dig deep
enough -- copying bits around when you only need to reinterpret them is
paying a real cost in memory and CPU and bus bandwidth, after all. It
doesn't have to be OS-level: any virtual machine has similar issues.
So, look at the CPython interpreter sources, for example... what
performance price would it have to pay if it couldn't cast pointers but
rather had to copy bits around each time it now does a cast?
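
A concrete instance of the struct.unpack(f1, struct.pack(f2, y)) idiom mentioned above, as a minimal sketch. The format strings "<f" and "<I" are chosen here purely for illustration; the post leaves f1 and f2 unspecified.

```python
import struct

y = 1.0

# Serialize y as a little-endian IEEE 754 single-precision float (4 bytes).
raw = struct.pack("<f", y)

# Reinterpret the same 4 bytes as a little-endian unsigned 32-bit integer.
# This goes through a copy (the bytes object), not an in-place cast.
(x,) = struct.unpack("<I", raw)

# Going the other way recovers the original float.
(y_again,) = struct.unpack("<f", struct.pack("<I", x))
```

Here x holds the bit pattern of the float (0x3F800000 for 1.0), which is the copying counterpart of the pointer cast shown earlier in C.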
Alex
Jul 18 '05 #49

P: n/a
> A case could be made that this "problem" isn't really valid if you look
at "implicit coercions" in the right way.

I'd argue that a programmer-overloaded operation providing coercion is
not _implicit_ in the same sense that language-default coercion is.
Admittedly, the coercion may not be immediately evident at the point of
use, but one can still find the explicitly-coercing code somewhere
inside the application (and/or included libraries). In contrast, the
coercions that happen in Perl, PHP, etc., are not explicitly stated
*anywhere* in the application. The difference between these two
scenarios is, at least in my mind, very distinct and (at least as far as
language philosophy) very profound -- it's one of *permitting*
semi-implicit coercions (if the programmer *really* wants them) versus
one of *mandating* implicit coercions whether the programmer wants them
or not.
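
To make the distinction concrete, here is a small Python sketch of a programmer-provided coercion. The class LenientInt is hypothetical and exists only for this illustration; the point is that the conversion is written down in code you can find and read.

```python
class LenientInt:
    """Hypothetical integer wrapper whose author chose to accept strings."""

    def __init__(self, value):
        self.value = int(value)

    def __add__(self, other):
        # The coercion is explicit here, in application code.
        if isinstance(other, LenientInt):
            return LenientInt(self.value + other.value)
        if isinstance(other, (int, str)):
            return LenientInt(self.value + int(other))
        return NotImplemented

    __radd__ = __add__


total = LenientInt(2) + "1"   # coerced by LenientInt.__add__

# The built-in types offer no such coercion: "1" + 2 raises TypeError.
try:
    "1" + 2
    builtin_result = "coerced"
except TypeError:
    builtin_result = "TypeError"
```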


I'd second that - writing apps in php can lead to great surprises about what
actually happens - take this for example:

$foo = "abc";
$foo[0] = 65;

The result is

"6bc"

I have no idea what php actually _does_ here - perform a string conversion on
65, then take the most significant digit? There's all sorts of stuff like
that in php.
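
For comparison, a sketch of the same assignment in Python. It makes no claim about the rule PHP applies internally; it only shows that Python refuses the assignment (str objects are immutable) and makes the programmer spell out which conversion is meant.

```python
foo = "abc"

try:
    foo[0] = 65          # str does not support item assignment
    outcome = "assigned"
except TypeError:
    outcome = "TypeError"

# Two different explicit intentions, each stated in the code:
as_char = chr(65) + foo[1:]        # treat 65 as a character code
as_digit = str(65)[0] + foo[1:]    # treat 65 as text, keep its first character
```

The second spelling happens to reproduce the "6bc" result reported above, but in Python the reader can see that choice being made.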

So while overloading allows for deliberate (and thus hopefully well-defined
or at least more or less understood) coercions, built-in coercion doesn't.

And don't forget: If you don't like the way someone overloaded some
operator, you can alter that behaviour according to your own design
philosophies.
--
Regards,

Diez B. Roggisch
Jul 18 '05 #50
