Memory Limit for Visual Studio 2005???

It looks like System::Collections::Generic.List throws an OutOfMemory
exception whenever the memory allocated exceeds 256 MB. I have 1024 MB on my system,
so I am not even out of physical RAM, much less virtual memory.

Are other people experiencing this same problem?

Dec 22 '06

"Willy Denoyette [MVP]" <wi*************@telenet.be> wrote in message
news:Ou**************@TK2MSFTNGP06.phx.gbl...
Trimmed...
"Peter Olcott" <No****@SeeScreen.com> wrote in message
news:rh*********************@newsfe07.phx...
>I think that I just found out the reason for the difference. Visual C++ 6.0
std::vector has a memory growth factor of 1.5. Whereas Generic.List has been
reported to have a memory growth factor of 2.0. The next reallocation of
std::vector will fit into contiguous free RAM because it is only 50% larger.

The next allocation of Generic.List will not because it is twice as big. Both
of the prior allocations fit into actual RAM without the need of virtual
memory. Although the next allocation of std::vector will fit into free RAM,
it must write the current data to virtual memory to make room.

You are getting close. The growth factor in C++ is implementation-defined,
not fixed by the standard, and it does vary from implementation to
implementation.
The growth factor for generic containers in .NET is not a fixed 2.0 factor either;
it varies with the current capacity of the container: it starts at
factor 2 for small containers and, once a threshold is reached, drops to
1.5. A growth factor of 2 is advantageous, in terms of performance (less GC
pressure), for small containers that grow quickly, but is disadvantageous for
large containers in terms of memory consumption.
But there is more: containers larger than 85KB are allocated from the so-called
Large Object Heap (LOH), and this one isn't compacted by the GC after a
collection run, simply because it's too expensive to move such large objects
around in memory. That means you can end up with a fragmented LOH if
you don't pay attention to your allocation scheme.
Think about what happens in this scenario:
thread T1 allocates a List<int>() (say L1) and starts filling the List with
1000000 ints;
while T1 fills L1, thread T2 allocates a List<double>() (say L2) and
starts filling this list with 100000 doubles.
This will result in a highly fragmented LOH, especially when one of the
containers is long-living. In this case you may even get OOM exceptions when allocating objects much
smaller than the total free heap space. In such a scenario the only
solution is to start with pre-allocated containers, say 250000 for L1 and
25000 for L2, to reduce the number of fragments if you don't know the exact
"end size", but much better is to allocate the end size. Anyway, native or
managed, you must be prepared to receive OOM exceptions, but more importantly
you should try to prevent OOM when allocating large objects; pre-allocating
is such a technique, and as a bonus it helps with performance.
The native heap is never compacted by the C++ allocator, which means that
native applications are more sensitive to fragmentation than managed
applications; that's one of the many reasons the GC was invented.

Willy.


Dec 23 '06 #51
"Peter Olcott" <No****@SeeScreen.com> wrote in message
news:tx*******************@newsfe23.lga...
Trimmed...
But aren't the terms Garbage Collection and eliminating fragmented memory
one and the same thing?
Yes, but "compaction" only applies to the Gen0, 1 and 2 heaps, that is, the heaps that
hold objects smaller than 85KB; the LOH is collected but never compacted, it's simply too
expensive to compact the heap for such large objects. You really should pre-allocate large
objects as much as you can; if you don't, you will probably waste a lot more memory than if
you did.
Again, take a look at the allocation scheme for a non-pre-allocated List (well, to be exact,
its underlying array):

say the LOH starts at 0x02000000 (note that small objects won't be allocated from the
LOH)
and say that the first array (128 KB) in the LOH starts at:
0x02000000, array size = 12 + 131072 = 131084 (object header + array values)
After the first four expansions, assuming no other thread allocates in between, you will
have a LOH that looks like:

0x02000000 - 131084 bytes (1)
0x0202000c - 262156 bytes (2)
0x0204000c - 524300 bytes (3)
0x0208000c - 1048588 bytes (4)
0x0210000c - 2097164 bytes (5)
....
See: when expanding to (5), blocks (1), (2) and (3) are no longer needed, but their sum is not large
enough to accommodate the 2097164 bytes needed by (5).
Now, this goes on until you have a "free trailer" of, let's say, ~256MB followed by an array
of 512MB which needs to be expanded to 1GB. That means you'll get an OOM because the
allocator can't find a contiguous free space of 1GB on top of the already occupied
(but still needed) 512MB, and the 256MB area is too small anyway.
Note that the "free trailer" can be used by other threads to allocate objects from; the
"old" array objects in this area are freed by the GC...

Willy.

Dec 23 '06 #52

"Willy Denoyette [MVP]" <wi*************@telenet.be> wrote in message
news:%2****************@TK2MSFTNGP02.phx.gbl...
Trimmed...
>But aren't the terms Garbage Collection and eliminating fragmented memory
one and the same thing?
Yes, but "compaction" only applies to the Gen0, 1 and 2 heaps, that is, for
the heaps that hold objects smaller than 85KB; the LOH is collected but
never compacted, it's simply too expensive to compact the heap for such large
objects.
This should be a user-selectable option instead of mandatory. Making some tasks
infeasible because the memory manager decides it would not be fast enough is not
the ideal solution.
You really should pre-allocate large objects as much as you can, if you don't,
you probably will waste a lot more memory than if you did.
Yet I simply can't do that. The amount of memory that I need is completely
unpredictable. It can be anywhere from 25K to far more than the machine has.
Trimmed...

Dec 23 '06 #53
"Peter Olcott" <No****@SeeScreen.com> wrote in message
news:ZS***************@newsfe10.phx...
Trimmed...
This should be a user-selectable option instead of mandatory. Making some tasks infeasible
because the memory manager decides it would not be fast enough is not the ideal solution.
>You really should pre-allocate large objects as much as you can, if you don't, you
probably will waste a lot more memory than if you did.

Yet I simply can't do that. The amount of memory that I need is completely unpredictable.
It can be anywhere from 25K to far more than the machine has.
This is a bad excuse: pre-allocate more than you reasonably need and trim the excess; what's
the point, it's all virtual memory anyway. But don't forget - you can't allocate a single object
that is larger than 2GB anyway.
Note, this is my last reply; it makes no sense to continue this discussion. You simply don't
understand 1) how the OS memory manager works, 2) how the GC works and what it
takes to move objects around, and what the impact is on all running threads in the system, YES!
IN THE SYSTEM, when some 500MB of data has to move from one location to another. One
suggestion though: stay away from .NET and return to native code, you are simply not ready
for it.

Willy.

Dec 23 '06 #54

"Willy Denoyette [MVP]" <wi*************@telenet.be> wrote in message
news:%2****************@TK2MSFTNGP02.phx.gbl...
Trimmed...
This is a bad excuse: pre-allocate more than you reasonably need and trim the
excess; what's the point, it's all virtual memory anyway. But don't forget - you can't
allocate a single object that is larger than 2GB anyway.
I have no idea in advance how much memory I will need. It can be anywhere
at all between 25K and several gigabytes. Should I follow your advice and always
allocate at least one GB, or is it that your advice may not apply to my
situation?

I could run through the whole process twice and then know how much memory
I need in advance, yet this would double the time that my process takes. The
amount of memory that I need depends upon a number of things that cannot
possibly be predicted or even reasonably estimated in advance.
Trimmed...

Dec 23 '06 #55
No :) but then I never had a reason to.

If you don't know what you are going to need but expect it to be very
large more often than not, I would tend to write the data to a temp
file first. If it turns out to be small you can load the file into memory,
otherwise process it from disk.

Regards,
John

"Peter Olcott" <No****@SeeScreen.com> wrote in message
news:ZZ*******************@newsfe13.phx...
Try and add one gig of Bytes to a List<Byte> and see if it doesn't
abnormally terminate.

"John J. Hughes II" <in*****@nowhere.com> wrote in message
news:Oy**************@TK2MSFTNGP04.phx.gbl...
>>I checked in task manager and my peak memory usage for VS2005 is 288,060K,
but then I have 2Gigs on this system.

Regards,
John

"Peter Olcott" <No****@SeeScreen.com> wrote in message
news:RK*******************@newsfe12.phx...
Trimmed...
Dec 23 '06 #56

"John J. Hughes II" <in*****@nowhere.com> wrote in message
news:%2****************@TK2MSFTNGP02.phx.gbl...
No :) but then I never had a reason to.

If you don't know what you are going to need but expect it to be very large
more often than not, I would tend to write the data to a temp file
first. If it turns out to be small you can load the file into memory,
otherwise process it from disk.
I expect to eventually have as many as millions of users, so I want to produce
the best possible solution. Since disk access is something like 1000-fold slower
than memory access, I don't want to take this kind of performance hit.
Trimmed...
Dec 23 '06 #57

"Willy Denoyette [MVP]" <wi*************@telenet.be> wrote in message
news:%2****************@TK2MSFTNGP02.phx.gbl...
Trimmed...
This is a bad excuse: pre-allocate more than you reasonably need and trim the
excess; what's the point, it's all virtual memory anyway. But don't forget - you can't
allocate a single object that is larger than 2GB anyway.
It turns out that your advice will work. I don't actually have to run through
the whole process twice to determine my memory requirements; I can run through
one part of the process once, store the results, then use these results in
the next step.

This design is cleaner because I can know in advance whether or not I
am going to have any memory problems, and simply inform the user that there is
not enough memory for the requested task, rather than having to deal with
OUT_OF_MEMORY exception processing.
Trimmed...

Dec 23 '06 #58
"Peter Olcott" <No****@SeeScreen.com> wrote:
>I expect to eventually have as many as millions of users, so I want to produce
the best possible solution. Since disk access is something like 1000-fold slower
than memory access, I don't want to take this kind of performance hit.
Then you're in for a surprise! When you allocate something, it always
gets taken out of the virtual address space range, and the operating
system decides if and when it wants to page bits of your data to disk.

--
Lucian
Dec 23 '06 #59

"Lucian Wischik" <lu***@wischik.com> wrote in message
news:ja********************************@4ax.com...
"Peter Olcott" <No****@SeeScreen.com> wrote:
>>I expect to eventually have as many as millions of users, so I want to produce
the best possible solution. Since disk access is something like 1000-fold
slower
than memory access, I don't want to take this kind of performance hit.

Then you're in for a surprise! When you allocate something, it always
gets taken out of the virtual address space range, and the operating
system decides if and when it wants to page bits of your data to disk.

--
Lucian
Yet it does not make this decision in an arbitrary and capricious manner; it
only actually pages to disk when it needs to. Since I will most often need
less than 200MB, and the minimum requirements for my application will be stated
as 500MB, there is no sense in my always manually paging to disk when this
paging is not necessary.

Since there was a way that I could determine the precise amount of my huge
memory allocation in advance, this is the best way to go. It is worth all the
extra effort specifically because it allows me to prevent OUT_OF_MEMORY
exceptions instead of having to catch them and deal with them as they occur.
Because I was able to accomplish this without duplicating the steps, it might
even improve performance by eliminating memory reallocation. In any case the
degradation to performance is minimal, if any: about seven seconds in the worst
case.
Dec 23 '06 #60
"Peter Olcott" <No****@SeeScreen.com> wrote in message news:
1P*******************@newsfe06.phx...

| Yet it does not make this decision in an arbitrary and capricious manner;
| it only actually pages to disk when it needs to. Since I will most often
| need less than 200MB, and the minimum requirements for my application will
| be stated as 500MB, there is no sense in my always manually paging to disk
| when this paging is not necessary.

Don't forget that, if your program is not the only one running on the
machine, then paging may happen sooner than you think, because other
programs may have already exhausted the physical memory before you get a
chance to allocate your memory block.

This is Windows; you are not in charge. And when a user only has 512MB or
even 1GB, then your application isn't going to look too smart; even 2GB can
soon disappear before you get a look in :-)

Joanna

--
Joanna Carter [TeamB]
Consultant Software Engineer
Dec 23 '06 #61
Peter Olcott <No****@SeeScreen.com> wrote:
You keep missing the fact that VS2005 isn't a compiler. It's using the
standard CSC compiler that comes with the .Net framework. Even this isn't
really a compiler, as it only spits out IL.

I have written a couple of compilers, and the jitter is not a compiler.
It converts code from one language (IL) into native code. In what way
is it *not* a compiler?

(I would say that csc is still a compiler though - just because the
target isn't native code doesn't mean it's not compiling, IMO.)
The tricky part is translating the nested do-while and if-then-else
statements comprised of compounds relational expressions into jump
code. The jitter does not need to do this, this part is already done.
All the jitter has to do is to translate pseudo assembly language
into machine code.
That doesn't mean it's not a compiler.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 24 '06 #62

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
Peter Olcott <No****@SeeScreen.com> wrote:
You keep missing the fact that VS2005 isn't a compiler. It's using the
standard CSC compiler that comes with the .Net framework. Even this isn't
really a compiler, as it only spits out IL.

I have written a couple of compilers, and the jitter is not a compiler.

It converts code from one language (IL) into native code. In what way
is it *not* a compiler?
I already carefully explained in what way it is not a compiler, and you cut that
explanation out. It is not a compiler in that it does not translate nested
compound conditional statements forming if-then-else and do-while control flow
constructs into their equivalent jump code. Although the jitter is commonly
referred to as a compiler, its true role is much closer to that of an assembly
language translator.
>
(I would say that csc is still a compiler though - just because the
target isn't native code doesn't mean it's not compiling, IMO.)
>The tricky part is translating the nested do-while and if-then-else
statements comprised of compound relational expressions into jump
code. The jitter does not need to do this, this part is already done.
All the jitter has to do is to translate pseudo assembly language
into machine code.

That doesn't mean it's not a compiler.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

Dec 24 '06 #63
Peter Olcott wrote:
"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.com...
>Peter Olcott <No****@SeeScreen.comwrote:
>>>You keep missing the fact that VS2005 isn't a compiler. It's using the
standard CSC compiler that comes with the .Net framework. Even this isn't
really a compiler, as it only spits out IL.
I have written a couple of compilers, and the jitter is not a compiler.
It converts code from one language (IL) into native code. In what way
is it *not* a compiler?

I already carefully explained in what way it is not a compiler, and you cut that
explanation out. It is not a compiler in that it does not translate nested
compound conditional statements forming if-then-else and do-while control flow
constructs into their equivalent jump code. Although the jitter is commonly
referred to as a compiler, its true role is much closer to that of an assembly
language translator.
The difference between a compiler and an assembler is that a compiler
translates from a source language to a target language that are
conceptually different, while an assembler translates from a source
language to a target language that are conceptually identical.

Since most of the optimization in a C# program is done by the
JIT compiler and the IL code can be significantly rearranged
then it is obviously not an assembler. It is a compiler.

It is possible to write a language without compound
conditional statements (Fortran 66 as an example does not have
much in this area). And those languages can obviously be compiled.
So your definition of a compiler seems flawed.

And in general it is rather pointless to create one's own
definitions. I could decide to call what everybody else calls
south for north and north for south. But if I ask someone
for directions, then I would get lost quickly. Microsoft
and C# programmers (and for that matter Sun and Java programmers)
call the JIT process compilation. If you want to communicate
with C# programmers, use the standard terminology or
confusion will arise.

Arne
Dec 24 '06 #64
Peter Olcott <No****@SeeScreen.comwrote:
I have written a couple of compilers, and the jitter is not a compiler.
It converts code from one language (IL) into native code. In what way
is it *not* a compiler?

I already carefully explained in what way it is not a compiler, and you cut that
explanation out.
Not really - you said things that many compilers do which the JIT
doesn't have to do. That's not the same way as explaining why it isn't
a compiler.

In a similar way, I could say that a C compiler doesn't have to deal
with closures (which other compilers have to) - that doesn't make the C
compiler any less of a compiler.
It is not a compiler in that it does not translate nested
compound conditional statements forming if-the-else and do-while control flow
constructs into their equivalent jump code.
Could you provide the source for a definition of "compiler" which
requires that?

The wikipedia definition certainly doesn't require it. Here's the
start:

<quote>
A compiler is a computer program (or set of programs) that translates
text written in a computer language (the source language) into another
computer language (the target language). The original sequence is
usually called the source code and the output called object code.
Commonly the output has a form suitable for processing by other
programs (e.g., a linker), but it may be a human readable text file.
</quote>

Which part of that does a JIT compiler not do?
Although the jitter is commonly referred to as a compiler its true
role is much closer to that of an assembly language translator.
In some respects - but not in others.

However, even an assembler counts as a compiler in some senses. The
Wikipedia definition even mentions an assembler as an example:

<quote>
In this manner, assembly languages and the primitive compiler, the
assembler, emerged.
</quote>

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 24 '06 #65

"Arne Vajhøj" <ar**@vajhoej.dkwrote in message
news:45***********************@news.sunsite.dk...
Peter Olcott wrote:
>"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.com...
>>Peter Olcott <No****@SeeScreen.comwrote:
You keep missing the fact that VS2005 isn't a compiler. It's using the
standard CSC compiler that comes with the .Net framework. Even this isn't
really a compiler, as it only spits out IL.
I have written a couple of compilers, and the jitter is not a compiler.
It converts code from one language (IL) into native code. In what way
is it *not* a compiler?

I already carefully explained in what way it is not a compiler, and you cut
that explanation out. It is not a compiler in that it does not translate
nested compound conditional statements forming if-then-else and do-while
control flow constructs into their equivalent jump code. Although the jitter
is commonly referred to as a compiler, its true role is much closer to that of
an assembly language translator.

The difference between a compiler and an assembler is that a compiler
translates from a source language to a target language that are
conceptually different, while an assembler translates from a source
language to a target language that are conceptually identical.
http://en.wikipedia.org/wiki/Compiler
The most common reason for wanting to translate source code is to create an
executable program. The name "compiler" is primarily used for programs that
translate source code from a high level language to a lower level language
(e.g., assembly language or machine language).

If you understand the details of how compilers work and what compilers do you
will realize that the most important thing that compilers do is translating high
level abstractions into lower level implementations.

The most important and difficult one of these tasks is translating from high
level control flow statements into low level control flow statements. Because
the jitter does not do this, calling it a compiler is a stretch. It would be
much more accurate to call it an assembler.

Since it is based on an improvement to things such as the Java just in time
compiler, calling the .NET feature a just in time compiler is merely an artifact
of the development history rather than an accurate denotation. It would be like
thinking that JavaScript is a script based on Java.
>
Since most of the optimization in a C# program is done by the
JIT compiler and the IL code can be significantly rearranged
then it is obviously not an assembler. It is a compiler.

It is possible to write a language without compound
conditional statements (Fortran 66 as an example does not have
much in this area). And those languages can obviously be compiled.
So your definition of a compiler seems flawed.

And in general it is rather pointless to create one's own
definitions. I could decide to call what everybody else calls
south for north and north for south. But if I ask someone
for directions, then I would get lost quickly. Microsoft
and C# programmers (and for that matter Sun and Java programmers)
call the JIT process compilation. If you want to communicate
with C# programmers, use the standard terminology or
confusion will arise.

Arne

Dec 24 '06 #66

"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.com...
Peter Olcott <No****@SeeScreen.comwrote:
>I have written a couple of compilers, and the jitter is not a compiler.

It converts code from one language (IL) into native code. In what way
is it *not* a compiler?

I already carefully explained in what way it is not a compiler, and you cut
that
explanation out.

Not really - you said things that many compilers do which the JIT
doesn't have to do. That's not the same way as explaining why it isn't
a compiler.

In a similar way, I could say that a C compiler doesn't have to deal
with closures (which other compilers have to) - that doesn't make the C
compiler any less of a compiler.
>It is not a compiler in that it does not translate nested
compound conditional statements forming if-the-else and do-while control flow
constructs into their equivalent jump code.

Could you provide the source for a definition of "compiler" which
requires that?

The wikipedia definition certainly doesn't require it. Here's the
start:

<quote>
A compiler is a computer program (or set of programs) that translates
text written in a computer language (the source language) into another
computer language (the target language). The original sequence is
usually called the source code and the output called object code.
Commonly the output has a form suitable for processing by other
programs (e.g., a linker), but it may be a human readable text file.
</quote>

Which part of that does a JIT compiler not do?
Here is a quote from your same source:
http://en.wikipedia.org/wiki/Compiler
The most common reason for wanting to translate source code is to create an
executable program. The name "compiler" is primarily used for programs that
translate source code from a high level language to a lower level language
(e.g., assembly language or machine language).

The JIT "compiler" does not translate high level language control flow
statements to lower level language control flow statements. The most important
part of a compiler's job is translating high level abstractions into lower level
implementations. The most important high level abstraction is high level control
flow statements, and the JIT "compiler" completely skips that part.
>
>Although the jitter is commonly referred to as a compiler its true
role is much closer to that of an assembly language translator.

In some respects - but not in others.

However, even an assembler counts as a compiler in some senses. The
Wikipedia definition even mentions an assembler as an example:

<quote>
In this manner, assembly languages and the primitive compiler, the
assembler, emerged.
</quote>

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

Dec 24 '06 #67
Peter Olcott <No****@SeeScreen.comwrote:
The difference between a compiler and an assembler is that a compiler
translates from a source language to a target language that are
conceptual different while an assembler translates from a source
language to a target language that are conceptually identical.

http://en.wikipedia.org/wiki/Compiler
The most common reason for wanting to translate source code is to create an
executable program. The name "compiler" is primarily used for programs that
translate source code from a high level language to a lower level language
(e.g., assembly language or machine language).
Yup, and IL is a higher level language than native code. Which part of
that definition *isn't* fulfilled by a JIT compiler?
If you understand the details of how compilers work and what compilers do you
will realize that the most important thing that compilers do is translating high
level abstractions into lower level implementations.
You mean high level abstractions like "call this virtual method" into
low level implementations like "look up the actual method in the vtable
and call appropriately"?

It's not as high level as (say) C#, but that doesn't mean it's at the
same level as native code.

Beyond this, you need to read carefully: the quote says that the term
is *primarily* used for programs translating source code from a high
level language. "Primarily" doesn't mean "exclusively".
The most important and difficult one of these tasks is translating from high
level control flow statements into low level control flow statements. Because
the jitter does not do this, calling it a compiler is a stretch. It would be
much more accurate to call it an assembler.
And as Wikipedia states, an assembler is a primitive compiler. I would
say a JIT is significantly more than an assembler though.

Note that the JIT *does* need to deal with some control flow statements
at a higher level than native code understands - "switch" is a good example
of this.
Since it is based on an improvement to things such as the Java just in time
compiler, calling the .NET feature a just in time compiler is merely an artifact
of the development history rather than an accurate denotation.
That "since" doesn't make logical sense. Just because Java had a JIT
compiler before .NET existed doesn't make the word "compiler"
inappropriate.
It would be like thinking that JavaScript is a script based on Java.
Not at all. That just shows that things *can* be badly named - it is
not evidence that the JIT compiler *is* badly named.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 24 '06 #68
Peter Olcott <No****@SeeScreen.comwrote:
The wikipedia definition certainly doesn't require it. Here's the
start:

<quote>
A compiler is a computer program (or set of programs) that translates
text written in a computer language (the source language) into another
computer language (the target language). The original sequence is
usually called the source code and the output called object code.
Commonly the output has a form suitable for processing by other
programs (e.g., a linker), but it may be a human readable text file.
</quote>

Which part of that does a JIT compiler not do?

Here is a quote from your same source:
http://en.wikipedia.org/wiki/Compiler
The most common reason for wanting to translate source code is to create an
executable program. The name "compiler" is primarily used for programs that
translate source code from a high level language to a lower level language
(e.g., assembly language or machine language).
Dealt with in another post.
The JIT "compiler" does not translate high level language control flow
statements to lower level language control flow statements.
Care to enlighten me as to where in x86 the "switch" statement is
defined? How about virtual method calls?
The most important part of a compiler's job is translating high level
abstractions into lower level implementations.
As I said in another post, the JIT does plenty of this.
The most important high level abstraction is high level control
flow statements, and the JIT "compiler" completely skips that part.
So your definition of "compiler" completely excludes anything which
converts one language into another if the source language doesn't have
any higher level control flow statements. It's an interesting
definition of a compiler, but not one which you've shown that anyone
else uses.

On the other hand, I *have* produced a definition of "compiler" which
the JIT compiler certainly meets, and which is clearly used by others.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 24 '06 #69
Peter Olcott wrote:
"Arne Vajhøj" <ar**@vajhoej.dkwrote in message
>The difference between a compiler and an assembler is that a compiler
translates from a source language to a target language that are
conceptually different, while an assembler translates from a source
language to a target language that are conceptually identical.

http://en.wikipedia.org/wiki/Compiler
The most common reason for wanting to translate source code is to create an
executable program. The name "compiler" is primarily used for programs that
translate source code from a high level language to a lower level language
(e.g., assembly language or machine language).

If you understand the details of how compilers work and what compilers do you
will realize that the most important thing that compilers do is translating high
level abstractions into lower level implementations.
Yes.

IL is at a higher level than the x86 instruction set.
The most important and difficult one of these tasks is translating from high
level control flow statements into low level control flow statements. Because
the jitter does not do this, calling it a compiler is a stretch. It would be
much more accurate to call it an assembler.
I thought the process of compiling high level compound statements
to assembler/machine code was something any CS student learned, while
writing a good optimizing compiler was an art.

Do you consider what we call a Fortran 66 compiler for a
Fortran 66 assembler because it does not have block if
statements ?

It does not make much sense to me to define compilers by
requiring the source language to have certain high level
control flow statements.
Since it is based on an improvement to things such as the Java just in time
compiler, calling the .NET feature a just in time compiler is merely an artifact
of the development history rather than an accurate denotation.
Language is an artifact of history.
It would be like
thinking that JavaScript is a script based on Java.
This is actually a very good example. But not quite as you think it is.

If you use the term "JavaScript" then everybody knows what you
are talking about.

If you use the term "LiveScript" or "ECMAScript", then far
fewer people understand you.

Netscape gave it the name JavaScript many years ago. It has
been generally accepted.

Arne
Dec 24 '06 #70

"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP************************@msnews.microsoft.com...
Peter Olcott <No****@SeeScreen.comwrote:
The difference between a compiler and an assembler is that a compiler
translates from a source language to a target language that are
conceptually different, while an assembler translates from a source
language to a target language that are conceptually identical.

http://en.wikipedia.org/wiki/Compiler
The most common reason for wanting to translate source code is to create an
executable program. The name "compiler" is primarily used for programs that
translate source code from a high level language to a lower level language
(e.g., assembly language or machine language).

Yup, and IL is a higher level language than native code. Which part of
that definition *isn't* fulfilled by a JIT compiler?
>If you understand the details of how compilers work and what compilers do you
will realize that the most important thing that compilers do is translating
high
level abstractions into lower level implementations.

You mean high level abstractions like "call this virtual method" into
low level implementations like "look up the actual method in the vtable
and call appropriately"?

It's not as high level as (say) C#, but that doesn't mean it's at the
same level as native code.

Beyond this, you need to read carefully: the quote says that the term
is *primarily* used for programs translating source code from a high
level language. "Primarily" doesn't mean "exclusively".
>The most important and difficult one of these tasks is translating from high
level control flow statements into low level control flow statements. Because
the jitter does not do this, calling it a compiler is a stretch. It would be
much more accurate to call it an assembler.

And as Wikipedia states, an assembler is a primitive compiler. I would
say a JIT is significantly more than an assembler though.
And significantly less than a compiler, closer to an assembler than a compiler.
>
Note that the JIT *does* need to deal with some control flow statements
at a higher level than native code understands - "switch" is a good example
of this.
This can be translated into a jump table quite easily. It does not require
anything close to the level of difficulty of translating nested compound
conditional control flow statements into their equivalent jump code.
>
>Since it is based on an improvement to things such as the Java just in time
compiler, calling the .NET feature a just in time compiler is merely an
artifact
of the development history rather than an accurate denotation.

That "since" doesn't make logical sense. Just because Java had a JIT
compiler before .NET existed doesn't make the word "compiler"
inappropriate.
>It would be like thinking that JavaScript is a script based on Java.

Not at all. That just shows that things *can* be badly named - it is
not evidence that the JIT compiler *is* badly named.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

Dec 24 '06 #71

"Jon Skeet [C# MVP]" <sk***@pobox.comwrote in message
news:MP***********************@msnews.microsoft.com...
Peter Olcott <No****@SeeScreen.comwrote:
The wikipedia definition certainly doesn't require it. Here's the
start:

<quote>
A compiler is a computer program (or set of programs) that translates
text written in a computer language (the source language) into another
computer language (the target language). The original sequence is
usually called the source code and the output called object code.
Commonly the output has a form suitable for processing by other
programs (e.g., a linker), but it may be a human readable text file.
</quote>

Which part of that does a JIT compiler not do?

Here is a quote for your same source:
http://en.wikipedia.org/wiki/Compiler
The most common reason for wanting to translate source code is to create an
executable program. The name "compiler" is primarily used for programs that
translate source code from a high level language to a lower level language
(e.g., assembly language or machine language).

Dealt with in another post.
>The JIT "compiler" does not translate high level language control flow
statements to lower level language control flow statements.

Care to enlighten me as to where in x86 the "switch" statement is
defined? How about virtual method calls?
A similar construct could be created in the typical assembly language using very
powerful macro substitution.
>
>The most important part of a compiler' job is translating high level
abstractions into lower level implementations.

As I said in another post, the JIT does plenty of this.
Not enough to be accurately construed as a compiler. It would probably be a
little less than halfway between the typical high level language compiler and
the typical assembler. This would make it a very high level assembler.

Even languages that lack high level control flow constructs typically translate
high level mathematical constructs into the sequence of low level steps required
by a pseudo (or actual) machine language. From what I recall, this is another
compiler feature that the jitter is missing.
>
>The most important high level abstraction is high level control
flow statements, and the JIT "compiler" completely skips that part.

So your definition of "compiler" completely excludes anything which
converts one language into another if the source language doesn't have
any higher level control flow statements. It's an interesting
definition of a compiler, but not one which you've shown that anyone
else uses.

On the other hand, I *have* produced a definition of "compiler" which
the JIT compiler certainly meets, and which is clearly used by others.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

Dec 24 '06 #72

"Arne Vajhøj" <ar**@vajhoej.dkwrote in message
news:45***********************@news.sunsite.dk...
Peter Olcott wrote:
>"Arne Vajhøj" <ar**@vajhoej.dkwrote in message
>>The difference between a compiler and an assembler is that a compiler
translates from a source language to a target language that are
conceptually different, while an assembler translates from a source
language to a target language that are conceptually identical.

http://en.wikipedia.org/wiki/Compiler
The most common reason for wanting to translate source code is to create an
executable program. The name "compiler" is primarily used for programs that
translate source code from a high level language to a lower level language
(e.g., assembly language or machine language).

If you understand the details of how compilers work and what compilers do you
will realize that the most important thing that compilers do is translating
high level abstractions into lower level implementations.

Yes.

IL is at a higher level than the x86 instruction set.
Yes.
>
>The most important and difficult one of these tasks is translating from high
level control flow statements into low level control flow statements. Because
the jitter does not do this, calling it a compiler is a stretch. It would be
much more accurate to call it an assembler.

I thought the process of compiling high level compound statements
to assembler/machine code was something any CS student learned, while
writing a good optimizing compiler was an art.

Do you consider what we call a Fortran 66 compiler for a
Fortran 66 assembler because it does not have block if
statements ?
It still would translate high level mathematical expressions into their
equivalent low level sequence of operations, so it would still be a compiler. If
a source language lacks both of the two key high level abstractions typically
associated with 3GL languages (high level control flow and high level
expression processing), then the "high level" language is not high enough to
accurately call it a 3GL.

One of my inventions that I never completed, because it would not likely be
patentable and because its potential market kept shrinking, was a language that
I called 2point5GL. This language embedded the {if-then-else, do-while, and
while} control flow constructs from the "C" programming language into any
assembly language.

As you can tell from its name, it was not high enough to call it a 3GL (high
level language), neither was it low enough to call it a 2GL, (assembly level
language).
>
It does not make much sense to me to define compilers by
requiring the source language to have certain high level
control flow statements.
You have to have some way to delineate it; otherwise every assembler would be a
compiler, and common usage conventions clearly distinguish between compilers and
assemblers. The jitter is a very high level assembler.
>
>Since it is based on an improvement to things such as the Java just in time
compiler, calling the .NET feature a just in time compiler is merely an
artifact of the development history rather than an accurate denotation.

Language is an artifact of history.
> It would be like thinking
that JavaScript is a script based on Java.

This is actually a very good example. But not quite as you think it is.

If you use the term "JavaScript" then everybody knows what you
are talking about.

If you use the term "LiveScript" or "ECMAScript", then far
fewer people understand you.

Netscape gave it the name JavaScript many years ago. It has
been generally accepted.

Arne

Dec 24 '06 #73
Peter Olcott <No****@SeeScreen.comwrote:
The most important and difficult one of these tasks is translating from high
level control flow statements into low level control flow statements. Because
the jitter does not do this, calling it a compiler is a stretch. It would be
much more accurate to call it an assembler.
And as Wikipedia states, an assembler is a primitive compiler. I would
say a JIT is significantly more than an assembler though.

And significantly less than a compiler, closer to an assembler than a compiler.
That makes no sense given the definition of compiler includes assembler
(according to the quote).
Note that the JIT *does* need to deal with some higher level control
flow statements than the JIT understands - "switch" is a good example
of this.

This can be translated into a jump table quite easily. It does not require
anything close to the level of difficulty of translating nested compound
conditional control flow statements into their equivalent jump code.
Be that as it may, it's still a higher level control flow in IL than in
native code. Why should the difficulty level decide whether or not
something should be called a compiler?

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 24 '06 #74
Peter Olcott <No****@SeeScreen.comwrote:
It does not make much sense to me to define compilers by
requiring the source language to have certain high level
control flow statements.

You have to have some way to delineate it; otherwise every assembler would be a
compiler, and common usage conventions clearly distinguish between compilers and
assemblers. The jitter is a very high level assembler.
I find it amusing that *now* you're appealing to "common usage
conventions" despite the fact that in common usage the JIT clearly *is*
regarded as a compiler, and your arbitrary delineation based on control
flow is anything but common.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Dec 24 '06 #75
"Peter Olcott" <No****@SeeScreen.comwrote:
>The most important and difficult one of these tasks is translating from high
level control flow statements into low level control flow statements.
No way, not at all. That one's a trivial transformation. Harder parts
are type inference and analysis, name/type resolution and binding,
single assignment translation and register colouring.

--
Lucian
Dec 25 '06 #76

"Lucian Wischik" <lu***@wischik.comwrote in message
news:i4********************************@4ax.com...
"Peter Olcott" <No****@SeeScreen.comwrote:
>>The most important and difficult one of these tasks is translating from high
level control flow statements into low level control flow statements.

No way, not at all. That one's a trivial transformation. Harder parts
are type inference and analysis, name/type resolution and binding,
single assignment translation and register colouring.
Great, can I hire you for a trivial amount of money to build a Yacc-based parser
that recognizes the basic "C" control flow constructs of {if-then-else, while,
and do-while} and produces simple bytecode-based jump code that can be used by
an interpreter?

This interpreter will not even have to process arithmetic expressions; it only
needs to process the case that you labeled as trivial. This includes arbitrarily
nested conditional statements and arbitrarily complex compound conditional
expressions. Each element of the compound conditional expression only needs to
compare pairs of <int> types.
>
--
Lucian

Dec 25 '06 #77
"Peter Olcott" <No****@SeeScreen.comwrote:
>Great can I hire you for a trivial amount of money to build a Yacc based parser
that recognizes the basic "C" control flow constructs of {if-then-else, while,
and do-while} and produces simple bytecode based jump code that can be used by
an interpreter?
Here, I had 30 minutes free after our Christmas turkey, so you can have
it for free. Being a 30-minute free piece of code, it doesn't have
comments or error-checking. I'm assuming that yacc has already turned
the input language into an AST. I use "hcode" for the high-level AST,
which has the control-flow constructs you wanted, and "mc" for the
bytecode-based jump code. I wrote it in F#, a functional .NET
language, since it seemed most appropriate and you can easily interop
from C#.

type hcode = SEQUENCE of hcode*hcode
| IF of string*hcode*hcode
| WHILE of string*hcode
| DOWHILE of hcode*string
| ATOM of string

type mc = MATOM of string
| MCOMP of string
| MIFFALSEJUMP of string
| MIFTRUEJUMP of string
| MJUMP of string
let fcount = ref 0
let freshstring = fun (s:string) -> let r = !fcount in fcount := r+1; s^"_"^r.ToString()

let compile : hcode -> mc array = fun code ->
    let gend : (string list * mc * string list) array ref = ref [| |] // the generated code
    let append = fun (labstart,mcode,labend) ->
        gend := Array.append !gend [|(labstart,mcode,labend)|]
    let rec subcompile = fun (labstart,code,labend) ->
        match code with
        | ATOM(atom) -> append(labstart,MATOM(atom),labend)
        | SEQUENCE(c1,c2) -> subcompile(labstart,c1,[]);
                             subcompile([],c2,labend)
        | IF(test,cif,celse) ->
            let elsebranch,after =
                freshstring("else_"^test), freshstring("endif_"^test)
            append(labstart,MCOMP(test),[]);
            append([],MIFFALSEJUMP(elsebranch),[]);
            subcompile([],cif,[]);
            append([],MJUMP(after),[]);
            subcompile([elsebranch],celse,after::labend)
        | WHILE(test,body) ->
            let start,after =
                freshstring("while_"^test), freshstring("endwhile_"^test)
            append(start::labstart,MCOMP(test),[]);
            append([],MIFFALSEJUMP(after),[]);
            subcompile([],body,[]);
            append([],MJUMP(start),after::labend)
        | DOWHILE(body,test) ->
            let start = freshstring("do_"^test)
            subcompile(start::labstart,body,[]); // "start" must label the body's first instruction
            append([],MCOMP(test),[]);
            append([],MIFTRUEJUMP(start),labend)
    let progstart,progend = freshstring("mainstart"), freshstring("mainend")
    subcompile([progstart],code,[progend])
    // now put strings into a map
    let labs : (Map<string,int>) ref = ref (Map.Empty())
    for i = 0 to Array.length(!gend)-1 do
        let (labstart,code,labend) = (!gend).(i)
        let _ = List.map (fun s -> labs := (!labs).Add(s,i)) labstart
        let _ = List.map (fun s -> labs := (!labs).Add(s,i+1)) labend
        ()
    done
    // now resolve the names
    let resolve = fun (labstart,mcode,labend) ->
        match mcode with
        | MATOM(s) -> MATOM(s)
        | MCOMP(s) -> MCOMP(s)
        | MIFFALSEJUMP(s) -> MIFFALSEJUMP((!labs).[s].ToString())
        | MIFTRUEJUMP(s) -> MIFTRUEJUMP((!labs).[s].ToString())
        | MJUMP(s) -> MJUMP((!labs).[s].ToString())
    // and that's it!
    Array.map resolve (!gend)
Degenerate test case: if you give it the input AST
IF("conda",ATOM "atom1", SEQUENCE(ATOM "atom2", ATOM "atom3"))
then it gives as output
0: CMP conda
1: IFFALSEJUMP 4
2: atom1
3: JUMP 6
4: atom2
5: atom3
PS. thanks for your kind offer to hire me, but I only just joined the
VB compiler team recently and it wouldn't be appropriate for me to
moonlight!

--
Lucian
Dec 26 '06 #78
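The listing above is already executable jump code; the interpreter half that Peter asked for is the easy part. A minimal C++ sketch follows (an illustration, not code from the thread: the opcode names come from the listing above, everything else — `Instr`, `run`, the trace mechanism — is invented for this example):

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// One instruction of the resolved bytecode. Opcode names follow the
// listing in Lucian's post; the struct layout is an assumption.
enum class Op { Atom, Cmp, IfFalseJump, IfTrueJump, Jump };

struct Instr {
    Op op;
    std::string arg;   // atom name (Atom) or condition name (Cmp)
    int target = -1;   // destination index for the jump opcodes
};

// Executes the program, asking `eval` for the truth value of each
// condition, and returns the atoms "executed", in order. Falling off
// the end of the instruction array terminates the program.
std::vector<std::string> run(const std::vector<Instr>& prog,
                             const std::function<bool(const std::string&)>& eval) {
    std::vector<std::string> trace;
    bool flag = false;                       // result of the most recent CMP
    for (std::size_t pc = 0; pc < prog.size(); ) {
        const Instr& i = prog[pc];
        switch (i.op) {
            case Op::Atom:        trace.push_back(i.arg); ++pc;          break;
            case Op::Cmp:         flag = eval(i.arg);     ++pc;          break;
            case Op::IfFalseJump: pc = flag ? pc + 1 : (std::size_t)i.target; break;
            case Op::IfTrueJump:  pc = flag ? (std::size_t)i.target : pc + 1; break;
            case Op::Jump:        pc = (std::size_t)i.target;            break;
        }
    }
    return trace;
}
```

Feeding it the six-instruction listing from the degenerate test case, with `conda` evaluating false, executes atom2 then atom3; with `conda` true, only atom1.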

"Lucian Wischik" <lu***@wischik.comwrote in message
news:hr********************************@4ax.com...
"Peter Olcott" <No****@SeeScreen.comwrote:
>>Great can I hire you for a trivial amount of money to build a Yacc based
parser
that recognizes the basic "C" control flow constructs of {if-then-else, while,
and do-while} and produces simple bytecode based jump code that can be used by
an interpreter?

Here, I had 30 minutes free after our christmas turkey so you can have
it for free. Being a 30 minute free piece of code it doesn't have
comments or error-checking. I'm assuming that yacc has already turned
the input language into an AST. I use "hcode" for the high-level AST
which has the control-flow constructs you wanted, and "mc" for the
bytecode based jump code. I wrote it in F#, a functional .net
language, since it seemed most appropriate and you can easily interop
from C#.
I was flabbergasted by the degree of productivity that you seem to have
demonstrated here. Could you point me in the direction where I can learn this
degree of productivity in compiler construction? I want to be able to do
something comparable to what you have done, and preferably in a native code
language such as C++.
Trimmed...

Dec 27 '06 #79
I see now that the key to greatly enhanced productivity in compiler construction
is the AST (Abstract Syntax Tree). Yacc could build these relatively easily, and
once the AST is built, code generation becomes much simpler.

"Lucian Wischik" <lu***@wischik.comwrote in message
news:hr********************************@4ax.com...
"Peter Olcott" <No****@SeeScreen.comwrote:
>>Great can I hire you for a trivial amount of money to build a Yacc based
parser
that recognizes the basic "C" control flow constructs of {if-then-else, while,
and do-while} and produces simple bytecode based jump code that can be used by
an interpreter?

Here, I had 30 minutes free after our christmas turkey so you can have
it for free. Being a 30 minute free piece of code it doesn't have
comments or error-checking. I'm assuming that yacc has already turned
the input language into an AST. I use "hcode" for the high-level AST
which has the control-flow constructs you wanted, and "mc" for the
bytecode based jump code. I wrote it in F#, a functional .net
language, since it seemed most appropriate and you can easily interop
from C#.

Trimmed...

Dec 28 '06 #80
"Peter Olcott" <No****@SeeScreen.comwrote:
>I was flabbergasted by the degree of productivity that your seem to have
demonstrated here. Could you point me in the direction where I can learn this
degree of productivity in compiler construction? I want to be able to do
something comparable to what you have done, and preferably in a native code
language such as C++.
As you said, ASTs are one part of it.

Something that C# people won't like to hear is that functional
languages (F#, caml, ML) are many times better than C# for
manipulating ASTs. In a functional language you'll use about 1/5th as
many lines to accomplish the same thing. And it has far fewer bugs.
And you'll code it up at least five times as fast.

Something that you, Peter, won't like to hear is that garbage
collection is the right way to do a compiler. It increases your
productivity a lot, and it ends up performing faster than whatever
custom allocators+deallocators you've tried to write yourself in C++.

--
Lucian
Dec 28 '06 #81
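For readers determined to stay in native C++, the language features Lucian is crediting — discriminated unions and pattern matching — can be roughly approximated with C++17's std::variant and std::visit. A small illustrative sketch (all names invented; this is not from the thread, and it is noticeably wordier than the F# equivalent, which is rather Lucian's point):

```cpp
#include <memory>
#include <string>
#include <type_traits>
#include <variant>

// A two-case "discriminated union": a leaf atom, or a sequence node.
struct Atom { std::string name; };
struct SeqNode;
using Expr = std::variant<Atom, std::shared_ptr<SeqNode>>;
struct SeqNode { Expr first, second; };

// Count atoms in a tree. std::visit plays the role of F#'s `match`:
// one branch per alternative, checked at compile time.
int countAtoms(const Expr& e) {
    return std::visit([](const auto& v) -> int {
        using T = std::decay_t<decltype(v)>;
        if constexpr (std::is_same_v<T, Atom>)
            return 1;                                            // leaf case
        else
            return countAtoms(v->first) + countAtoms(v->second); // recurse
    }, e);
}
```

The shared_ptr indirection is needed because a variant cannot directly contain the type being defined; F# handles that recursion implicitly, which is one reason the functional version is so much shorter.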

"Lucian Wischik" <lu***@wischik.comwrote in message
news:40********************************@4ax.com...
"Peter Olcott" <No****@SeeScreen.comwrote:
>>I was flabbergasted by the degree of productivity that your seem to have
demonstrated here. Could you point me in the direction where I can learn this
degree of productivity in compiler construction? I want to be able to do
something comparable to what you have done, and preferably in a native code
language such as C++.

As you said, ASTs are one part of it.

Something that C# people won't like to hear is that functional
languages (F#, caml, ML) are many times better than C# for
manipulating ASTs. In a functional language you'll use about 1/5th as
many lines to accomplish the same thing. And it has far fewer bugs.
And you'll code it up at least five times as fast.

Something that you, Peter, won't like to hear is that garbage
collection is the right way to do a compiler. It increases your
productivity a lot, and it ends up performing faster than whatever
custom allocators+deallocators you've tried to write yourself in C++.

--
Lucian
I must have the commercial project complete within a maximum of six months, and
that schedule already assumes 90-hour work weeks. The project is currently
written in Visual C++ 6.0, and I know C++ quite well. All that I need to end up
with is a subset of the "C" programming language as applied to a specific
infrastructure.
Dec 28 '06 #82

This discussion thread is closed

Replies have been disabled for this discussion.
