Explain this about threads

Jon Slaughter

"Instead of just waiting for its time slice to expire, a thread can block
each time it initiates a time-consuming activity in another thread until the
activity finishes. This is better than spinning in a polling loop waiting
for completion because it allows other threads to run sooner than they would
if the system had to rely solely on expiration of a time slice to turn its
attention to some other thread."

I don't get the "a thread can block each time..." part. What does it mean by
blocking? Does it mean that if thread B needs something from thread A, then
thread A stops thread B from running until it's finished, but does not interfere
with some other thread C?

Thanks,

Jon

Sep 22 '07 #1

Peter Duniho

Jon Slaughter wrote:

"Instead of just waiting for its time slice to expire, a thread can block
each time it initiates a time-consuming activity in another thread until the
activity finishes. This is better than spinning in a polling loop waiting
for completion because it allows other threads to run sooner than they would
if the system had to rely solely on expiration of a time slice to turn its
attention to some other thread."

Without the context of the quote, it's hard to know for sure the details
of the scenario being discussed in the quote. However...

Generally speaking, a thread is runnable or not. If it is not, it will
"block". There are a variety of things that can cause a thread to
block, but they generally fall into two categories: waiting for some
resource; and waiting explicitly (i.e. calling Sleep()).

A thread that becomes unrunnable will immediately yield its current
timeslice, allowing some other runnable thread to start executing. Unless
a thread becomes unrunnable, either because it explicitly sleeps or
because it makes some function call that involves having to wait on some
resource (for example, a synchronization object or some sort of i/o), it
will continue to execute for the whole of its timeslice.

So, in the quote, they appear to be explaining that polling a resource
is much worse than allowing the operating system to block the thread
until the resource is available, because polling causes that thread to
consume its entire timeslice, rather than allowing other threads to run
during that time.

Assuming the other threads are of the same priority, they will
eventually get some time. It's just that you can waste a lot of time
executing a thread that uses up its entire timeslice without
accomplishing any actual work. This is especially true if the other
threads are better behaved and use blocking techniques to deal with
resource acquisition, since those threads _won't_ generally use their
entire timeslice. This results in the one thread that's not actually
doing anything useful winding up getting the lion's share of the CPU
time, which is exactly the opposite of what you normally would want.
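
To make the contrast concrete, here is a minimal C# sketch of the two ways of waiting for work done on another thread. The names WaitDemo, WaitByPolling and WaitByBlocking are made up for illustration, and the "time-consuming activity" is just a Sleep standing in for real work. The polling version stays runnable and re-checks a flag for its whole timeslice; the blocking version waits on an event, so the scheduler does not run it again until the event is signaled.

```csharp
using System;
using System.Threading;

public static class WaitDemo
{
    // Flag written by the worker and read by the polling loop.
    private static volatile bool _done;

    // Polling: the caller stays runnable and burns CPU re-checking the flag.
    // Returns how many times the flag was checked before the work finished.
    public static long WaitByPolling(int workMs)
    {
        _done = false;
        var worker = new Thread(() => { Thread.Sleep(workMs); _done = true; });
        worker.Start();

        long checks = 0;
        do { checks++; } while (!_done);   // busy loop, no useful work done

        worker.Join();
        return checks;
    }

    // Blocking: the caller becomes unrunnable inside WaitOne() and uses no
    // CPU until the worker signals the event.
    public static bool WaitByBlocking(int workMs)
    {
        using (var finished = new ManualResetEvent(false))
        {
            var worker = new Thread(() => { Thread.Sleep(workMs); finished.Set(); });
            worker.Start();

            bool signaled = finished.WaitOne();   // blocks here

            worker.Join();
            return signaled;
        }
    }
}
```

On a typical machine the count returned by WaitByPolling runs into the millions for a wait of a few tens of milliseconds, which is a rough measure of the CPU time that other threads could have used instead.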

So, with that background, your specific question:

I don't get the "a thread can block each time..." part. What does it mean by
blocking? Does it mean that if thread B needs something from thread A, then
thread A stops thread B from running until it's finished, but does not interfere
with some other thread C?

Thread A doesn't stop thread B explicitly, no. But assuming thread B
uses a blocking technique to wait on a resource being held by thread A,
thread B would be blocked by thread A implicitly. And importantly,
thread B would not be run at all until the resource was released by
thread A.

This means that as long as thread B is blocked, thread A and thread C
can share the CPU without having thread B using any CPU time. Assuming
you just have the three threads, then basically thread A and thread C
get their usual share of the CPU, plus they both get to use the time
thread B otherwise would have used.

For example, let's say the timeslice is one second. Then without any
blocking and with each thread consuming its entire timeslice, in a
three-second period, each thread would run for one continuous second.

Now, if thread B needs something thread A has, it can either poll for
the resource, continuing to use one continuous second of each three-second
period, or it can block. If it uses some blocking mechanism,
then in a single three-second period, thread B will use ZERO CPU time,
while threads A and C will each use, on average, 1.5 seconds (in reality,
for any given three-second period, one of those threads will get two
one-second timeslices while the other will get one; over a six-second
period, though, both threads will each get three one-second timeslices).
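
The arithmetic in this example can be checked with a toy round-robin model. This is only a sketch under the same simplifying assumptions as the example (equal priorities, fixed one-second slices, no context-switch cost); RoundRobinDemo and Run are hypothetical names, and a blocked thread is modeled simply by leaving it out of the runnable list.

```csharp
using System;
using System.Collections.Generic;

public static class RoundRobinDemo
{
    // Hands out 'totalSlices' one-second timeslices in strict rotation to the
    // threads named in 'runnable' and returns how many slices each one got.
    public static Dictionary<string, int> Run(string[] runnable, int totalSlices)
    {
        var slices = new Dictionary<string, int>();
        foreach (string name in runnable)
            slices[name] = 0;

        for (int i = 0; i < totalSlices; i++)
            slices[runnable[i % runnable.Length]]++;

        return slices;
    }
}
```

With all three threads runnable (B polling), Run(new[] { "A", "B", "C" }, 3) gives each thread one slice. With B blocked, Run(new[] { "A", "C" }, 3) gives A two slices and C one, and over six slices each gets three, which is the 1.5-second average per three-second period.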

Of course, if thread A has to block as well, then thread C gets even
more CPU time, since it's the only one runnable.

This is obviously an oversimplification: timeslices aren't ever nearly
as long as one second on Windows, you never have only three threads, and
the above completely ignores the overhead of context switching between
threads. But it does illustrate the basic point.

The bottom line here is that polling is bad. Really bad. It takes the
one thread that actually has no work to do, and causes it to use the
most CPU time out of any thread running on the system. Polling is
almost always counter-productive. It's almost never the right way to
solve a problem.

Blocking, on the other hand, is a very nice way to solve a problem. The
operating system almost always has some mechanism for allowing a thread
to sit in an unrunnable state until whatever resource it actually needs
is available. After all, the OS is in charge of those resources, so it
naturally knows when they become available. So by using a blocking
technique, a thread that cannot do any useful work with the CPU is
never allowed to use the CPU, which allows for much more efficient use
of the CPU and much greater net throughput.

As with everything, there are exceptions to the general rule. In very
specific situations, a "spin wait" can improve performance. That is, if
you know that a particular resource will for sure become available
within some period of time less than your timeslice, it can be better to
spin wait for it, because if the thread blocks it could be quite a while
before it gets a chance to run again.

Once the resource it's waiting on becomes available, it still has to
wait its turn in the round-robin thread scheduling to get CPU time
again. In addition, there is of course the overhead in switching
between threads. So if you need a particular thread to be very
responsive AND (and this is very important) you know for sure it won't
have to wait longer than the timeslice, spinning can work well.

On this last point: if the thread will have to wait longer than the
timeslice for the resource, then spinning doesn't do any good at all.
The thread _will_ be preempted; there is no way to prevent that. So all
that spinning in that case does is waste CPU time that could be used by
all the other threads.

The spinning thread will still get interrupted, and put at the end of
the round-robin list so that it has to wait for all the other threads to
get their chance to execute. In fact, because when other threads are
kept from running longer, they may wind up being able to do more work
when they finally do get to run, and because the spinning thread itself
just wasted a bunch of time doing nothing, the net result of having a
thread spin wait like that often is much _reduced_ performance even for
the spinning thread, never mind the issue of overall system throughput I
mentioned above.

So, even in these very specific scenarios where a spin wait may help,
you have to be very careful. If you aren't an expert in managing thread
scheduling, you can easily make your program a lot worse by attempting a
technique like that.
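
One common way to limit the damage is to cap the spin and fall back to a blocking wait once the cap is reached. The following is a minimal sketch of that idea, not a tuned implementation: HybridSignal is a hypothetical class, the spin budget is an arbitrary parameter the caller would have to measure, and the spin phase is skipped entirely on a single-processor machine, where the thread that would set the flag cannot run while the waiter spins.

```csharp
using System;
using System.Threading;

public sealed class HybridSignal : IDisposable
{
    private volatile bool _isSet;
    private readonly ManualResetEvent _event = new ManualResetEvent(false);

    // Called by the thread that makes the resource available.
    public void Set()
    {
        _isSet = true;
        _event.Set();
    }

    // Spins for at most 'spinAttempts' short bursts, then blocks.
    // Returns true if the signal was seen without blocking, false otherwise.
    public bool Wait(int spinAttempts)
    {
        if (_isSet) return true;

        // Spinning only makes sense when another processor can run the setter.
        int attempts = Environment.ProcessorCount > 1 ? spinAttempts : 0;
        for (int i = 0; i < attempts; i++)
        {
            Thread.SpinWait(20);          // short busy pause, stays in user mode
            if (_isSet) return true;
        }

        _event.WaitOne();                 // give up the CPU until signaled
        return false;
    }

    public void Dispose()
    {
        _event.Dispose();
    }
}
```

If the resource usually arrives within the spin budget, the waiter avoids the cost of blocking and being rescheduled; if it does not, the wasted CPU time is bounded by the budget rather than by the timeslice. Later versions of .NET ship ManualResetEventSlim, which follows the same spin-then-block pattern.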

Pete

Sep 22 '07 #2

Rick Lones

Jon Slaughter wrote:

"Instead of just waiting for its time slice to expire, a thread can block
each time it initiates a time-consuming activity in another thread until the
activity finishes. This is better than spinning in a polling loop waiting
for completion because it allows other threads to run sooner than they would
if the system had to rely solely on expiration of a time slice to turn its
attention to some other thread."

I don't get the "a thread can block each time..." part. What does it mean by
blocking? Does it mean that if thread B needs something from thread A, then
thread A stops thread B from running until it's finished, but does not interfere
with some other thread C?

More like thread B stops itself from running until A has made some resource
available or raised some other kind of event. There is usually no reason to sit
and spin in the meantime, so normally the remainder of B's time slice would be
yielded to the scheduler in hopes that some other task has a use for the CPU
resource. B is now said to be "blocked" on some event and will not be
rescheduled until the event occurs. It's less exotic than it may sound - if A
is the operating system, e.g., this happens every time your task B does a
synchronous wait for any OS resource, for example Console.ReadLine().
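
The blocked state can be observed from managed code. The sketch below is an illustration only: BlockedStateDemo and ObserveBlockedThread are made-up names, and Thread.ThreadState is documented as a debugging aid, not something to synchronize on. Thread B blocks on an event; while it waits, the runtime reports it in the WaitSleepJoin state, and it only becomes runnable again once the event is set.

```csharp
using System;
using System.Threading;

public static class BlockedStateDemo
{
    // Starts a thread that blocks on an event and reports whether it was
    // seen in the WaitSleepJoin state before the event was signaled.
    public static bool ObserveBlockedThread()
    {
        using (var signal = new ManualResetEvent(false))
        {
            var b = new Thread(() => signal.WaitOne());   // B blocks itself here
            b.Start();

            // Give B a moment to reach WaitOne(); check its state a few times.
            bool sawWaiting = false;
            for (int i = 0; i < 200 && !sawWaiting; i++)
            {
                sawWaiting = (b.ThreadState & ThreadState.WaitSleepJoin) != 0;
                if (!sawWaiting) Thread.Sleep(5);
            }

            signal.Set();   // the event occurs, so B can be scheduled again
            b.Join();
            return sawWaiting;
        }
    }
}
```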

HTH,
-rick-

Sep 22 '07 #3

Jon Slaughter

"Rick Lones" <Wr******@YcharterZ.net> wrote in message
news:q_***********@newsfe05.lga...

More like thread B stops itself from running until A has made some
resource available or raised some other kind of event. There is usually
no reason to sit and spin in the meantime, so normally the remainder of
B's time slice would be yielded to the scheduler in hopes that some other
task has a use for the CPU resource. B is now said to be
"blocked" on some event and will not be rescheduled until the event
occurs. It's less exotic than it may sound - if A is the operating
system, e.g., this happens every time your task B does a synchronous wait
for any OS resource, for example Console.ReadLine().

Ok, but I don't understand the "blocked on some event and will not be
rescheduled until the event occurs". How does that happen? I can't picture
how the machinery is set up to do this.

If we use your example of ReadLine(), then essentially what happens is that
eventually the call works its way down to a hardware driver. But this
requires a task switch somewhere to go from user mode to kernel mode. Now
wouldn't the scheduler try to "revive" the user-mode code that was task
switched, because it wouldn't know that it's waiting for the kernel-mode code
to finish?

Or is there some information in a task switch, like a lock or something, that
tells the scheduler not to revive a process, and the kernel-mode code would
determine that?

Hence I suppose in the task switch state one might have something like
"IsBlocked". When kernel mode is entered, if the call is asynchronous it might
set IsBlocked momentarily but then release it. If it's synchronous then it
would set IsBlocked until it is completely finished.

Am I way off? ;) It's just hard for me to understand how the internal
machinery accomplishes this. Maybe it's as simple as a lock, though?

Thanks,
Jon

Sep 22 '07 #4

Willy Denoyette [MVP]

"Jon Slaughter" <Jo***********@Hotmail.com> wrote in message
news:av****************@newssvr21.news.prodigy.net ...

"Instead of just waiting for its time slice to expire, a thread can block
each time it initiates a time-consuming activity in another thread until
the activity finishes. This is better than spinning in a polling loop
waiting for completion because it allows other threads to run sooner than
they would if the system had to rely solely on expiration of a time slice
to turn its attention to some other thread."

I don't get the "a thread can block each time..." part. What does it mean by
blocking? Does it mean that if thread B needs something from thread A, then
thread A stops thread B from running until it's finished, but does not interfere
with some other thread C?

Thanks,

Jon

Adding to what others have said in this thread:
1) You should never SpinWait on a single-processor machine; doing so
prevents other threads in the system from making progress (unless this is
exactly what you are looking for).
Waiting in a SpinWait loop for an event (of whatever kind) from another
thread prevents that other thread from signaling the event, so basically
you are wasting CPU cycles for nothing.
2) Define the count such that you spin for less than the time needed to
perform a transition to the kernel and back when waiting for an event.
Spinning for a longer period is just a waste of CPU cycles; you had better
give up your quantum by calling Sleep(1) or PInvoke the Kernel32
"SwitchToThread" API in that case.

Willy.

Sep 22 '07 #5

Mads Bondo Dydensborg

Willy Denoyette [MVP] wrote:

Spinning for a longer period is just a waste of CPU cycles, you better
give up your quantum by calling Sleep(1) or PInvoke the Kernel32
"SwitchToThread" API in that case.

Actually, IIRC, sleep(0) is enough to get the scheduler going.

Regards,

Mads

--
Med venlig hilsen/Regards

Systemudvikler/Systemsdeveloper cand.scient.dat, Ph.d., Mads Bondo
Dydensborg
Dansk BiblioteksCenter A/S, Tempovej 7-11, 2750 Ballerup, Tlf. +45 44 86 77
34

Sep 22 '07 #6

Jon Slaughter

"Willy Denoyette [MVP]" <wi*************@telenet.be> wrote in message
news:%2****************@TK2MSFTNGP03.phx.gbl...

Adding to what others have said in this thread:
1) You should never SpinWait on a single-processor machine; doing so
prevents other threads in the system from making progress (unless this is
exactly what you are looking for).
Waiting in a SpinWait loop for an event (of whatever kind) from another
thread prevents that other thread from signaling the event, so basically
you are wasting CPU cycles for nothing.
2) Define the count such that you spin for less than the time needed to
perform a transition to the kernel and back when waiting for an event.
Spinning for a longer period is just a waste of CPU cycles; you had better
give up your quantum by calling Sleep(1) or PInvoke the Kernel32
"SwitchToThread" API in that case.

Ok, tell me then how I can do clocked I/O in a timely fashion without
spinning?

Let's suppose I have to slow down the rate because it is simply too fast for
whatever device I'm communicating with if I do not insert delays.

I would be really interested in knowing how I can do this without using
spinwaits because it is a problem that I'm having. It seems like it's a
necessary evil in my case.

Basically I'm trying to do synchronous communication with the parallel port.
I have the ability to use in and out, which are supplied by a kernel-mode
driver and DLL wrapper. So if I want to output some data to the port I can
do "out(data)" and it will output it.

If I have something like

for(int i = 0; i < data.Length; i++)
    out(data[i]);

then this will run at about 100 kHz or so (on my machine). Now what if I need to
slow it down to 20 kHz? How can I do this without using spin waits but still
do it in a timely fashion? IGNORE ANY DELAYS FROM TASK SWITCHING! I cannot
control the delays that other processes and task switching introduce, so
since it's beyond my control I have to ignore it. What's important is the
upper bound that I can get and the average, not the lower bound. So when
I say it needs to run at 20 kHz, I mean that as an upper bound.

for(int i = 0; i < data.Length; i++)
{
    out(data[i]);
    Thread.SpinWait(X);
}

Where X is something that slows this down enough to run at 20 kHz. I can
figure out X on average by doing some profiling, i.e., if I know how long
out takes and how long Thread.SpinWait(1) takes (on average) then I can get
an approximate value for X.

But how can I do this without using spin waits?

Sep 22 '07 #7

Willy Denoyette [MVP]

"Mads Bondo Dydensborg" <mb*@dbc.dk> wrote in message
news:un*************@TK2MSFTNGP06.phx.gbl...

Actually, IIRC, sleep(0) is enough to get the scheduler going.

No, it's not. Sleep(0) will not relinquish its timeslice if there are no
"equal priority" ready threads to run; please read the Sleep API description
on MSDN. That means that specifying 0 as the sleep count might lead to
starvation of lower-priority threads.
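
For reference, the three ways of giving up the CPU mentioned in this exchange look like this from C#. This is a sketch, and the wrapper names (YieldDemo and its methods) are made up. The P/Invoke declaration matches the Win32 prototype BOOL SwitchToThread(void); Thread.Yield() is the managed equivalent that was added in .NET Framework 4, and is used here as the fallback on non-Windows platforms.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

public static class YieldDemo
{
    // Win32 SwitchToThread: yields to another thread that is ready to run on
    // the current processor. Returns false if there was nothing to switch to.
    [DllImport("kernel32.dll")]
    private static extern bool SwitchToThread();

    // Sleep(0): as described above, yields only if a thread of equal
    // priority is ready; otherwise it returns immediately.
    public static void YieldToEqualPriority()
    {
        Thread.Sleep(0);
    }

    // Sleep(1): the thread actually becomes unrunnable for a short time,
    // so lower-priority threads get a chance to run as well.
    public static void GiveUpQuantum()
    {
        Thread.Sleep(1);
    }

    // SwitchToThread on Windows, Thread.Yield() elsewhere.
    public static bool YieldToAnyReadyThread()
    {
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
            return SwitchToThread();
        return Thread.Yield();
    }
}
```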

Willy.

Sep 22 '07 #8

Willy Denoyette [MVP]

"Jon Slaughter" <Jo***********@Hotmail.com> wrote in message
news:In****************@newssvr21.news.prodigy.net ...

[...]
But how can I do this without using spin waits?

The data transfer rate on a parallel port is a matter of the handshake protocol
between the port and the device; basically it's the device that decides the
(maximum) rate. The exact transfer rates are defined in the IEEE 1284
protocol standards (IEEE 1284.1, 2, 3, 4...) and the modes (like Compatible,
Nibble, Byte and ECP mode) supported by the parallel port peripheral
controller chips. All these kinds of protocols (parallel port, serial ports,
networks, USB, other peripheral protocols) were invented precisely to
control the signaling rates between the system and the device; the PC
hardware and the Windows OS are simply not designed for this, as they are not
real-time capable.
Now, if you don't have a device connected that negotiates or respects one of
the IEEE 1284 protocol modes, you have a problem. You can't accurately time
the I/O transfer rate; all you can do is insert waits in your code (in user mode
or in a driver) and in that way define a top-level rate, but no lower
rate!
The easiest way (though still a dirty one) is to insert delays like
you do in your code. This is not a problem for small bursts (say a few hundred
bytes) on a single-processor box, or a few KB on multi-cores, assuming that
you don't further peg the CPU between bursts, so that other threads
don't starve.
for(int i = 0; i < data.Length; i++)
{
    out(data[i]);
    Thread.SpinWait(X);
}

Say that the above code uses a SpinWait of 50 µsec (say X = 150000); with a
Length of 200, the entire loop will take at least 10 msec.
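
As a variation on the loop above, the spin can be bounded by a clock instead of a profiled iteration count. This is still a spin wait and has all the drawbacks already discussed; it only removes the need to calibrate X per machine. The sketch assumes a hypothetical output delegate standing in for the driver's out call (out itself is a reserved word in C#), and PacedOutput and Send are made-up names. Each byte is followed by a spin until one full period has passed, so the rate can never exceed maxRateHz, though preemption can make it lower.

```csharp
using System;
using System.Diagnostics;

public static class PacedOutput
{
    // Sends each byte through 'output', then spins until one period
    // (1 / maxRateHz) has elapsed. Returns the total time taken.
    public static TimeSpan Send(byte[] data, Action<byte> output, double maxRateHz)
    {
        // Stopwatch ticks per byte, e.g. 50 microseconds' worth at 20 kHz.
        long ticksPerByte = (long)(Stopwatch.Frequency / maxRateHz);
        Stopwatch clock = Stopwatch.StartNew();

        foreach (byte b in data)
        {
            output(b);

            // Measure the gap from the end of this output so that the
            // spacing between two outputs is never shorter than one period.
            long deadline = clock.ElapsedTicks + ticksPerByte;
            while (clock.ElapsedTicks < deadline)
            {
                // busy wait
            }
        }

        return clock.Elapsed;
    }
}
```

With 200 bytes at 20 kHz this takes at least 200 x 50 µsec = 10 msec, matching the figure above.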

Sep 22 '07 #9

Mads Bondo Dydensborg

Willy Denoyette [MVP] wrote:

No, it's not. Sleep(0) will not relinquish its timeslice if there are no
"equal priority" ready threads to run; please read the Sleep API
description on MSDN. That means that specifying 0 as the sleep count might
lead to starvation of lower-priority threads.

We agree - I was sloppy - if there are no other threads ready to run,
nothing will happen.

Regards,

Mads

--
Med venlig hilsen/Regards

Systemudvikler/Systemsdeveloper cand.scient.dat, Ph.d., Mads Bondo
Dydensborg
Dansk BiblioteksCenter A/S, Tempovej 7-11, 2750 Ballerup, Tlf. +45 44 86 77
34

Sep 22 '07 #10

Jon Slaughter

"Willy Denoyette [MVP]" <wi*************@telenet.be> wrote in message
news:OL**************@TK2MSFTNGP05.phx.gbl...

The data transfer rate on a parallel port is a matter of the handshake
protocol between the port and the device; basically it's the device that
decides the (maximum) rate. The exact transfer rates are defined in the
IEEE 1284 protocol standards (IEEE 1284.1, 2, 3, 4...) and the modes (like
Compatible, Nibble, Byte and ECP mode) supported by the parallel port
peripheral controller chips. All these kinds of protocols (parallel port,
serial ports, networks, USB, other peripheral protocols) were invented
precisely to control the signaling rates between the system and
the device; the PC hardware and the Windows OS are simply not designed for
this, as they are not real-time capable.

No, this is only for ECP and EPP. There is no handshaking and hardware
protocol in SPP which is what I'm using. It is also necessary for me to use
SPP because the device that is attached does not use the same protocol that
EPP/ECP uses.

Now, if you don't have a device connected that negotiates or respects one
of the IEEE 1284 protocol modes, you have a problem. You can't accurately
time the I/O transfer rate; all you can do is insert waits in your code
(in user mode or in a driver) and in that way define a top-level rate, but
no lower rate!
The easiest way (though still a dirty one) is to insert delays like
you do in your code. This is not a problem for small bursts (say a few hundred
bytes) on a single-processor box, or a few KB on multi-cores, assuming
that you don't further peg the CPU between bursts, so that other
threads don't starve.

Well, that's what I'm doing, but I'm trying to find the optimal method. This
is also the method used by most programs that do things similar to what I'm
trying to do.

I think I'm going to write a simple kernel-mode driver that does all the
communications using direct port access (instead of the IOCTL methodology).
It's more of a hack but is probably as fast as I can get it. Of course that
method will cause problems with other drivers and stuff, but I don't have to
worry about that.

I can also use the interrupt to get information on a regular basis, but I'm not
sure how well this will work.

I was thinking that maybe I could use an interrupt and an external
clock that triggers the interrupt very precisely. That would probably
give me a pretty accurate method, but it would probably starve the system
because of all the task switching per clock. I guess I have no choice but
to either use something like DOS or some hardware proxy that can deal with
the latency issues.

Sep 22 '07 #11

Peter Duniho

Jon Slaughter wrote:

[...]
Basically I'm trying to do synchronous communication with the parallel port.
I have the ability to use in and out which is supplied by a kernel mode
driver and dll wrapper.

I guess I'm curious at this point as to why you want to use this kernel
mode driver that provides you direct access to the ports? Windows has a
usable higher-level i/o API that should handle all the i/o timing and
buffering required. I'm not specifically aware of a managed code API,
but the unmanaged parallel port access via CreateFile, etc. ought to
work I would think.

Can you explain why it is that the usual buffered i/o mechanisms aren't
suitable for your needs? I'm not sure it will lead to a specific solution, but
it would at least help us to understand the scenario better.

[...]
Then this will run about 100khz or so(on my machine). Now what if I need to
slow it down to 20khz? How can I do this without using spin waits but still
do it in a timely fashion?

I'm not aware of any non-spin-wait mechanism that will allow you to time
the interval between individual calls to out. The best you can do with
the mechanisms available is to ensure an _average_ data rate and even
there, without some kind of buffering support, you still have the
problem of having to not send data too fast.

But spin-waits are potentially going to cause other problems that will
actually cause your implementation to perform worse. In some cases, it
might actually perform worse than just calling Sleep(1) between each
call to "out" (which itself, as you probably know, would kill
performance if you're looking for a 20khz sending rate).

If you have to use spin-waits, then I suppose you have to. But it would
be helpful to at least make sure you really have to. So far, it's not
clear why you have to (at least, not to me). Spin-waiting is bad enough
that it's definitely a last resort.

Pete

Sep 22 '07 #12

Jon Slaughter

A simple example can be found at

http://www.codeproject.com/csharp/csppleds.asp

I'll be working with PICs instead of LEDs (which are easy and require no
protocol) and LCDs.

BTW, the second part covers the LCD:

http://www.codeproject.com/csharp/cspplcds.asp

As you can see, the code for sending a command to the LCD is

/* Makes the enable pin high and register pin low */
PortAccess.Output(control, 8); Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* Makes the enable pin low for LCD to read its
* data pins and also register pin low */
PortAccess.Output(control, 9); Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* Clears entire display and sets DDRAM address
* 0 in address counter */
PortAccess.Output(data, 1); //Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* Makes the enable pin high and register pin low */
PortAccess.Output(control, 8); Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* Makes the enable pin low for LCD to read its
* data pins and also register pin low */
PortAccess.Output(control, 9); Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* We are setting the interface data length to 8 bits
* with selecting 2-line display and 5 x 7-dot character font.
* Lets turn the display on so we have to send */
PortAccess.Output(data, 56); //Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* Makes the enable pin high and register pin low */
PortAccess.Output(control, 8); Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article

/* Makes the enable pin low for LCD to read its
* data pins and also register pin low */
PortAccess.Output(control, 9); Thread.Sleep(1);
//The delays can be smaller, Check Busy Flag info in the article
}

Which is just outputting and delaying. This command takes at least 10 ms to
complete. This is no big deal for an LCD but huge for programming a PIC.

Using smaller delays results in better performance but requires spin-waits
because there is no way to block for less than 1 ms.

Also, it doesn't matter if the sleep waits are larger than 1 ms because that
just makes it even slower... I mean that it doesn't matter to the LCD (most
likely, unless it has some timeout circuitry).

Jon
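
For reference, a sub-millisecond delay of the kind described above could be sketched in C# roughly as follows. This is an assumption-laden illustration, not code from the article: ShortDelay and SpinFor are hypothetical names, while Stopwatch.GetTimestamp, Stopwatch.Frequency and Thread.SpinWait are standard .NET members. The thread burns CPU for the whole delay, and because it can still be preempted, the requested time is a lower bound rather than a guarantee.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class ShortDelay
{
    // Busy-waits for at least 'microseconds'. The thread never blocks,
    // so it uses a full core for the duration of the delay.
    public static void SpinFor(double microseconds)
    {
        long ticks = (long)(microseconds * Stopwatch.Frequency / 1000000.0);
        long start = Stopwatch.GetTimestamp();
        while (Stopwatch.GetTimestamp() - start < ticks)
        {
            Thread.SpinWait(10); // short pause between timestamp checks
        }
    }
}
```

In the LCD code above, a Thread.Sleep(1) could then be swapped for something like ShortDelay.SpinFor(50); the 50 microseconds is purely illustrative, and the real value has to come from the device's timing requirements.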

Sep 23 '07 #13

Rick Lones

Jon Slaughter wrote:

>>Ok, but I don't understand the "blocked on some event and will not be
rescheduled until the event occurs". How does that happen? I can't
picture how the machinery is setup to do this.
For an example of the type of low-level "machinery" that can be used to
synchronize tasks on resources, see "semaphore". A semaphore can be
viewed as a data structure which represents a resource or event. One key
aspect of a classic OS semaphore structure is a queue to which tasks
awaiting the resource or event can chain their task control blocks.

Ok, these are locks... but who implements them? I can't see how a program
can block itself to task scheduling... except maybe it blocks but then
something else has to unblock.

Whoever is responsible for managing the resource at hand "owns" the
synchronization mechanism; often enough this is the OS or some major
subsystem of same. You are correct that the unblocking is done by
"something else". Typically the something else is a driver, an interrupt
handler, or some external task. Think Producer/Consumer: the Consumer blocks
(via Monitor.Wait()) until the Producer makes available whatever the Consumer
is needing. It is the Producer who must wake up the sleeping Consumer via
Monitor.Pulse(). (I highly recommend Jon Skeet's very lucid explanation of
this mechanism.)
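
The Producer/Consumer arrangement described above can be sketched in C# along these lines. Mailbox, Put and Take are hypothetical names chosen for the illustration; Monitor.Wait and Monitor.Pulse are the real .NET calls. It is a minimal sketch, not a production queue.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class Mailbox<T>
{
    private readonly object gate = new object();
    private readonly Queue<T> items = new Queue<T>();

    // Producer side: add an item and wake one waiting consumer, if any.
    public void Put(T item)
    {
        lock (gate)
        {
            items.Enqueue(item);
            Monitor.Pulse(gate);
        }
    }

    // Consumer side: blocks without spinning until an item is available.
    public T Take()
    {
        lock (gate)
        {
            while (items.Count == 0)
            {
                // Releases the lock and waits; reacquires it after a Pulse.
                Monitor.Wait(gate);
            }
            return items.Dequeue();
        }
    }
}
```

The while loop around Monitor.Wait matters: the condition is rechecked after waking, because another consumer may have taken the item first.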

>>If we use your example of ReadLine() then essentially what happens is
eventually the call works its way down to a hardware driver. But this
requires a task switch somewhere to go from user mode to kernel mode. Now
wouldn't the scheduler try and "revive" the user mode code that was
task switched because it wouldn't know that it's waiting for the kernel
mode code to finish?
But "it" does know - because your task has called ReadLine()! That
announces that your task cannot continue until an Environment.NewLine has
been registered at the keyboard. You have blocked yourself until an I/O
operation completes.

Ok, so you're saying essentially that when a synchronous call is made that
requires a resource, you block yourself? Somehow you say "I'll wait until the
resource is free"... but then something else must unblock. But it would
seem then that to do any type of blocking you have to know explicitly the
resource to block on so whatever else can unblock. Seems more complicated
than just having the other thing block and unblock.
What I mean is that it makes more sense to me for whatever is
controlling the resource to control blocking and unblocking.

Example.

Program:
Hey, Save this file

Resource Handler:
(Internally to scheduler: Block Program,
Save the file,
Return status msg and unblock program)
Ok, I saved it.
------

instead of

Program:
Hey, Save this file
(tell scheduler to block me)

Resource Handler:
(Internally to scheduler:
Save the file,
Return status msg and unblock program)
Ok, I saved it.
I guess though it doesn't quite matter where it does it and maybe it's
actually better for the program to block itself... it just seems like there's
no context there and it could block itself for any reason even if the
resource handler doesn't need to block (which I guess would be asynchronous
comm).

Hmm, I don't want to get too deep into the details of your pseudocode here. You
are doing a lot of rather sophisticated guesswork but your overall model is
maybe a little skewed. One key piece of the picture is that a correctly
implemented synchronization method does not block arbitrarily. If my program
issues ReadLine() and there is already a CRLF in the input buffer the call will
return all bytes up to and including CRLF without blocking - the requested
"resource" was available in this case and the result is just a normal subroutine
call and return sequence. But if I issue ReadLine() and there is no CRLF to be
read I must wait until there is - so the ReadLine() routine will follow a
different path that eventually calls an OS routine which one way or another
causes suspension of my task. Later some other task or OS component will notice
that there is now a CRLF available at the console and so I can be woken up and
allowed to proceed. Note that from the program's point of view it still looks
like the same call/return sequence - the intervening mechanism of waiting for
the resource is effectively transparent unless your program is time-sensitive.
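
The two paths described above (return at once if the data is already there, otherwise wait until someone else delivers it) can be imitated with a counting semaphore. The following C# sketch is only an analogy under stated assumptions: LineSource, Deliver and its ReadLine are hypothetical stand-ins, not how Console.ReadLine() is actually implemented. SemaphoreSlim is the real .NET type.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class LineSource
{
    private readonly Queue<string> buffered = new Queue<string>();
    // Counts how many complete lines are waiting to be read.
    private readonly SemaphoreSlim available = new SemaphoreSlim(0);

    // Called by "some other task" when a complete line has arrived.
    public void Deliver(string line)
    {
        lock (buffered)
        {
            buffered.Enqueue(line);
        }
        available.Release(); // lets one waiting reader proceed
    }

    // Looks like an ordinary call to the caller in both cases.
    public string ReadLine()
    {
        // Returns immediately if a line is already buffered;
        // otherwise the caller waits here until Deliver is called.
        available.Wait();
        lock (buffered)
        {
            return buffered.Dequeue();
        }
    }
}
```

The caller cannot tell from the code which path was taken; only the elapsed time differs, which is the "transparent unless time-sensitive" point.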

>
I think eventually I will take a look at some embedded operating systems
because I'll probably need one in the future but at this point I just want
to write a program to communicate with some devices I have that use some
protocols (ICSP, I2C and SPI). I want it to be general enough so I program
these protocols in a nice way (instead of hard coding them). That way, in
the future, if I want to add another one such as Modbus or RS-232 (emulated
on the parallel port by polling... or just end up using the serial port) I
can without too much trouble.

I don't know what ICSP is, but I would be surprised if you can do either I2C or
SPI from DotNet. Or maybe you are running an embedded version on a micro?
Modbus (at least Modbus-ASCII) on the other hand could be done in managed code
on a PC running any version of Windows.

-rick-

Sep 24 '07 #14

Jon Slaughter

"Rick Lones" <Wr******@YcharterZ.net> wrote in message
news:tJ**************@newsfe02.lga...

Jon Slaughter wrote:

>>>If we use your example of ReadLine() then essentially what happens is
eventually the call works its way down to a hardware driver. But this
requires a task switch somewhere to go from user mode to kernel mode.
Now wouldn't the scheduler try and "revive" the user mode code that was
task switched because it wouldn't know that it's waiting for the
kernel mode code to finish?
But "it" does know - because your task has called ReadLine()! That
announces that your task cannot continue until an Environment.NewLine
has been registered at the keyboard. You have blocked yourself until an
I/O operation completes.

Ok, so you're saying essentially that when a synchronous call is made that
requires a resource, you block yourself? Somehow you say "I'll wait
until the resource is free"... but then something else must unblock. But
it would seem then that to do any type of blocking you have to know
explicitly the resource to block on so whatever else can unblock. Seems more
complicated than just having the other thing block and unblock.
What I mean is that it makes more sense to me for whatever is
controlling the resource to control blocking and unblocking.

Example.

Program:
Hey, Save this file

Resource Handler:
(Internally to scheduler: Block Program,
Save the file,
Return status msg and unblock program)
Ok, I saved it.
------

instead of

Program:
Hey, Save this file
(tell scheduler to block me)

Resource Handler:
(Internally to scheduler:
Save the file,
Return status msg and unblock program)
Ok, I saved it.
I guess though it doesn't quite matter where it does it and maybe it's
actually better for the program to block itself... it just seems like
there's no context there and it could block itself for any reason even if
the resource handler doesn't need to block (which I guess would be
asynchronous comm).

Hmm, I don't want to get too deep into the details of your pseudocode
here. You are doing a lot of rather sophisticated guesswork but your
overall model is maybe a little skewed. One key piece of the picture is
that a correctly implemented synchronization method does not block
arbitrarily. If my program issues ReadLine() and there is already a CRLF
in the input buffer the call will return all bytes up to and including
CRLF without blocking - the requested "resource" was available in this
case and the result is just a normal subroutine call and return sequence.
But if I issue ReadLine() and there is no CRLF to be read I must wait
until there is - so the ReadLine() routine will follow a different path
that eventually calls an OS routine which one way or another causes
suspension of my task. Later some other task or OS component will notice
that there is now a CRLF available at the console and so I can be woken up
and allowed to proceed. Note that from the program's point of view it
still looks like the same call/return sequence - the intervening mechanism
of waiting for the resource is effectively transparent unless your program
is time-sensitive.

I think this is essentially what I mean. I guess there is no real difference
who blocks the thread though because it will get blocked and it would just
be a different internal mechanism but probably have equivalent results.

>>
I think eventually I will take a look at some embedded operating systems
because I'll probably need one in the future but at this point I just
want to write a program to communicate with some devices I have that use
some protocols (ICSP, I2C and SPI). I want it to be general enough so I
program these protocols in a nice way (instead of hard coding them). That
way, in the future, if I want to add another one such as Modbus or
RS-232 (emulated on the parallel port by polling... or just end up using
the serial port) I can without too much trouble.

I don't know what ICSP is, but I would be surprised if you can do either
I2C or SPI from DotNet. Or maybe you are running an embedded version on a
micro? Modbus (at least Modbus-ASCII) on the other hand could be done in
managed code on a PC running any version of Windows.

Well, I'd like to implement them all... but at this point the main one is
ICSP. The real difference is that ICSP is for programming MCUs while the
others are for communication. Since you are programming, it requires more
data lines to control the MCU (such as power and a power-on sequence), but it
is essentially clocked communication and very similar in its overall "look"
to any other clocked protocol such as I2C and SPI.

It is not a complicated protocol though. In fact it's very simple. I just
want to use the best method I can to get the best results I can. I could
easily do this in C# using the kernel mode driver to do "indirect" access to
the port. But here I do not want to bit bang as it seems like a very
inelegant solution (although for all practical purposes it works).

I'll see what happens though. I'm sure learning about kernel mode
programming can't hurt ;)

Thanks,
Jon

Sep 24 '07 #15

Ben Voigt [C++ MVP]

>Yeah, I understand this. I don't understand how B can actually do any
blocking because the only way it could do this is to poll the resource
until it's ready or hook some interrupt. The first case then isn't
blocking and the second isn't available to normal Windows applications.

Ah, but it is. The second case, that is. Normal Windows applications
don't have direct access to the interrupts, no. But they do have access
to methods that allow the OS itself to use the interrupts, which
implicitly provides a mechanism for the application itself to use the
interrupts.

This is, in fact, how a lot of the various i/o methods work.

Just wanted to throw in here, that depending on your registry settings (this
is true by default for workstations), whenever an I/O operation completes,
any thread waiting on that resource gets a dynamic priority boost, which can
have the effect of "interrupting" whatever thread is currently running (or
at least moving to the head of the queue).

http://msdn2.microsoft.com/en-us/library/ms684828.aspx

Sep 24 '07 #16

Ben Voigt [C++ MVP]

"Peter Duniho" <Np*********@NnOwSlPiAnMk.com> wrote in message
news:13*************@corp.supernews.com...

Jon Slaughter wrote:
>Because it uses a specific protocol AFAIK so you cannot use any device
attached to the port (only those devices designed to communicate on it).

Have you tried? The basic parallel port driver should be protocol
agnostic, AFAIK. You open it with CreateFile(), and it's just a
read/write stream.

The driver should take care of all the data integrity stuff, while your
application can worry about the application protocol.

That's all well and good, but it's not the application protocol only Jon
needs control of, it's the framing protocol as well. It's just the wrong
layer of the OSI model.

Kind of like trying to make DHCP requests using a TCP proxy connection. You
just can't. You have complete control over the application protocol, but
since DHCP doesn't use TCP, you cannot formulate a DHCP packet with a TCP
connection.

Similarly CreateFile("LPT1:") enforces a particular framing protocol,
whereas writing to the I/O port addresses of the parallel port controller
lets you control each parallel port pin individually.

Sep 24 '07 #17

Peter Duniho

Ben Voigt [C++ MVP] wrote:

Just wanted to throw in here, that depending on your registry settings (this
is true by default for workstations), whenever an I/O operation completes,
any thread waiting on that resource gets a dynamic priority boost, which can
have the effect of "interrupting" whatever thread is currently running (or
at least moving to the head of the queue).

Well, if I recall correctly, boosting the priority of a thread never
actually preempts another thread. So it'd always be the latter (your
parenthetical remark).

But yes, you're right...depending on what a thread was waiting on, it is
not necessarily the case that when it becomes runnable it has to wait on
every other thread of the same priority. It may get to go to the head
of the line, and only have to wait for the currently executing thread to
finish its timeslice.

Pete

Sep 25 '07 #18

Peter Duniho

Ben Voigt [C++ MVP] wrote:

"Peter Duniho" <Np*********@NnOwSlPiAnMk.com> wrote in message
news:13*************@corp.supernews.com...
>The driver should take care of all the data integrity stuff, while your
application can worry about the application protocol.

That's all well and good, but it's not the application protocol only Jon
needs control of, it's the framing protocol as well. It's just the wrong
layer of the OSI model.

Easy for you to say now, after Jon's elaborated on the scenario. :)

I didn't have that luxury when I wrote the comment. :p

Sep 25 '07 #19
