Uniquely Identifying Multiple/Concurrent Async Tasks

It appears that System.Random would provide an acceptable means through
which to generate a unique value used to identify multiple/concurrent
asynchronous tasks.

The usage of the value under consideration here is that it is supplied to
the AsyncOperationManager.CreateOperation(userSuppliedState) method... with
userSuppliedState being, more or less, a taskId.

In this case, the userSuppliedState {really taskId} is of the object type,
and could therefore be just about anything. Consequently it appears to me
that a unique integer as generated by System.Random would suffice (and yes,
I understand that System.Random doesn't provide *truly* random values).

Would you concur that System.Random would be "good enough" - or would you
recommend some better alternative for generating the taskId?

Thanks!
Sep 14 '07 #1
Frankie,

If you really need something that is pretty much guaranteed (but not
completely) to be random, and unique, then I suggest you use a Guid
instance. You would have to generate new Guids at a rate of something like
5000/second for the next billion years or something ridiculous like that
before you actually create a duplicate.
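
As a minimal sketch of the Guid suggestion (the class and method names here are made up for illustration, not taken from any sample), a fresh Guid can be boxed and handed to AsyncOperationManager.CreateOperation as the userSuppliedState:

```csharp
using System;
using System.ComponentModel;

static class GuidTaskIdSketch
{
    // Creates an AsyncOperation whose UserSuppliedState is a fresh Guid.
    // "StartTracked" is a hypothetical helper name.
    public static AsyncOperation StartTracked()
    {
        object taskId = Guid.NewGuid();   // the boxed Guid serves as the task ID
        return AsyncOperationManager.CreateOperation(taskId);
    }
}
```

The Guid comes back later through AsyncOperation.UserSuppliedState, so the caller does not have to invent or track an integer.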
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

Sep 14 '07 #2
Frankie wrote:
> It appears that System.Random would provide an acceptable means through
> which to generate a unique value used to identify multiple/concurrent
> asynchronous tasks.

Nope.

> The usage of the value under consideration here is that it is supplied to
> the AsyncOperationManager.CreateOperation(userSuppliedState) method... with
> userSuppliedState being, more or less, a taskId.

How are you using it as an ID? Can you provide a more concrete example?

> In this case, the userSuppliedState {really taskId} is of the object type,
> and could therefore be just about anything. Consequently it appears to me
> that a unique integer as generated by System.Random would suffice (and yes,
> I understand that System.Random doesn't provide *truly* random values).

It can be just about anything. Typically, it would be a class that
stores context important to the task instance.

> Would you concur that System.Random would be "good enough" - or would you
> recommend some better alternative for generating the taskId?

No, it would be awful. Random numbers aren't guaranteed to be unique.

If each task requires some sort of unique context, why not just create a
class that can contain this context, store the data related to the
context in the class, and use a reference to the class as your "unique
value"?

If you really just need an integer, why not just use a sequential
number? If you could potentially create 4 billion tasks over time,
you'll have to check newly generated numbers to make sure they aren't in
use, but that would be a requirement if you're using Random anyway,
since those aren't guaranteed to be unique.
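
A rough sketch of the sequential-number idea, assuming the IDs only need to be unique within one process. The class name is hypothetical, and a 64-bit counter is used here so that wrap-around is not a practical concern:

```csharp
using System.Threading;

static class TaskIdCounter
{
    private static long _last;   // last ID handed out; starts at 0

    // Returns the next ID. Interlocked.Increment makes this safe to call
    // from several threads at once without a lock.
    public static long Next()
    {
        return Interlocked.Increment(ref _last);
    }
}
```

Unlike Random, two calls can never return the same value until the counter wraps.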

Pete
Sep 14 '07 #3
<snip>
Re:
<< If each task requires some sort of unique context,... >>

I'm not sure what you mean by "context" here... what I'm referring to is
that the task just needs a unique _identifier_. What I'm doing is
implementing the Event-based async pattern. So in this case I have an async
operation - for which there can be multiple concurrent operations going on.
For example, it could be a method named GrabFileAsync that retrieves a file
from some remote location. The client could request 15 different files - so
we'd have 15 GrabFileAsync() calls - with potentially all 15 of them running
concurrently. Each of these 15 concurrent operations needs to be uniquely
identified. The client, upon calling GrabFileAsync() would then supply a
unique identifier. When any of the 15 async operations completes or reports
progress, etc, the client would then use the "task id" to identify which
particular GrabFileAsync() operation has completed, etc.

It's entirely possible that your meaning of "context" is similar to mine.
If so, maybe you could clarify why an integer isn't the best, and why going
with some "context" class would be better. If not, are you thinking
SynchronizationContext or something like that? If so, then you'd be missing
the fact that the AsyncOperation - which is part of the Event-based async
pattern implementation I'm going with - basically encapsulates the
underlying SynchronizationContext.... so therefore no need for me to supply
SynchronizationContext.

-F
Sep 14 '07 #4
Frankie,
Typically what you would do here is pass your own custom StateObject class
instance as the userSuppliedState parameter. In the callback method you can
cast the received state parameter back to an instance of your StateObject,
which would have fields that identify the file name or whatever it is you're
doing.
In the example code for the AsyncOperationManager, the taskId is stored in a
HybridDictionary. If you want to make it easy for it to be unique and you
don't really need any additional state info to capture, just use a Guid as was
mentioned.
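
To illustrate the state-object approach, here is a small sketch. GrabFileState and the helper methods are invented names for the example; only AsyncOperationManager.CreateOperation and AsyncOperation.UserSuppliedState are the real framework members:

```csharp
using System;
using System.ComponentModel;

// Hypothetical state class: carries whatever identifies the operation.
sealed class GrabFileState
{
    public readonly string FileName;

    public GrabFileState(string fileName)
    {
        FileName = fileName;
    }
}

static class StateObjectSketch
{
    // Pass the state object itself as the userSuppliedState.
    public static AsyncOperation Begin(string fileName)
    {
        return AsyncOperationManager.CreateOperation(new GrabFileState(fileName));
    }

    // Later (for example in a completion handler) cast the state back.
    public static string Describe(AsyncOperation operation)
    {
        var state = (GrabFileState)operation.UserSuppliedState;
        return state.FileName;
    }
}
```

Because each GrabFileState instance is a distinct object, the reference is unique on its own and no separate number is needed.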
-- Peter
Recursion: see Recursion
site: http://www.eggheadcafe.com
unBlog: http://petesbloggerama.blogspot.com
BlogMetaFinder: http://www.blogmetafinder.com

Sep 14 '07 #5
Frankie wrote:
> I'm not sure what you mean by "context" here... what I'm referring to is
> that the task just needs a unique _identifier_.

But what are you going to do with that identifier? For example, if all
you're going to do is use it to look up some data specific to the
specific instance of the operation, then why not just use the data
itself as the unique identifier?

> [...]
> Each of these 15 concurrent operations needs to be uniquely
> identified.

Of course.

> The client, upon calling GrabFileAsync() would then supply a
> unique identifier. When any of the 15 async operations completes or reports
> progress, etc, the client would then use the "task id" to identify which
> particular GrabFileAsync() operation has completed, etc.

But what does "identifying" which operation has completed gain you? The
identification does you no good unless you somehow correlated the
identification with some data specific to the operation. So you might
as well use the data itself as your unique identifier.

> It's entirely possible that your meaning of "context" is similar to mine.
> If so, maybe you could clarify why an integer isn't the best, and why going
> with some "context" class would be better.

See above. All that adding an integer into the design does is create an
extra level of indirection you need to resolve upon completion of an
operation. I don't see the point in doing that.

As an example, consider the Socket class. When you call BeginReceive,
you pass a "state" parameter. This is what I'm calling "context". In
the most basic case, the code using the Socket instance would typically
pass at a minimum the Socket instance reference itself. Then in the
receive callback, the "state" parameter passed to the callback can be
cast back to a Socket which can then be used to complete the operation
(calling EndReceive(), for example).

If there were other data related to the receive operation that was
important (for example, perhaps the Socket is being used to transfer a
file and you want to easily get the FileStream you're using to save the
data to the disk), then you'd have a class that contains both the Socket
instance reference as well as that other data (for example, the
FileStream reference). Then in the receive callback method, you just
cast the "state" parameter back to your particular class, and from that
retrieve the Socket instance and other data (such as the FileStream
instance reference).
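
A sketch of the Socket case described above, under some assumptions: ReceiveState is an invented class, a MemoryStream stands in for the FileStream, and a loopback connection is used only so the example can run on its own. It performs a single receive rather than looping until the transfer is finished.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Threading;

// Hypothetical state class: bundles everything the callback needs.
sealed class ReceiveState
{
    public Socket Socket;
    public Stream Output;                       // a FileStream in the scenario above
    public byte[] Buffer = new byte[1024];
    public ManualResetEvent Done = new ManualResetEvent(false);
}

static class SocketStateSketch
{
    // Sends payload over a loopback connection and returns what one receive got.
    public static byte[] ReceiveOnce(byte[] payload)
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        try
        {
            using (var sender = new Socket(AddressFamily.InterNetwork,
                                           SocketType.Stream, ProtocolType.Tcp))
            {
                sender.Connect(listener.LocalEndpoint);
                using (var receiver = listener.AcceptSocket())
                using (var output = new MemoryStream())
                {
                    var state = new ReceiveState { Socket = receiver, Output = output };

                    // The state object rides along as the last argument.
                    receiver.BeginReceive(state.Buffer, 0, state.Buffer.Length,
                                          SocketFlags.None, OnReceive, state);
                    sender.Send(payload);
                    state.Done.WaitOne();
                    return output.ToArray();
                }
            }
        }
        finally
        {
            listener.Stop();
        }
    }

    static void OnReceive(IAsyncResult ar)
    {
        // No lookup table: just cast the state back and use it.
        var state = (ReceiveState)ar.AsyncState;
        int read = state.Socket.EndReceive(ar);
        state.Output.Write(state.Buffer, 0, read);
        state.Done.Set();
    }
}
```

The callback gets the Socket and the output stream directly from the cast, with no integer key or dictionary involved.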

If you instead use a unique integer, then instead of just casting the
value to the appropriate class, you have to use that integer to look up
an instance of the appropriate class in some data structure, like an
Array or Dictionary<>. Why add that extra bit of work when you could
just pass the reference you want in the first place?

Pete
Sep 14 '07 #6
Thanks for the dialog on this.... to continue with it...

<snip>
> But what are you going to do with that identifier? For example, if all
> you're going to do is use it to look up some data specific to the specific
> instance of the operation, then why not just use the data itself as the
> unique identifier?

The identifier wouldn't necessarily be used to look up data [related to the
operation]. The identifier would identify the specific instance of the
operation, itself. There's no reason that we _must_ associate any data with
an async operation... just have it go to work. I would agree with the
position that states we would _usually_ have data to associate with an async
operation, and therefore could/should use that to ID the operation as you
are suggesting.

> [...]
>> Each of these 15 concurrent operations needs to be uniquely identified.
>
> Of course.
>
>> The client, upon calling GrabFileAsync() would then supply a unique
>> identifier. When any of the 15 async operations completes or reports
>> progress, etc, the client would then use the "task id" to identify which
>> particular GrabFileAsync() operation has completed, etc.
>
> But what does "identifying" which operation has completed gain you? The
> identification does you no good unless you somehow correlated the
> identification with some data specific to the operation. So you might as
> well use the data itself as your unique identifier.

Yes, but that's assuming that data must or can be associated with an async
operation prior to initiating it. Can't we have async operations without
associated data? We can certainly have methods that return void and take
zero parameters. They just "do something". What's to say we can't call such
a method asynchronously? At the bottom of this post I have a more detailed
example. The information we might want from such an asynchronous call might
include things about the call, itself, which might not be available before
calling the method. Such data about the call, itself, might include things
like "user cancelled the operation before it completed", "operation ran into
Exception xyz during the course of its operations", or "the operation just
now completed." In these cases, we'd have to invent some identifier out of
thin air - perhaps a unique integer, which is what I was thinking in the OP
here. Maybe I'm still wrong about that.

>> It's entirely possible that your meaning of "context" is similar to
>> mine. If so, maybe you could clarify why an integer isn't the best, and
>> why going with some "context" class would be better.
>
> See above. All that adding an integer into the design does is create an
> extra level of indirection you need to resolve upon completion of an
> operation. I don't see the point in doing that.

Your "unnecessary indirection" point is well taken, but only in cases where
we would have data to associate with the operation prior to kicking it off.
If no data exists prior to initiating the async operation, then we'd have to
come up with _some_ way to ID the operation.

> As an example, consider the Socket class. When you call BeginReceive, you
> pass a "state" parameter. This is what I'm calling "context". In the
> most basic case, the code using the Socket instance would typically pass
> at a minimum the Socket instance reference itself. Then in the receive
> callback, the "state" parameter passed to the callback can be cast back to
> a Socket which can then be used to complete the operation (calling
> EndReceive(), for example).
>
> If there were other data related to the receive operation that was
> important (for example, perhaps the Socket is being used to transfer a
> file and you want to easily get the FileStream you're using to save the
> data to the disk), then you'd have a class that contains both the Socket
> instance reference as well as that other data (for example, the FileStream
> reference). Then in the receive callback method, you just cast the
> "state" parameter back to your particular class, and from that retrieve
> the Socket instance and other data (such as the FileStream instance
> reference).

Great example... couldn't agree more! But this is an example where we
actually have something to run with before kicking off the async operation.

> If you instead use a unique integer, then instead of just casting the
> value to the appropriate class, you have to use that integer to look up an
> instance of the appropriate class in some data structure, like an Array or
> Dictionary<>. Why add that extra bit of work when you could just pass the
> reference you want in the first place?

Now, I can see this next question "coming down Main Street": Okay
Frankie, what is an example of an event for which we wouldn't have at least
_some_ data with which to uniquely ID the async operation prior to kicking
it off? Here goes...

I'm writing a utility app that will be used to update a bunch of Web sites
by copying files (.aspx, .gif, etc) to various site directories. The utility
additionally updates the underlying SQL Server database by (1) executing DDL
scripts and (2) installing or updating stored procedures. Prior to launching
an update operation, the utility "validates" the destination Web sites and
SQL Server databases to be updated. Specifically, the Validate method
ensures that (1) each Web site's root directory exists, and (2) all required
subdirectories exist. A separate Validate method verifies connectivity to
the SQL Server databases. The utility does practically none of this grunt
work, itself. Rather, it dynamically loads "installers" which are classes
that implement a common IInstaller interface, which defines a Validate
method. The client, here, doesn't know what is being validated. All it's
doing is looping through its list of IInstallers and telling each to
Validate its environment. I'm modifying this arrangement so that the
Validate methods can run asynchronously (i.e., so the interface will be
modified to define ValidateAsync() in addition to the synchronous Validate()
method). All the client app will do is kick off these ValidateAsync
operations which will in turn (1) report progress (i.e., "SomeWebSite.com
validated successfully", or "failed to connect to TheDbNamedX" etc - for
each Web site and for each db), and (2) report when each ValidateAsync
operation completes (including the usual AsyncCompletedEventArgs stuff). In
this scenario I'm not sure how the client app that initiates these
ValidateAsync operations would identify each async operation without
assigning some unique ID invented out of thin air.
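
For reference, the interface described above might look roughly like the following. Only the names IInstaller, Validate and ValidateAsync come from the description; the taskId parameter and the two event names are assumptions about how the event-based pattern would be applied here.

```csharp
using System;
using System.ComponentModel;

public interface IInstaller
{
    // Existing synchronous validation.
    void Validate();

    // Proposed asynchronous variant; taskId is the client-supplied identifier
    // (assumed signature).
    void ValidateAsync(object taskId);

    // Assumed event names. Both event-args types expose a UserState property,
    // which is where the taskId would come back to the client.
    event ProgressChangedEventHandler ValidateProgressChanged;
    event AsyncCompletedEventHandler ValidateCompleted;
}
```

With this shape the client has to make up taskId itself, which is exactly the part in question.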

I'd appreciate your further perspective on this.

-Frankie
Sep 15 '07 #7
Frankie wrote:
> [...]
> << And why can that identification not be accomplished via an instance of
> some class that specifically refers to the operation in some way? >>
>
> It _could_ be. But in my case I don't already have a class that refers to
> the async operation.

If you have no class that refers to the async operation, then how would
you use a numeric ID to map to an async operation?

Surely the async operation has _some_ data somewhere. Otherwise, you
have no way to reference it.

> In my case, the client is initiating the async
> operation by calling the ValidateAsync method of an interface - so the
> client doesn't know anything about the particular class implementing the
> async operation

So what? I never said there's an existing class that the client knows
about. We are (as far as I know) talking about how the design _could_
be, not how it is.

So, just because there's no class the client knows about now, that
doesn't mean there couldn't be one. Just return the reference to the
class implementing the async operation.

It doesn't need to be the actual type known to the code that actually
uses it. Publish some sort of base class or interface that you can
return; all the client would know is "this is my unique reference to the
async operation". Then the actual reference would be a class that
inherits or implements the base class or interface, respectively.

> [...]
> ValidateAsync takes a
> parameter that can be any object. That parameter is subsequently used in the
> "event publisher/worker class" to identify the async operation.

Why is the worker class using data from the client to identify its own
data? What happens if the client uses the same value twice? Is there
any check on the parameter to ensure that it is in fact unique?

> I was
> thinking this parameter - originating in the client - could be a unique
> integer being that there really was no class or other obvious way for the
> client to identify the particular async operation.
>
> Re:
> << But that doesn't mean that the data isn't instantiated somewhere >>
>
> I'd agree that some data is _likely_ instantiated somewhere (in client or in
> the async operation itself), but I don't see that as a requirement of
> anything (and I suspect you don't either), and in my case my client simply
> doesn't have it.

I don't understand why you keep mentioning what the client doesn't have
(see each of the above two paragraphs). Are we not talking about a
component that you are in the process of designing? What does it matter
what the client has now? How does that restrict what you are able to do
in your design?

Just because you're not returning something useful to the client now, I
don't see why that means you cannot do so in a new version of your design.

It seems to me that one of the reasons my point wasn't getting across in
a previous message is that it seems that the unique ID you're looking
for isn't in fact used by the async processing, but rather only by the
client for managing some list of outstanding tasks. IMHO, that
overlooks the statement you made about wanting to be able to cancel the
async operation, but I can't figure out any other reason that what I've
written isn't clear enough.

So, let's take the two examples you've mentioned most recently of things
you might do with the unique ID:

1) Manage a client-side data structure (eg list) of outstanding tasks

2) Allow for canceling of outstanding tasks

Now, in the #1 scenario, I agree it doesn't matter what the value is as
long as it's unique. Though, if you haven't associated the value with
any actual data, it begs the question as to why use a value at all. As
you've already pointed out, you could just keep a counter.

In the #2 scenario, however...I find it obvious that there must be
_some_ data somewhere associated with that ID. If one assumes that the
client will pass that ID to the async implementation, and that the async
implementation will be able to use that ID to identify a task to be
canceled, then there _must_ be some mapping from the ID to some data
representing the task.

So, instead of having the client pass the ID and managing some sort of
dictionary mapping it to the data representing the task, why not just
pass a reference to that data back to the client? The client need not
know what's actually in it (see above regarding a simple base class or
interface); it just needs to hang on to it in case it wants to refer to
the specific task in the future.

I'm afraid I have, at least for the moment, run out of different ways to
state the above. It all comes down to the fact that the ID is
apparently supposed to represent some "thing" and in .NET all "things"
are ultimately representable by references, so IMHO you might as well
use that reference rather than some arbitrary ID that maps to that
reference.

Note that the key here is that there seems to be a one-to-one
relationship between this ID and some "thing". There are of course
situations in which you need a unique numeric ID that maps to something
that either doesn't exist yet, or you need the ID to be constant across
multiple executions of your program, or any number of other situations
in which unique numeric ID's unrelated to the object references are
needed. But so far, nothing about what you've described suggests that
any of those possible situations apply here.

Pete
Sep 15 '07 #8
Frankie wrote:
> [...]
> Unless I'm mistaken, the only reasonable way to get away from the client
> generating the unique ID for the async operation - in the implementation at
> the above links - would be for the CalculatePrimeAsync method (which
> currently returns void) to return an IAsyncResult to the client. But that
> would break the model being promoted here which hides IAsyncResult, etc and
> other async operation implementation details from the client.

Okay...thanks for the links. I now understand better what design model
you're trying to follow.

First, I will point out that the sample makes very clear the answer to
my repeated question regarding how the numeric ID is used. It
is used to retrieve an AsyncOperation from a HybridDictionary instance.
So the direct implication with respect to my previous comments is that
it's the AsyncOperation that you should pass back to the client, somehow
(note that you need not pass the client something it recognizes as an
AsyncOperation, or even something from which it can easily get the
AsyncOperation...it just needs to be something that immediately can be
translated into an AsyncOperation).

If you want to write code that generates unique numeric IDs, fine.
IMHO, GUIDs are overkill and random numbers aren't going to work at all
(since they aren't guaranteed to be unique). But otherwise, you
certainly could do that. The sample code you posted uses GUIDs, but
sequential numbers would work just fine (just be prepared to catch the
ArgumentException for non-unique numbers and try the next one, in the
unlikely event that you wrap around the full range of the 32-bit or
64-bit numeric variable you're using).

But, a couple of points that I hope will make clear what I'm trying to say:

1) It is not true that "the only reasonable way to get away from
the client generating the unique ID for the async operation...would be
for the CalculatePrimeAsync method...to return an IAsyncResult". The
method can return anything you want it to. It need not be an
IAsyncResult, and in fact should not be unless you really want a
mixed-mode event-plus-IAsyncResult design.

Using the sample code you're referring to as a template, let's look at
what modification I would make that would meet the goals I've already
stated. Rather than having a void return value, I would have
CalculatePrimeAsync return an instance of a class that looks something
like this:

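    using System.ComponentModel;

    // Internal handle type; the client only ever sees IPrimeAsyncOperation.
    class PrimeAsyncOperation : IPrimeAsyncOperation
    {
        private readonly AsyncOperation _ao;

        // The worker class reads this to get back to the underlying operation.
        public AsyncOperation AsyncOperation
        {
            get { return _ao; }
        }

        public PrimeAsyncOperation(AsyncOperation ao)
        {
            _ao = ao;
        }
    }

where:

    // Empty marker interface: a typed but opaque reference for the client.
    public interface IPrimeAsyncOperation
    {
    }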

The PrimeAsyncOperation class itself need not be visible to the client;
only the interface IPrimeAsyncOperation needs to be, and that's what the
CalculatePrimeAsync method would return.

Or rather, it's nice to have it that way; you could actually just return
an Object and forget about the interface altogether, but having a type
allows you to ensure in your client code that you maintain objects of
the right type. The interface isn't really required here; I just feel
it makes the code a little nicer.

Anyway, then any time the client wants to refer to the async operation,
rather than passing an ID which then needs to be used to look up the
AsyncOperation, all that the worker class has to do is retrieve the
AsyncOperation from the class.
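
To make that last step concrete, here is a sketch of how the worker side might use the handle. The worker class name, CancelAsync, IsCanceled and the Canceled flag are hypothetical additions for illustration, and the actual prime calculation is left out.

```csharp
using System;
using System.ComponentModel;

public interface IPrimeAsyncOperation { }

sealed class PrimeAsyncOperation : IPrimeAsyncOperation
{
    public readonly AsyncOperation AsyncOperation;
    public volatile bool Canceled;   // hypothetical flag the calculation would poll

    public PrimeAsyncOperation(AsyncOperation ao)
    {
        AsyncOperation = ao;
    }
}

public sealed class PrimeWorkerSketch
{
    // Returns an opaque handle instead of taking a client-supplied task ID.
    public IPrimeAsyncOperation CalculatePrimeAsync(int numberToTest)
    {
        AsyncOperation ao = AsyncOperationManager.CreateOperation(numberToTest);
        var handle = new PrimeAsyncOperation(ao);
        // A real worker would queue the calculation here.
        return handle;
    }

    // No dictionary lookup: the handle already is the worker's own object.
    public void CancelAsync(IPrimeAsyncOperation handle)
    {
        Unwrap(handle).Canceled = true;
    }

    public bool IsCanceled(IPrimeAsyncOperation handle)
    {
        return Unwrap(handle).Canceled;
    }

    private static PrimeAsyncOperation Unwrap(IPrimeAsyncOperation handle)
    {
        var op = handle as PrimeAsyncOperation;
        if (op == null)
            throw new ArgumentException("Handle was not issued by this worker.", "handle");
        return op;
    }
}
```

The client only stores the IPrimeAsyncOperation reference and passes it back when it wants to refer to that operation.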

One final note: the above still has one level of indirection. It's MUCH
more efficient than messing around with a hash table a la
HybridDictionary, but it's still a level of indirection. It's required
if you want to pass back a typed interface because AsyncOperation is
sealed and you don't have a way of instantiating the AsyncOperation
class except through the factory AsyncOperationManager (either situation
would force the issue, and you have both here)

But, if you are okay with passing back a plain old Object, then this
extra level of indirection can just go away. You'd just pass the
AsyncOperation instance itself back, as an Object. The client wouldn't
know anything about it except that it would use it as the way of
uniquely identifying the operation.

Alternatively, in a situation where the relevant class isn't a sealed
class with a factory for instantiation, you could create a new class
that inherits the actual identifying class, and have that new class
implement the empty interface that is public to the client. Then you
can have a typed reference passed back to the client, but which still
doesn't expose the internal aspects of the async implementation.

Note also that there's no requirement that you use AsyncOperation to
manage your asynchronous tasks. You could easily use some other
mechanism that does allow for inheriting the state object yourself.

2) IMHO, there is nothing invalid about having an event-based
design that returns something from the async method. Just because that
one MSDN sample doesn't, that doesn't mean it's prohibited. You can
design your class however you feel will work best for you. Heck, even
if you did want to return an IAsyncResult, you could, though it's true
that implies a certain level of non-event-based-ness that may not be
desirable.

The fact is, IMHO the sample code you're referring to is not a very nice
implementation of an event-based async design. I personally don't like
the aspect that a single class is responsible for managing multiple
asynchronous tasks. I prefer instead a design similar to the
BackgroundWorker where you instantiate a single class for each instance
of an asynchronous operation. Then, the class itself is all you need to
reference the asynchronous operation.
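
A small sketch of that one-instance-per-operation style using BackgroundWorker itself. The helper name and the "fetched ..." result string are placeholders; a real DoWork handler would do the actual work.

```csharp
using System;
using System.ComponentModel;

static class OneWorkerPerOperation
{
    // Starts one BackgroundWorker per operation. The worker instance itself
    // identifies the operation, so no separate task ID is needed.
    public static BackgroundWorker Start(string fileName,
                                         Action<BackgroundWorker, string> onDone)
    {
        var worker = new BackgroundWorker();

        // Placeholder work: a real handler would fetch the file here.
        worker.DoWork += (sender, e) =>
        {
            e.Result = "fetched " + (string)e.Argument;
        };

        // The sender is the same instance that Start returns.
        worker.RunWorkerCompleted += (sender, e) =>
        {
            onDone((BackgroundWorker)sender, (string)e.Result);
        };

        worker.RunWorkerAsync(fileName);
        return worker;
    }
}
```

In the completion callback the client can compare the sender against the references it kept from Start to tell the operations apart.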

But, assuming you want a single class to manage multiple operations,
there's still no requirement that the client be the one to provide the
unique identifier, and IMHO it is much more natural and efficient to
allow the worker class managing the operations to provide the unique
identifier.

And I think that's all I'm going to say about that. :)

Pete
Sep 15 '07 #9
<snip all>

Thanks for the helpful dialog on this. I learned a lot as this was my
initial encounter with the Event-based async pattern and the MSDN sample is
what I was basing everything off of. And thanks especially for posting some
very clear feedback and perspective on how you would modify the MSDN sample
in a way that frees the client from having to come up with some arbitrary
ID. I didn't really like it to begin with, and especially their use of
Random... thus the OP here.

-Frankie
Sep 15 '07 #10
Frankie wrote:
> <snip all>
>
> Thanks for the helpful dialog on this. I learned a lot as this was my
> initial encounter with the Event-based async pattern and the MSDN sample is
> what I was basing everything off of. And thanks especially for posting some
> very clear feedback and perspective on how you would modify the MSDN sample
> in a way that frees the client from having to come up with some arbitrary
> ID.

You're welcome. As you can see, it is often much easier to offer
comments when there is a concrete example of code to start with
(especially if that code sample is relatively minimal).

> I didn't really like it to begin with, and especially their use of
> Random... thus the OP here.

Indeed. I hope if nothing else it's clear that using Random isn't
appropriate for generating task IDs. Note, however, that the use of
Random in the sample code isn't for the task ID, but rather for the
number to test for prime-ness. The sample code uses a GUID as the task ID.

So, while there are obviously aspects of the sample I would do
differently, it's not actually broken. :)

Pete
Sep 15 '07 #11
