Basic Threading question

Hello all,

I was perusing the internet for information on threading when I came
across this group. Since there seems to be a lot of good ideas and
useful info I thought I'd pose a question.

Threading is a new concept for me to implement. Here is my problem.

I have a system that receives xml files and records their file
locations in a database. I can potentially receive thousands,
sometimes hundreds of thousands, of files per day. When files are
received and stored in a folder on the server I need another
application to read in the paths from the database, locate, process,
and save each xml file. I want to create a windows service that can
read in a list from the database and assign work to multiple threads
in order to achieve greater performance. But, I am not sure where to
begin and I am having option paralysis. Do I need to create the
threads manually like:

Dim worker As New Thread(AddressOf Something)
worker.Start()

Do I need to use the thread pool? The BackgroundWorker control? I have
seen a lot of examples. What I'd like is if someone could make a
research recommendation based on my scenario if possible. I realize
this is probably a basic question about a complex issue so any
feedback to get me thinking would be good.

Much appreciated.

May 11 '07 #1
Frankie,

It's my understanding that threading does not increase performance but makes an interactive program
more responsive while it is running a lengthy process.

If your job is going to run as a service, it has no GUI. So why have a threaded app?

Flomo
May 11 '07 #2
Multithreading will improve performance when used correctly. It is
not only for UI work; web servers such as IIS are a good example of
this. I recommend using the thread pool and queuing work, since the
workload can be variable. The BackgroundWorker control uses the
thread pool, but you'll get better control if you queue things into
the thread pool yourself. With that said, the BackgroundWorker
control is fairly easy to use for someone just getting into
multithreading. I don't recommend creating threads manually, since
creating and destroying threads is expensive.
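
To make the "queue things into the thread pool yourself" suggestion concrete, here is a minimal sketch. It assumes the file paths are already in an array; the module name PoolQueueSketch, the ProcessAll function, and the placeholder ProcessFile body are made up for illustration. Only ThreadPool.QueueUserWorkItem, Interlocked, and ManualResetEvent are real framework calls.

```vbnet
Imports System
Imports System.Threading

' Hypothetical sketch: one pool work item per file path.
' Not re-entrant: ProcessAll is meant to be called by one thread at a time.
Public Module PoolQueueSketch
    Private pending As Integer      ' work items not yet finished
    Private processed As Integer    ' work items completed
    Private ReadOnly allDone As New ManualResetEvent(False)

    ' Queues one work item per path, blocks until all have run,
    ' and returns the number of files processed.
    Public Function ProcessAll(ByVal paths As String()) As Integer
        If paths.Length = 0 Then Return 0
        processed = 0
        pending = paths.Length
        allDone.Reset()
        For Each path As String In paths
            ThreadPool.QueueUserWorkItem(AddressOf ProcessFile, path)
        Next
        allDone.WaitOne()
        Return processed
    End Function

    ' Runs on a pool thread; "state" is the path given to QueueUserWorkItem.
    Private Sub ProcessFile(ByVal state As Object)
        Dim path As String = DirectCast(state, String)
        Try
            ' Placeholder: load, process, and save the XML file at "path".
            Interlocked.Increment(processed)
        Finally
            ' The last work item to finish releases the waiting caller.
            If Interlocked.Decrement(pending) = 0 Then allDone.Set()
        End Try
    End Sub
End Module
```

A caller would use it as PoolQueueSketch.ProcessAll(New String() {"a.xml", "b.xml"}). A real service would probably not block like this; the wait is only there so the sketch has a result to return.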

May 11 '07 #3
The thread pool has the additional advantage of limiting the number of
simultaneously running threads; additional work items are queued and
started in first-in, first-out order. This lets the framework adjust for
system resources and prevents too many threads from running at one time,
which can slow the entire system down through excessive context
switching. Although there are other situations where doing your own
threading is the way to go, this one is definitely a thread pool scenario.
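
If you want to see the limits the pool applies on your own machine, something like the following sketch should do it. The module and function names are made up for illustration; ThreadPool.GetMaxThreads and ThreadPool.GetMinThreads are the real framework calls, and the numbers they report vary by runtime version and hardware.

```vbnet
Imports System
Imports System.Threading

' Hypothetical helper for inspecting the pool's configured limits.
Public Module PoolLimitsSketch
    ' Upper bound on worker threads the pool will run concurrently.
    Public Function MaxWorkerThreads() As Integer
        Dim workers As Integer
        Dim ioThreads As Integer
        ThreadPool.GetMaxThreads(workers, ioThreads)
        Return workers
    End Function

    ' Number of worker threads the pool creates on demand before it
    ' starts throttling the creation of additional threads.
    Public Function MinWorkerThreads() As Integer
        Dim workers As Integer
        Dim ioThreads As Integer
        ThreadPool.GetMinThreads(workers, ioThreads)
        Return workers
    End Function
End Module
```

Work queued beyond the maximum simply waits in the pool's queue until a thread becomes free, which is the limiting behaviour described above.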

Mike Ober.

May 11 '07 #4
On May 10, 5:33 pm, "Flomo Togba Kwele" <F...@community.nospam> wrote:
> Frankie,
>
> It's my understanding that threading does not increase performance but makes an interactive program
> more responsive while it is running a lengthy process.
>
> If your job is going to run as a service, it has no GUI. So why have a threaded app?

There are a couple of scenarios for threading. One is to improve the user
experience in a WinForms app, as you mentioned. The other is performance
and scalability, which is what I am after in my example.

May 11 '07 #5
On May 10, 5:51 pm, Charlie Brown <cbr...@duclaw.com> wrote:
> I recommend using the thread pool and queuing work since the
> workload can be variable.

OK. This is pretty much what I expected. The way to do it would be to
create a class containing a subset of the Db records (the paths to
the xml files) and pass it in to a subroutine as an object for
processing. When the thread completes, it does a callback on the worker
method. Meanwhile, I can get another subset from the Db and repeat the
process. Sound good? Anything else I should know? Thanks for the
reply!
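
The plan above could be sketched roughly like this. All of the names here (FileBatch, BatchCompleted, BatchSketch, ProcessAndWait) are invented for illustration, the per-file work is a placeholder, and the database read is left out entirely; only ThreadPool.QueueUserWorkItem and AutoResetEvent come from the framework.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Threading

' Hypothetical callback signature: invoked when a batch has been processed.
Public Delegate Sub BatchCompleted(ByVal batch As FileBatch, ByVal filesProcessed As Integer)

' Hypothetical container for one subset of paths read from the database.
Public Class FileBatch
    Public ReadOnly Paths As List(Of String)
    Public ReadOnly Completed As BatchCompleted

    Public Sub New(ByVal paths As List(Of String), ByVal completed As BatchCompleted)
        Me.Paths = paths
        Me.Completed = completed
    End Sub
End Class

Public Module BatchSketch
    Private lastCount As Integer
    Private ReadOnly finished As New AutoResetEvent(False)

    ' Queues one batch as the state object of a pool work item and waits
    ' for its callback. A real service would fetch the next batch from the
    ' database here instead of waiting.
    Public Function ProcessAndWait(ByVal paths As List(Of String)) As Integer
        Dim batch As New FileBatch(paths, AddressOf OnBatchCompleted)
        ThreadPool.QueueUserWorkItem(AddressOf RunBatch, batch)
        finished.WaitOne()
        Return lastCount
    End Function

    ' Runs on a pool thread; "state" is the FileBatch passed above.
    Private Sub RunBatch(ByVal state As Object)
        Dim batch As FileBatch = DirectCast(state, FileBatch)
        Dim count As Integer = 0
        For Each path As String In batch.Paths
            ' Placeholder: process the XML file at "path".
            count += 1
        Next
        batch.Completed.Invoke(batch, count)
    End Sub

    ' The callback: records the result and signals the waiting caller.
    Private Sub OnBatchCompleted(ByVal batch As FileBatch, ByVal filesProcessed As Integer)
        lastCount = filesProcessed
        finished.Set()
    End Sub
End Module
```

Note that the callback runs on the pool thread that processed the batch, not on the thread that queued it, so anything it touches needs to be safe to use from another thread.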

Frank

May 11 '07 #6
On May 10, 7:55 pm, "Michael D. Ober" <obermd.@.alum.mit.edu.nospam> wrote:
> The threadpool has the additional advantage of limiting the number of
> simultaneous running threads and will queue additional threads in a first in
> first processed order. This allows the framework to adjust for system
> resources as well as preventing too many threads from running at one time,
> which can slow the entire system down via excessive thread context
> switching. Although there are other situations where doing your own
> threading is the way to go, this one is definitely a thread pool scenario.

This is consistent with my reading. It is always helpful to get a second or
third opinion when approaching a new concept. Thanks.

May 11 '07 #7
"Flomo Togba Kwele" <Fl***@community.nospam> wrote:
> Frankie,
>
> It's my understanding that threading does not increase performance but
> makes an interactive program more responsive while it is running a
> lengthy process.

Threading can improve an application's performance - after all some
processes use very little CPU and thus you can run multiple threads to
maximize CPU usage.
May 11 '07 #8
fr**********@yahoo.com wrote:
> I want to create a windows service that can
> read in a list from the database and assign work to multiple threads
> in order to achieve greater performance.

Take a look at the FileSystemWatcher class - it can monitor a folder for
changes. Once it detects a change, you can fire off a thread to process the
change.

A couple of other options:

1. Submit the XML files to a web service and process them immediately.
Since web services are hosted by IIS/ASP.NET, this approach is very scalable.

2. Perhaps look at using MSMQ? MSMQ can handle a large volume of incoming
requests and hold them for you until your applications are ready to process
the data. MSMQ is quite easy to use ... and very reliable.
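
As a rough illustration of the FileSystemWatcher idea, the sketch below watches a folder for new .xml files and hands each one to the thread pool. The module name, CreateXmlWatcher, and the placeholder ProcessFile are assumptions made for the example; FileSystemWatcher, its Created event, and ThreadPool.QueueUserWorkItem are the real framework pieces.

```vbnet
Imports System
Imports System.IO
Imports System.Threading

' Hypothetical sketch: queue a pool work item for every new .xml file.
Public Module WatcherSketch
    ' Creates and starts a watcher on an existing folder. The caller owns
    ' the returned watcher and should Dispose it when finished.
    Public Function CreateXmlWatcher(ByVal folder As String) As FileSystemWatcher
        Dim watcher As New FileSystemWatcher(folder, "*.xml")
        AddHandler watcher.Created, AddressOf OnCreated
        watcher.EnableRaisingEvents = True
        Return watcher
    End Function

    ' Raised by the watcher; keep this short and push the work to the pool.
    Private Sub OnCreated(ByVal sender As Object, ByVal e As FileSystemEventArgs)
        ThreadPool.QueueUserWorkItem(AddressOf ProcessFile, e.FullPath)
    End Sub

    ' Runs on a pool thread.
    Private Sub ProcessFile(ByVal state As Object)
        Dim fullPath As String = DirectCast(state, String)
        ' Placeholder: the file may still be being written when Created
        ' fires, so real code would need to retry or wait before reading.
    End Sub
End Module
```

Keeping the event handler tiny matters here, because the watcher's internal buffer can overflow if events arrive faster than the handlers return.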
May 11 '07 #9
Never use an end-user program to learn.

Cor

May 11 '07 #10
You can be sure that a process takes more throughput time when threading is
used than when it is not.

May 11 '07 #11
But reading only what you want to read is not always best.

80% of the answers are against threading in your situation. It seems you
picked the one that suits you.

Cor

May 11 '07 #12
On May 10, 11:59 pm, "Cor Ligthert [MVP]" <notmyfirstn...@planet.nl> wrote:
> You can be sure that a process takes more throughput time when threading is
> used than when it is not.

OK. If I understand your point, you are simply stating that there is
some overhead involved in creating threads. Fair enough. But if I can
split the work over multiple threads (with some reasonable limit on
the number of threads created), then I would expect an overall
performance gain.

May 11 '07 #13
On May 11, 12:01 am, "Cor Ligthert [MVP]" <notmyfirstn...@planet.nl> wrote:
> But reading only what you want to read is not always best.
>
> 80% of the answers are against threading in your situation. It seems you
> picked the one that suits you.

If you can suggest an alternate strategy, I am all ears. And can you
explain why 80% of the answers are against threading in my scenario?

May 11 '07 #14
On May 10, 10:46 pm, Spam Catcher <spamhoney...@rogers.com> wrote:
> Take a look at the FileSystemWatcher class - it can monitor a folder for
> changes.
>
> 1. Submit the XMLs to a web service and process the files immediately.
>
> 2. Perhaps look at using a MSMQ?

I have used FileSystemWatcher and MSMQ in various other projects. I
will take another quick look at FileSystemWatcher, but while I think
it is perfectly OK for low volume, I am uneasy with it at large
volume. MSMQ is fine, but it is another layer, and the Db is already a
"queue" of sorts. Access to MQ is fast, but my app can only access one
MQ message at a time as far as I know. I could create multiple queues,
however... IIS is something for me to look into. Thanks for your
suggestions. This is great; you have me thinking about some other
scenarios. Cheers.

May 11 '07 #15
fr**********@yahoo.com wrote:
> I have used FileSystemWatcher and MSMQ in various other projects. I
> will take a quick look again at FileSystemWatcher, but while I think
> this is perfectly ok for low volume, I am uneasy with it in large
> volume.

That's true - FileSystemWatcher may not scale to high volumes.

> MSMQ is fine. But it is another layer and the Db is already a
> "queue" of sorts.

Depending on the queuing mechanism you're using in the database, it may
not be threadsafe (i.e. two threads might end up processing the same
record). This is why SQL Server 2005 introduced Service Broker (a queuing
service for SQL Server), which solves this issue.

> Access to MQ is fast but my app can only access one
> MQ message at a time as far as I know.

Yes, but this is where multi-threading comes into play ;-) You can have
multiple threads monitoring the queue. Since MSMQ is threadsafe, you
don't have to worry about two threads pulling the same message out
of the queue.

Also, depending on how long it takes to process a single file, you might
not need multi-threading. Since the queue is persistent and acts as a
buffer, you can process the messages at leisure ... and catch up during
lulls in transmission.

> I could create multiple queues,
> however... IIS is something for me to look into. Thanks for your
> suggestion. This is great you have me thinking about some other
> scenarios. Cheers.

You can get very fancy with your application. For example, I have a
similar application at the moment, and we do it this way:

Web Service --> Queue <--- Service to process incoming requests

Using a web service provides a simpler, standardized way to submit data
to the queue (not everyone talks to MSMQ). Also, web services can be
scaled horizontally via load balancing. The queue can be clustered or
scaled horizontally too. Multiple back-end services can be installed to
process the queues. So in a sense you can scale such a solution in
multiple ways to increase throughput.
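
To illustrate the "two threads might end up processing one record" point, here is a minimal in-memory sketch. It is not MSMQ or Service Broker, just a stand-in queue guarded by SyncLock so that each item is handed to exactly one thread. WorkQueue, ClaimSketch, and DrainWithThreads are names invented for the example.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Threading

' Hypothetical in-memory queue; the lock makes "check and remove" atomic.
Public Class WorkQueue
    Private ReadOnly items As New Queue(Of String)()
    Private ReadOnly gate As New Object()

    Public Sub Enqueue(ByVal path As String)
        SyncLock gate
            items.Enqueue(path)
        End SyncLock
    End Sub

    ' Returns True and the next path, or False when the queue is empty.
    Public Function TryDequeue(ByRef path As String) As Boolean
        SyncLock gate
            If items.Count = 0 Then
                path = Nothing
                Return False
            End If
            path = items.Dequeue()
            Return True
        End SyncLock
    End Function
End Class

Public Module ClaimSketch
    Private work As WorkQueue
    Private taken As Integer

    ' Fills a queue with itemCount paths, drains it with threadCount
    ' threads, and returns how many items were handed out in total.
    Public Function DrainWithThreads(ByVal itemCount As Integer, ByVal threadCount As Integer) As Integer
        work = New WorkQueue()
        taken = 0
        For i As Integer = 1 To itemCount
            work.Enqueue("file" & i.ToString() & ".xml")
        Next
        Dim threads As New List(Of Thread)()
        For n As Integer = 1 To threadCount
            Dim t As New Thread(AddressOf DrainLoop)
            threads.Add(t)
            t.Start()
        Next
        For Each worker As Thread In threads
            worker.Join()
        Next
        Return taken
    End Function

    ' Each thread claims items until the queue is empty.
    Private Sub DrainLoop()
        Dim path As String = Nothing
        While work.TryDequeue(path)
            ' Placeholder: process the file at "path".
            Interlocked.Increment(taken)
        End While
    End Sub
End Module
```

If the total handed out equals the number of items enqueued, nothing was claimed twice or lost. A database-backed queue needs the same guarantee from the database itself, which this sketch does not attempt to show.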
May 11 '07 #16
Why not just use separate programs?

May 12 '07 #17
Frankie,
In addition to the other comments:

I've written similar applications.

Remember that threading in a service is most effective if you can make
effective use of the processors (multi-core or hyper-threaded CPUs, or
multiple-CPU computers) or if you spend a lot of time waiting on I/O. For
example, while one catalog thread is waiting to read a file, a second
catalog thread could be processing a file. Threading in an interactive app
(Windows Forms), by contrast, is most effective when it avoids blocking the
UI thread, so the user perceives your app as being responsive.

Only use New Thread if you are only ever creating one or two threads, for
example one thread to receive files and a second thread to process files.
Don't explicitly create a thread to process each file. Creating and
destroying threads is expensive, and managing those threads can be a pain.
The thread pool keeps a set of threads, creates a new one only if needed,
and reuses them for later requests. Further, the thread pool scales with
the number of processors available (multi-core or hyper-threaded CPUs, or
multiple-CPU computers): the more processors available, the more threads
available in the pool.

The BackgroundWorker control is intended for Windows Forms applications, so
that your form can easily start a background process; it would not work as
expected in a service (no Windows Forms message pump).

Instead I would recommend using the ThreadPool directly, or indirectly via
{delegate}.BeginInvoke, where {delegate} is an instance of a delegate type.
(NOTE: Don't confuse this with Windows Forms Control.BeginInvoke.)

For example, the receive thread could use ThreadPool.QueueUserWorkItem to
start a catalog process for each file. The thread pool would ensure that an
effective number of catalog processes ran at one time. Instead of
ThreadPool.QueueUserWorkItem you could use {delegate}.BeginInvoke; however,
be certain to call EndInvoke (QueueUserWorkItem needs no such call).

Something like (not fully tested):

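Public Delegate Sub DoWork(ByVal file As String)

' Queue each file on the thread pool via the delegate's BeginInvoke.
Private Sub QueueFiles()
    Dim worker As DoWork = AddressOf ProcessFile
    For Each file As String In New String() {"a", "b", "c"}
        ' Pass the delegate as the async state so EndProcess can retrieve it.
        worker.BeginInvoke(file, AddressOf EndProcess, worker)
    Next
End Sub

' Runs on a thread pool thread.
Private Sub ProcessFile(ByVal file As String)
    ' Process one file here.
End Sub

' Completion callback: EndInvoke must be called; it rethrows any
' exception raised by ProcessFile.
Private Sub EndProcess(ByVal ar As IAsyncResult)
    Dim worker As DoWork = DirectCast(ar.AsyncState, DoWork)
    Try
        worker.EndInvoke(ar)
    Catch ex As Exception
        Log(ex) ' Log is your own logging routine (not shown).
    End Try
End Sub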
--
Hope this helps
Jay B. Harlow [MVP - Outlook]
.NET Application Architect, Enthusiast, & Evangelist
T.S. Bradley - http://www.tsbradley.net
May 13 '07 #18
fr**********@yahoo.com wrote:
> I have a system that receives xml files and records their file
> locations in a database. I can potentially receive thousands,
> sometimes hundreds of thousands, of files per day. When files are
> received and stored in a folder on the server I need another
> application to read in the paths from the database, locate, process,
> and save each xml file.

So - spot a file, pick it up, go process it. Spot another file ...

Ideal candidate for threading, since each job is isolated from every
other. The less your threads have to talk to one another, the happier
(i.e. faster) they'll be.
This "database" bit is a bit worrying, though, because that's going to
force the threads to "fight" to get at the database itself. That could
lead to some contention that will slow things down.

Given the volumes that you have here, you can't go spawning a new thread
for every file; work with a set of 10 or so Threads and each one will
run fairly well. Try the same thing with 1000 Threads and watch your
machine spin itself into the ground. :-)

> I want to create a windows service that can read in a list from the
> database and assign work to multiple threads in order to achieve
> greater performance.

Sounds good. The service's main Thread acts as a marshaller, handing
out work to the other Threads that do the real work.

> But, I am not sure where to begin and I am having option paralysis.
> Do I need to create the threads manually like:
>
> Dim worker As New Thread(AddressOf Something)
> worker.Start()

You need to have each Thread call back to the main service when it has
finished its job - that way, the main service doesn't have to "poll"
the Threads to see if they're still busy. Polling just slows things down.

> Do I need to use the thread pool?

Depends on how long each job takes. The pool is intended for tasks that
run and die off very quickly, so that the Threads are available for
something else to pick up and use. If a job takes 20 minutes, spin up
your own Threads.

> The BackgroundWorker control?

No. If this were a Forms app launching all these threads then yes,
because the BackgroundWorker control is built to marshal the callbacks
from other Threads back onto the UI (Windows) Thread, since Forms
Controls aren't Thread-safe.
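
A rough sketch of the "small fixed set of Threads that call back when done" idea might look like this. The names SliceWorker, WorkerDone, FixedThreadsSketch, and ProcessWithThreads are invented for illustration, the work is split round-robin purely as an assumption, and the per-file processing is a placeholder.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Threading

' Hypothetical callback a worker invokes when its slice is finished.
Public Delegate Sub WorkerDone(ByVal workerId As Integer, ByVal filesProcessed As Integer)

' One worker: owns a slice of the paths and reports back at the end.
Public Class SliceWorker
    Private ReadOnly id As Integer
    Private ReadOnly paths As List(Of String)
    Private ReadOnly done As WorkerDone

    Public Sub New(ByVal id As Integer, ByVal paths As List(Of String), ByVal done As WorkerDone)
        Me.id = id
        Me.paths = paths
        Me.done = done
    End Sub

    Public Sub Run()
        Dim count As Integer = 0
        For Each path As String In paths
            ' Placeholder: process the XML file at "path".
            count += 1
        Next
        done.Invoke(id, count)   ' call back instead of being polled
    End Sub
End Class

Public Module FixedThreadsSketch
    Private total As Integer

    ' Splits the paths round-robin over threadCount dedicated threads and
    ' returns the total number of files the workers reported.
    Public Function ProcessWithThreads(ByVal allPaths As List(Of String), ByVal threadCount As Integer) As Integer
        total = 0
        Dim threads As New List(Of Thread)()
        For w As Integer = 0 To threadCount - 1
            Dim slice As New List(Of String)()
            For i As Integer = w To allPaths.Count - 1 Step threadCount
                slice.Add(allPaths(i))
            Next
            Dim worker As New SliceWorker(w, slice, AddressOf OnWorkerDone)
            Dim t As New Thread(AddressOf worker.Run)
            threads.Add(t)
            t.Start()
        Next
        For Each th As Thread In threads
            th.Join()
        Next
        Return total
    End Function

    ' Runs on the worker's own thread, so the update must be atomic.
    Private Sub OnWorkerDone(ByVal workerId As Integer, ByVal filesProcessed As Integer)
        Interlocked.Add(total, filesProcessed)
    End Sub
End Module
```

With 25 paths and 10 threads, each worker gets two or three files. Whether 10 is the right number depends on how much of each job is CPU work versus waiting on disk or the database.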

HTH,
Phill W.
May 17 '07 #19
Use the ThreadPool. It has two sets of threads: I/O completion threads for
asynchronous I/O callbacks and worker threads for application tasks. The
worker-thread portion of the pool can handle longer-running work items.

Mike Ober.

"Phill W." <p-.-a-.-w-a-r-d-@-o-p-e-n-.-a-c-.-u-kwrote in message
news:f2**********@south.jnrs.ja.net...
fr**********@yahoo.com wrote:
>I have a system that receives xml files and records their file
locations in a database. I can potentially receive thousands,
sometimes hundreds of thousands, of files per day. When files are
received and stored in a folder on the server I need another
application to read in the paths from the database, locate, process,
and save each xml file.

So - spot a file, pick it up, go process it. Spot another file ...

Ideal candidate for threading, since each job is isolated from every
other. The less your threads have to talk to one another, the happier
(i.e. faster) they'll be.
This "database" bit is a bit worrying, though, because that's going to
force the threads to "fight" to get at the database itself. Could lead to
some contention that will slow things down.

Given the volumes that you have here, you can't go spawning a new thread
for every file; work with a set of 10 or so Threads and each one will run
fairly well. Try the same thing with 1000 Threads and watch your machine
spin itself into the ground. :-)
>I want to create a windows service that can read in a list from the
database and assign work to multiple threads in order to achieve
greater performance.

Sounds good. The service's main Thread acts as a marshaller, handing out
work to the other Threads that do the real work.
>But, I am not sure where to begin and I am having option paralysis. Do I
need to create the threads manually like:

Dim worker As New Thread(AddressOf Something)
worker.Start()

You need to have the Thread "callback" to the main service when they've
finished their job - that way, the main service doesn't have to "poll"
them to see if they're still busy. Polling just slows things down.
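
One possible way to do the "callback", sketched with a plain VB event. FileJob, Finished and OnFinished are hypothetical names, and the real work is replaced by a short Sleep. Note that the handler runs on the worker thread, not on the service's main thread.

```vbnet
Imports System
Imports System.Threading

' Hypothetical worker that reports back when its job is finished.
Class FileJob
    Public Event Finished(ByVal path As String)

    Private ReadOnly _path As String

    Public Sub New(ByVal path As String)
        _path = path
    End Sub

    Public Sub Run()
        ' Placeholder for the real XML processing.
        Thread.Sleep(100)
        ' Raised on the worker thread once the job is done.
        RaiseEvent Finished(_path)
    End Sub
End Class

Module CallbackDemo
    Private ReadOnly allDone As New ManualResetEvent(False)

    ' Called by the worker; no polling loop is needed.
    Private Sub OnFinished(ByVal path As String)
        Console.WriteLine("Finished " & path)
        allDone.Set()
    End Sub

    Sub Main()
        Dim job As New FileJob("order.xml")
        AddHandler job.Finished, AddressOf OnFinished

        Dim worker As New Thread(AddressOf job.Run)
        worker.Start()

        ' The main thread simply blocks until it is told the job is done.
        allDone.WaitOne()
    End Sub
End Module
```

In a real service the handler would typically mark the file as done and hand the worker its next job, taking a lock around anything it shares with other threads.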
>Do I need to use the thread pool?

Depends on how long each job takes. The pool is intended for tasks that
run and die off very quickly, so that the Threads are available for
something else to pick up and use. If a job takes 20 minutes, spin up
your own Threads.
>The BackgroundWorker control?
No. If this were a Forms app launching all these threads then yes,
because the BackgroundWorker control is built to marshal the callbacks
from other Threads back onto the UI (Windows) Thread because Forms
Controls aren't Thread-safe.

HTH,
Phill W.

May 17 '07 #20
