I have the following function which generates MD5 hashes for files on a
local and remote server. The remote server has a little applet that
runs from inetd and generates an MD5 hash given the file name.
The problem is that it takes 2+ minutes to generate the MD5 hash, so
this function takes about 5 minutes every time it is called. Since the
first MD5 hash is generated on a remote machine, the local machine does
nothing but wait for half that time.
Is there any way to rewrite each half of the function to run in the
background, so to speak, and then have a master process that waits on
the results? This would cut execution time in half more or less.
# checkMD5
# Python 2 code, as posted; MD5server, Debug and makeMovieName() are
# defined elsewhere in the script.
import md5
import socket

def checkMD5(fileName, localDir):
    # get remote hash
    Socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    Socket.connect((MD5server, 888))
    # throw away ID string
    Socket.recv(256)
    Socket.send(fileName + '\n')
    remoteMD5hash = Socket.recv(256)
    # get local hash
    try:
        # binary mode, so the bytes hashed match the bytes on disk
        file = open(makeMovieName(localDir, fileName), 'rb')
    except IOError:
        localMD5hash = '0'
    else:
        hasher = md5.new()
        while True:
            chunk = file.read(1024)
            if not chunk:
                break
            hasher.update(chunk)
        localMD5hash = hasher.hexdigest()
        file.close()
    if Debug: print "local:", localMD5hash, "remote:", remoteMD5hash
    return localMD5hash.strip() == remoteMD5hash.strip()
-Kamus
--
o__ | If you're old, eat right and ride a decent bike.
,>/'_ | Q.
(_)\(_) | Usenet posting`
Kamus of Kadizhar <ya*@NsOeSiPnAeMr.com> writes:
| Is there any way to rewrite each half of the function to run in the
| background, so to speak, and then have a master process that waits on
| the results? This would cut execution time in half more or less.
Sure, use the threading module. Think about another aspect of what
you're doing though. You're comparing the md5's of a local and remote
copy of the same file, to see if they're the same. Are you trying to
detect malicious tampering? If someone tampered with one of the
files, how do you know that person can't also intercept your network
connection and send you the "correct" md5, so you won't detect the
tampering? Or for that matter, do you know that the remote copy of
the program itself hasn't been tampered with?
Paul Rubin wrote:
| Kamus of Kadizhar <ya*@NsOeSiPnAeMr.com> writes:
| | Is there any way to rewrite each half of the function to run in the
| | background, so to speak, and then have a master process that waits
| | on the results? This would cut execution time in half more or less.
| Sure, use the threading module.
OK, I'll read up on that. I've written gobs of scientific type code,
but this OS stuff is new to me.
| Think about another aspect of what you're doing though. You're
| comparing the md5's of a local and remote copy of the same file, to
| see if they're the same. Are you trying to detect malicious tampering?
No, actually, both machines are under my control (and in my house). I'm
slinging large (1GB MOL) files around on an unreliable, slow wireless
network. I am trying to detect an incomplete copy across the network.
The local machine is the video player and the remote machine is the
archive server. My kids have a habit of just shutting down the video
server, resulting in incomplete transfers to the archives.
If it's appropriate for this newsgroup, I'd like to post the entire
effort for comments (it's my first bit of python code.) So far, python
has been the easiest language to learn I've ever come across. I tried
learning perl, and it was a disaster.... Too convoluted. Python is a
breath of fresh air. Also, the docs and support here is excellent.
:-) My thanks to all the volunteers who put in time to build python.
-Kamus
--
o__ | If you're old, eat right and ride a decent bike.
,>/'_ | Q.
(_)\(_) | Usenet posting`
Kamus of Kadizhar <ya*@NsOeSiPnAeMr.com> writes:
| No, actually, both machines are under my control (and in my house).
| I'm slinging large (1GB MOL) files around on an unreliable, slow
| wireless network. I am trying to detect an incomplete copy across
| the network. The local machine is the video player and the remote
| machine is the archive server. My kids have a habit of just shutting
| down the video server, resulting in incomplete transfers to the
| archives. If it's appropriate for this newsgroup, I'd like to post
| the entire effort for comments (it's my first bit of python code.)
| So far, python has been the easiest language to learn I've ever come
| across. I tried learning perl, and it was a disaster.... Too
| convoluted. Python is a breath of fresh air. Also, the docs and
| support here is excellent. :-) My thanks to all the volunteers who
| put in time to build python.
Why don't you look at the rsync program? It brings two machines into
sync with each other by automatically detecting differences between
files and sending only the deltas over the network.
Paul Rubin wrote:
| Why don't you look at the rsync program. It brings two machines into
| sync with each other by automatically detecting differences between
| files and sending only the deltas over the network.
Well, the purpose of this whole project was to learn python. I did look
at the pysync modules (rsync written in python), but it's too
complicated for me at the moment.
-Kamus
--
o__ | If you're old, eat right and ride a decent bike.
,>/'_ | Q.
(_)\(_) | Usenet posting`
Kamus of Kadizhar <ya*@NsOeSiPnAeMr.com> wrote:
| Is there any way to rewrite each half of the function to run in the
| background, so to speak, and then have a master process that waits on
| the results?
Yup. Two ways in fact.
The traditional way would be to fork another process to do the work and
have the parent process wait for the child to finish. You'll need
os.fork(), one of the os.exec*() functions if the child should run a
different program, and os.wait() or os.waitpid() for the parent; all of
them can be found in the os module.
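As a rough illustration of the fork-and-wait flow, here is a minimal sketch for POSIX systems. The helper names run_in_child() and collect() are made up for this example, and passing the result back through a pipe is an assumption of the sketch (the post only says to fork and wait). No exec call is used because the child runs a Python function rather than another program.

```python
import os


def run_in_child(task):
    """Fork; the child runs task() and writes its string result to a pipe.

    Hypothetical helper: returns (child pid, read end of the pipe).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child process
        os.close(read_fd)
        status = 0
        try:
            os.write(write_fd, task().encode())
        except Exception:
            status = 1
        finally:
            os._exit(status)  # exit at once, skipping the parent's cleanup code
    os.close(write_fd)  # parent keeps only the read end
    return pid, read_fd


def collect(pid, read_fd):
    """Read the child's output until EOF, then reap the child."""
    with os.fdopen(read_fd, "rb") as pipe:
        data = pipe.read()
    os.waitpid(pid, 0)
    return data.decode()
```

The parent would call run_in_child() for one half of the job, do the other half itself, and only then call collect(). A short result such as an MD5 hex digest fits in a single pipe write; larger results would need a write loop.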
The other way would be to do something similar, but with threads instead
of processes. The basic flow is the same; you create a thread, have
that thread do the stuff that takes a long time, and then rejoin with
the primary thread. Of course (just like with child processes), you
could have several of these running at the same time doing different
parts of a parallelizable job. Take a look at the threading module.
I'm intentionally not including any sample code here, because the
possibilities are numerous. Exactly how you do it depends on many
factors. I'm guessing that doing it with threads is what you really
want to do, so my suggestion would be to start by reading up on the
threading module and playing with some examples to get the feel for how
it works. Working with threads is becoming more and more mainstream
as more operating systems and languages provide support for it and the
programming community at large becomes more familiar and comfortable
with the issues involved.
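For readers who want a starting point anyway, the create/work/rejoin flow described above could look roughly like the sketch below. It is written for Python 3 with hashlib rather than the md5 module used in the original post, and local_md5() and run_in_parallel() are names invented for this example, not part of any library.

```python
import hashlib
import threading


def local_md5(path, chunk_size=64 * 1024):
    """Hash a local file in chunks; return '0' if it cannot be read."""
    hasher = hashlib.md5()
    try:
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                hasher.update(chunk)
    except OSError:
        return "0"
    return hasher.hexdigest()


def run_in_parallel(*tasks):
    """Run each zero-argument callable in its own thread.

    Returns the results in the same order as the tasks.
    """
    results = [None] * len(tasks)

    def worker(index, task):
        results[index] = task()

    threads = [
        threading.Thread(target=worker, args=(i, task))
        for i, task in enumerate(tasks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # the "master" waits here until every worker is done
    return results
```

A call might look like `local, remote = run_in_parallel(lambda: local_md5(path), lambda: fetch_remote_md5(name))`, where fetch_remote_md5() is a hypothetical function wrapping the socket conversation from the original post. Note that an exception raised inside a worker is not passed back to the caller in this sketch; the corresponding result simply stays None.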
Kamus of Kadizhar <ya*@NsOeSiPnAeMr.com> writes:
| Is there any way to rewrite each half of the function to run in the
| background, so to speak, and then have a master process that waits on
| the results? This would cut execution time in half more or less.
Why don't you use Twisted? It's a networking framework with a lot of
protocols (and you can define your own) and it's based on async
sockets, which let you write programs that avoid threads most of the
time. www.twistedmatrix.com
I'm sure you will find it's the best thing ever done for python
:)
--
Valentino Volonghi, Regia SpA, Milan
Linux User #310274, Gentoo Proud User
"Donn Cave" <do**@drizzle.com> writes:
| Yes. I may be missing something here, because the followups I have
| seen strike me as somewhat misguided, if they're not just fooling
| with you. You already have two independent threads or processes
| here, one on each machine. All you need to do is take the results
| from the remote machine AFTER the local computation. Move the line
| that says "remoteMD5hash = Socket.recv(256)" to after the block that
| ends with "localMD5hash = hasher.hexdigest()". No?
Can the remote process time out if the local side takes too long to
read from the socket? That could happen if the two machines aren't
the same speed.
Quoth Paul Rubin <ph****@NOSPAM.invalid>:
....
| Can the remote process time out if the local side takes too long to
| read from the socket? That could happen if the two machines aren't
| the same speed.
I wouldn't expect so. I'm no expert in such things, but I would
expect the remote process to return from send(), and exit; the
data would be waiting in a kernel mbuf on the local side.
Donn Cave, do**@drizzle.com
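The reordering Donn Cave suggests can be sketched as follows. This is a Python 3 rewrite under stated assumptions, not the original poster's code: check_md5() takes an already connected socket and the local path as parameters (both choices are made up for this example), and, like the original, it assumes the ID string and the digest each arrive in a single recv() call.

```python
import hashlib
import socket


def check_md5(sock, file_name, local_path, chunk_size=64 * 1024):
    """Return True when the local and remote MD5 hex digests match.

    `sock` is assumed to be connected to the remote MD5 applet already.
    """
    sock.recv(256)                            # throw away the ID string
    sock.sendall(file_name.encode() + b"\n")  # remote side starts hashing now

    # Hash the local copy while the remote machine is busy with its own.
    hasher = hashlib.md5()
    try:
        with open(local_path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                hasher.update(chunk)
        local_hash = hasher.hexdigest()
    except OSError:
        local_hash = "0"

    # Only now collect the remote answer. If the remote side finished
    # first, its reply is expected to be waiting in the receive buffer.
    remote_hash = sock.recv(256).decode().strip()
    return local_hash == remote_hash
```

No threads or child processes are needed here because the remote machine already does its half of the work independently; the only change from the original function is the position of the second recv().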
[Kamus of Kadizhar]
| So far, python has been the easiest language to learn I've ever come
| across. I tried learning perl, and it was a disaster.... Too
| convoluted. Python is a breath of fresh air. Also, the docs and
| support here is excellent. :-) My thanks to all the volunteers who
| put in time to build python.
+1 QOTW.
regards,
--
alan kennedy
------------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan: http://xhaus.com/contact/alan