long lists

Hi All,

Firstly - thank you Sean for the help and the guideline to get the
size comparison, I will definitely look into this.

At the moment I actually have 2 bigger issues that need sorting...

1. I have the script popping all the files that need to be checked
into a list, and have it parsing the list for everything... Now the
problem is this: the server needs to check (at the moment) 375 files
and eliminate those that don't need reuploading. This number will
obviously get bigger and bigger as more files get uploaded. Now, the
problem that I'm having is that the script is taking forever to parse
the list and give the final result. How can I speed this up?

2. This issue is actually because of the first one. While the script
is parsing the lists and files, the connection to the ftp server times
out, and I honestly must say that it is quite annoying. I know I can
set the function to reconnect if it cannot find a connection, but
wouldn't it be easier just to keep the connection alive? Any idea
how I can keep the connection alive?

Thanks for all the help folks, I really appreciate it!

May 7 '07 #1
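
On the second question (keeping the connection alive), one common approach is to send the FTP NOOP command every so often while the long local work runs. What follows is only a minimal sketch: it assumes `ftp` is an already connected `ftplib.FTP` object, the helper name `keep_alive` is made up for illustration, and whether a given server resets its idle timer on NOOP depends on the server.

```python
import ftplib


def keep_alive(ftp):
    """Send NOOP so the server sees activity.

    Returns True if the server answered, False if the connection
    appears to be gone (the caller can then reconnect).
    """
    try:
        ftp.voidcmd("NOOP")  # standard FTP "do nothing" command
        return True
    except ftplib.all_errors:
        # all_errors is the tuple of exceptions ftplib methods may raise
        return False
```

Calling this between chunks of local processing is usually enough. If the list comparison itself becomes fast, the timeout may stop being a problem at all.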
On Mon, 07 May 2007 00:28:14 -0700, Merrigan wrote:
> 1. I have the script popping all the files that need to be checked into
> a list, and have it parsing the list for everything... Now the problem is
> this: the server needs to check (at the moment) 375 files and eliminate
> those that don't need reuploading. This number will obviously get bigger
> and bigger as more files get uploaded. Now, the problem that I'm having
> is that the script is taking forever to parse the list and give the
> final result. How can I speed this up?

By writing faster code???

It's really hard to answer this without more information. In particular:

- what's the format of the list and how do you parse it?

- how does the script decide what files need uploading?

--
Steven.
May 7 '07 #2
Hi, Thanx for the reply,

The script is available at this URL: http://www.lewendewoord.co.za/theScript.py

P.S. I know it looks like crap, but I'm a n00b, and not yet through
the OOP part of the tutorial.

Thanx in advance!

May 7 '07 #3
In <11**********************@w5g2000hsg.googlegroups.com>, Merrigan wrote:
> The script is available at this URL: http://www.lewendewoord.co.za/theScript.py
>
> P.S. I know it looks like crap, but I'm a n00b, and not yet through
> the OOP part of the tutorial.

One spot with really horrible runtime is the `comp_are()` function: its
nested loops give it quadratic runtime. Why the funny spelling, BTW?

Why are you binding the objects to new names all the time and calling
`str()` repeatedly on string objects? The names `a`, `b` and `fn2up` are
unnecessary, you can use `file1`, `file2` and `filename` instead. And
``str(str(b))`` on a string object is a no-operation. It's the same as
simply writing ``b``.

Those two nested ``for``-loops can be replaced by converting both lists
into `set()` objects, calculating the difference and converting the result
back to a sorted list:

    def compare(remote, local):
        return sorted(set(local) - set(remote))
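
A short usage sketch of the function above. The file names are invented for illustration; the point is only that the result holds the names present locally but missing remotely, in sorted order.

```python
def compare(remote, local):
    """Return the sorted names that are in `local` but not in `remote`."""
    return sorted(set(local) - set(remote))


# Hypothetical file names, for illustration only.
remote = ["a.mp3", "b.mp3"]
local = ["c.mp3", "a.mp3", "b.mp3", "d.mp3"]

to_upload = compare(remote, local)
print(to_upload)  # ['c.mp3', 'd.mp3']
```

Building each set takes time roughly proportional to the length of its list, so the whole comparison grows linearly with the number of files (plus the final sort).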

Ciao,
Marc 'BlackJack' Rintsch
May 7 '07 #4
Do you have access to the machine via ssh? I would try to get away
from FTP and use rsync for this kind of thing if possible.

~Sean

May 7 '07 #5
On Mon, 07 May 2007 09:14:34 -0300, Merrigan <ch***********@gmail.com>
wrote:
> The script is available at this URL:
> http://www.lewendewoord.co.za/theScript.py

I understand this as a learning exercise, since there are a lot of utilities
for remote syncing.

Some comments:
- use os.path.join to build file paths, instead of concatenating strings.
- instead of reassigning sys.stdout before the call to retrlines, use the
callback:

    saveinfo = sys.stdout
    fsock = open(tempDir + "remotelist.txt", "a")
    sys.stdout = fsock
    ftpconn.cwd(remotedir)     # This changes to the remote directory
    ftpconn.retrlines("LIST")  # This gets a complete list of everything in the directory
    sys.stdout = saveinfo
    fsock.close()

becomes:

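    fsock = open(os.path.join(tempDir, "remotelist.txt"), "a")
    ftpconn.cwd(remotedir)  # This changes to the remote directory
    # This gets a complete list of everything in the directory. retrlines
    # hands each line to the callback without its line ending, so the
    # newline has to be added back when writing to the file.
    ftpconn.retrlines("LIST", lambda line: fsock.write(line + "\n"))
    fsock.close()

(Why mode="a"? Shouldn't it be "w"? Isn't the listing for a single
directory?)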
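
As a sketch of the callback idea taken one step further: `retrlines` calls the callback once per line, with the trailing CRLF already stripped, so passing `list.append` collects the listing in memory with no temporary file and no stray "\n" characters. The function name `fetch_listing` is an assumption for illustration; `ftpconn` is assumed to be a connected `ftplib.FTP` object.

```python
import ftplib


def fetch_listing(ftpconn, remotedir):
    """Return the LIST output for `remotedir` as a list of lines."""
    lines = []
    ftpconn.cwd(remotedir)                   # change to the remote directory
    ftpconn.retrlines("LIST", lines.append)  # one callback call per line
    return lines
```

The returned lines are in whatever long format the server uses for LIST, so extracting bare file names from them is server-dependent.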

- Saving both file lists may be useful, but why do you read them again? If
you already have a list of local filenames and remote filenames, why read
them from the saved copy?
- It's very confusing having "filenames" ending with "\n" - strip that as
you read it. You can use fname = fname.rstrip()
- If you are interested in filenames with a certain extension, only
process those files. That is, filter them *before* the processing begins.
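
The last two points can be sketched together: strip the line endings while reading, and drop uninteresting names before any comparison happens. The helper name `clean_names` and the ".mp3" extension are assumptions made up for this example.

```python
def clean_names(lines, extension):
    """Strip trailing whitespace and keep only names ending in `extension`."""
    names = (line.rstrip() for line in lines)
    return [name for name in names if name.endswith(extension)]


# Hypothetical input, as it might come from reading a saved list file.
raw = ["a.mp3\n", "notes.txt\n", "b.mp3\n", "\n"]
print(clean_names(raw, ".mp3"))  # ['a.mp3', 'b.mp3']
```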

- The time-consuming part appears to be this:

    def comp_are():
        global toup
        temptoup = []
        for file1 in remotefiles:
            a = file1
            for file2 in localfiles:
                b = file2
                if str(a) == str(b):
                    pass
                if str(b) != str(a):
                    temptoup.append(str(str(b)))
        toup = list(sets.Set(temptoup))
        for filename in remotefiles:
            fn2up = filename
            for item in toup:
                if fn2up == item:
                    toup.remove(item)
                else:
                    pass
        toup.sort()

(It's mostly nonsense... how do you expect str(str(b)) to differ from
str(b)? And the next line is just a waste of time, can you see why?)
I think you want to compare two lists of filenames and keep the elements
that are in "localfiles" but not in "remotefiles". As you appear to
know about sets: that is the set difference between "localfiles" and
"remotefiles". Keeping the same "globalish" thing:

    def comp_are():
        global toup
        toup = list(sets.Set(localfiles) - sets.Set(remotefiles))
        toup.sort()

Since Python 2.4, set is a builtin type, and you have sorted(), so you
could write:

    def comp_are():
        global toup
        toup = sorted(set(localfiles) - set(remotefiles))
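
To make the cost difference concrete, here is a rough sketch that counts the comparisons a nested-loop approach performs and checks that the set difference gives the same answer. The function `nested_loop_diff` is a simplified stand-in written for this illustration, not the code from the script, and the file names are generated placeholders.

```python
def nested_loop_diff(localfiles, remotefiles):
    """Compare every local name with every remote name, counting the work."""
    comparisons = 0
    missing = []
    for name in localfiles:
        found = False
        for other in remotefiles:
            comparisons += 1
            if name == other:
                found = True
        if not found:
            missing.append(name)
    return sorted(missing), comparisons


def set_diff(localfiles, remotefiles):
    """Same result via set difference."""
    return sorted(set(localfiles) - set(remotefiles))


# 375 local files, of which the first 370 are already on the server.
local = ["file%04d.dat" % i for i in range(375)]
remote = local[:370]

slow, comparisons = nested_loop_diff(local, remote)
fast = set_diff(local, remote)
print(comparisons)   # 138750, i.e. 375 * 370
print(slow == fast)  # True
```

Doubling both lists quadruples the comparison count in the nested version, while the work done by the set version only about doubles.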

- Functions may have parameters and return useful things :)
That is, you may write, for example:

remotefiles = getRemoteFiles(host, remotedir)
localfiles = getLocalFiles(localdir)
newfiles = findNewFiles(localfiles, remotefiles)
uploadFiles(host, newfiles)
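
One possible shape for those four functions, as a sketch only. The signatures here differ from the outline above: they take an already connected `ftplib.FTP` object (`ftpconn`) instead of a host name, and `uploadFiles` also needs the local directory; both choices are assumptions of this example. `FTP.nlst()` usually returns bare names for the current directory, but its exact output can vary between servers.

```python
import os


def getLocalFiles(localdir):
    """Names of the regular files directly inside `localdir`."""
    return [name for name in os.listdir(localdir)
            if os.path.isfile(os.path.join(localdir, name))]


def getRemoteFiles(ftpconn, remotedir):
    """Names reported by the server for `remotedir`."""
    ftpconn.cwd(remotedir)
    return ftpconn.nlst()


def findNewFiles(localfiles, remotefiles):
    """Sorted names present locally but not remotely."""
    return sorted(set(localfiles) - set(remotefiles))


def uploadFiles(ftpconn, localdir, newfiles):
    """Upload each named file from `localdir` in binary mode."""
    for name in newfiles:
        with open(os.path.join(localdir, name), "rb") as handle:
            ftpconn.storbinary("STOR " + name, handle)
```

With no globals involved, each function can be tried on its own, for instance by calling `findNewFiles` on two hand-written lists.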

--
Gabriel Genellina

May 7 '07 #6
Hmmm, thanks a lot. This has really been helpful. I have tried putting
it in the set, and whoops, it works. Now, I think I need to start
learning some more.

now the script is running a lot slower...
Now to get the rest of it up and running...

Thanx for the help!

May 8 '07 #7
