How do I speed up this loop?

Steve

Hi,

I'm getting some output by running a command using os.popen. I need to
parse the output and transform it so that it's 'DB compatible' (i.e., I
need to store the output in a Postgres database after escaping some
characters). Since I'm new to Python, I wasn't sure if there was a better
way of doing this, so this is what I did:
# Parse the output returned by popen and return the script
out = os.popen('some command')
all_lines = out.readlines()

script = []
for i in xrange(len(all_lines)):
    line = all_lines[i].replace("'", "\\'")[0:len(line)-1]  # replace ' with \'
    line_without_carriage = line[0:len(line)-1]  # remove carriage
    line_without_carriage = line_without_carriage.replace("\\n", "$___n")  # replace end of line with $___n
    line_without_carriage += "@___n"  # add a 'end of line' character to the end
    script.append(line_without_carriage)
# end for

script = ''.join(script)

Please help because I'm pretty sure I'm wasting a lot of cpu time in
this loop. Thanks

Steve

Jul 18 '05 #1

Marco Aschwanden

On Tue, 13 Jul 2004 16:48:36 +1000, Unknown <un*****@unknown.invalid>
wrote:

I'm getting some output by running a command using os.popen. I need to
parse the output and transform it in some sense so that it's 'DB
compatible', (i.e I need to store the output in a database (postgres)
after escaping some characters).

If you are using Python's DB API 2.0, then this escaping is done by
the API:

import odbc, dbi
con = odbc.odbc("DB_ID/USERNAME/PASSWORD")
cur = con.cursor()
sql = "INSERT INTO output (line) VALUES (?)"
dirty_line = 'Some text with forbidden characters\n\r...'
# Parameters are passed as a sequence; the driver does the escaping.
cur.execute(sql, [dirty_line])

So, no need to parse (and afterwards unparse) the output - I don't think
that anyone can beat this speedup!
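
A minimal sketch of the same idea that runs without a database server. It assumes Python 3 and uses the standard-library sqlite3 module as a stand-in for the Postgres connection; the table name `output` and column `line` come from the snippet above, while the in-memory database and the sample text are made up for illustration. A PostgreSQL driver may use a different placeholder than `?`.

```python
import sqlite3

# In-memory database: a stand-in for the real Postgres connection.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE output (line TEXT)")

# Sample text with quotes, a literal backslash-n and a real newline.
dirty_line = "It's got 'quotes', a literal \\n and a real newline\n"

# The driver binds the value; no manual escaping is needed.
cur.execute("INSERT INTO output (line) VALUES (?)", (dirty_line,))
con.commit()

# Read the value back to compare it with what was stored.
stored = cur.execute("SELECT line FROM output").fetchone()[0]
con.close()
```

Because the value travels as a bound parameter rather than as part of the SQL text, it comes back unchanged, so there is nothing to "unparse" later.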

Regards,
Marco

Jul 18 '05 #2

David Fraser

Marco Aschwanden wrote:

On Tue, 13 Jul 2004 16:48:36 +1000, Unknown <un*****@unknown.invalid>
wrote:
I'm getting some output by running a command using os.popen. I need to
parse the output and transform it in some sense so that it's 'DB
compatible', (i.e I need to store the output in a database (postgres)
after escaping some characters).

If you are using Python's DB API 2.0, then this escaping is done by
the API:

import odbc, dbi
con = odbc.odbc("DB_ID/USERNAME/PASSWORD")
cur = con.cursor()
sql = "INSERT INTO output (line) VALUES (?)"
dirty_line = 'Some text with forbidden characters\n\r...'
cur.execute(sql, [dirty_line])

So, no need to parse (and afterwards unparse) the output - I don't think
that anyone can beat this speedup!

Except if you're aiming for database independence, as different database
drivers support different means of escaping parameters...

David

Jul 18 '05 #3

Riccardo Attilio Galli

On Tue, 13 Jul 2004 16:48:36 +1000, Steve wrote:

Hi,

I'm getting some output by running a command using os.popen. I need to
parse the output and transform it in some sense so that it's 'DB
compatible', (i.e I need to store the output in a database (postgres)
after escaping some characters). Since I'm new to python, I wasn't sure
if there was a better way of doing this so this is what I did:

If you were replacing one character with another, the best and quickest way
would be a translation table, but you're out of luck here, since your
replacements are multi-character strings.

line = all_lines[i].replace("'", "\\'")[0:len(line)-1]
# replace ' with \'

This can't work: you never defined "line", yet you're calling len() on it.
I think you want to delete the last character of the string. If so, you
can use negative indexes:

line = all_lines[i].replace("'", "\\'")[:-1]

The 0 has disappeared because it is the default start value.

line_without_carriage = line[0:len(line)-1] # remove carriage

Similarly here:

line_without_carriage = line[:-1]

But you're just deleting the current last character of the string again,
so you could delete this line and change the indexing in the first one,
which would become:

line = all_lines[i].replace("'", "\\'")[:-2]

Also, I don't think you're removing a carriage return ('\r') here.
If your line ends with '\r\n', you're killing the '\n', a line feed.
This is important because in the next line...

line_without_carriage = line_without_carriage.replace("\\n", "$___n") # replace end of line with $___n

...you try to replace '\\n'.
Do you intend to delete the line feed, the end of line?
If so, you should write '\n' (one character), not '\\n' (a
string of length 2).

line_without_carriage += "@___n"
script.append(line_without_carriage)
# end for
script = ''.join(script)

The best approach here is:

script.append(line_without_carriage)
script.append('@___n')
# end for
script = ''.join(script)

By appending '@___n' separately, you don't waste memory destroying and
creating a new string each time.

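
The points above about slicing and about '\n' versus '\\n' can be sketched as follows. This is only an illustration, assuming Python 3; the helper name `strip_eol` and the sample line are hypothetical, and the marker strings are the ones from the original loop.

```python
def strip_eol(line):
    """Remove one trailing line ending (CRLF, LF or CR) and nothing else."""
    if line.endswith("\r\n"):
        return line[:-2]          # negative index: drop the last two characters
    if line.endswith("\n") or line.endswith("\r"):
        return line[:-1]          # drop the last character only
    return line

# "\\n" is two characters (backslash, n); "\n" is one (line feed).
lengths = (len("\\n"), len("\n"))

# A line holding a literal backslash-n AND a real CRLF line ending.
raw = "printf('hi\\n');\r\n"
body = strip_eol(raw)
escaped = body.replace("'", "\\'").replace("\\n", "$___n") + "@___n"
```

Checking the actual ending before slicing avoids cutting off a real character when a line ends with a bare '\n' rather than '\r\n'.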
Please help because I'm pretty sure I'm wasting a lot of cpu time in
this loop. Thanks

Steve

Ciao,
Riccardo

--
-=Riccardo Galli=-

_,e.
s~ ``
~@. ideralis Programs
.. ol
`**~ http://www.sideralis.net

Jul 18 '05 #4

Istvan Albert

David Fraser wrote:

Except if you're aiming for database independence, as different database
drivers support different means of escaping parameters...

IMHO database independence is both overrated and impossible.
You can always try the 'greatest common factor' approach but that
causes more trouble (and work) than it saves.

I agree with the previous poster stating that escaping should be done
in the DB API, but it is better to use the 'typed' escaping:

sql = 'SELECT * FROM users WHERE user_id=%d AND user_passwd=%s'
par = [1, 'something']
cursor.execute(sql, par)

Istvan.

Jul 18 '05 #5

george young

On Tue, 13 Jul 2004 16:48:36 +1000
Steve <nospam@nopes> threw this fish to the penguins:

I'm getting some output by running a command using os.popen. I need to
parse the output and transform it in some sense so that it's 'DB
compatible', (i.e I need to store the output in a database (postgres)
after escaping some characters). Since I'm new to python, I wasn't sure
if there was a better way of doing this so this is what I did:

# Parse the output returned by popen and return the script
out = os.popen('some command')
all_lines = out.readlines()

script = []
for i in xrange(len(all_lines)):
    line = all_lines[i].replace("'", "\\'")[0:len(line)-1]  # replace ' with \'
    line_without_carriage = line[0:len(line)-1]  # remove carriage
    line_without_carriage = line_without_carriage.replace("\\n", "$___n")  # replace end of line with $___n
    line_without_carriage += "@___n"  # add a 'end of line' character to the end
    script.append(line_without_carriage)
# end for

script = ''.join(script)

How about:

lines = []
out = os.popen('some command')
for l in out:
    lines.append(l.strip())
script = ''.join(lines)
out.close()

The "strip" actually removes white space from front and back of the string;
you could say l.strip('\n') if you only want the newlines removed (or '\r'
if they're really carriage return characters.)

Or if you want a clever (and most CPU efficient!) one-liner:

script = [l.strip() for l in os.popen('some command')]

I'm not advocating such a terse one-liner unless you are very comfortable
with its meaning; will you easily know what it does when you see it
six months from now in the heat of battle?

Also, the one-liner does not allow you to explicitly close the file
descriptor from popen. This could be a serious problem if it gets run
hundreds of times in a loop.
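
One way to address the descriptor concern is to close the pipe deterministically. A minimal sketch, assuming Python 3.7 or later and swapping in the subprocess module for os.popen (which is not what the posts above use); the child command is a hypothetical stand-in for 'some command'.

```python
import subprocess
import sys

# Hypothetical stand-in for 'some command': a child Python that prints two lines.
command = [sys.executable, "-c", "print('first'); print('second')"]

# Leaving the with-block closes the pipe and waits for the child process.
with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
    # Strip only line endings, not other leading or trailing whitespace.
    lines = [line.rstrip("\r\n") for line in proc.stdout]

script = "".join(lines)
```

The list comprehension stays as compact as the one-liner, but the pipe is released even when the snippet runs many times in a loop.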

Have fun,
-- George Young

--
"Are the gods not just?" "Oh no, child.
What would become of us if they were?" (CSL)

Jul 18 '05 #6

Jean Brouwers

What about handling all output as one string?

script = os.popen('some command').read()
script = script.replace("'", "\\'") # replace ' with \'
script = script.replace("\r", "") # remove cr
script = script.replace("\\n", "$___n") # replace \n
script = script.replace("\n", "@___n") # replace nl

/Jean Brouwers

Jul 18 '05 #7

David Fraser

Istvan Albert wrote:

David Fraser wrote:
Except if you're aiming for database independence, as different
database drivers support different means of escaping parameters...

IMHO database independence is both overrated and impossible.
You can always try the 'greatest common factor' approach but that
causes more trouble (and work) than it saves.

Not overrated or impossible. It's part of our business model. It works.

I agree with the previous poster stating that escaping should be done
in the DB API, but it is better to use the 'typed' escaping:

sql = 'SELECT * FROM users WHERE user_id=%d AND user_passwd=%s'
par = [1, 'something']
cursor.execute(sql, par)

Better if the database driver you are using supports it... otherwise it's
useless.

I think there is a need to drive towards some sort of standard approach
to this in DB-API (maybe version 3?), since the lack of one nullifies the
benefit of parameters for anyone using multiple database drivers.
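
DB-API 2.0 does expose one hook for this: each driver module declares its placeholder convention in a module-level `paramstyle` attribute. A rough sketch of building a statement from it, assuming Python 3 and using sqlite3 as the example driver; the helper `insert_sql` and the marker table are hypothetical, and only the two positional styles are covered.

```python
import sqlite3

# Placeholder text for the positional paramstyles handled by this sketch.
POSITIONAL_MARKERS = {"qmark": "?", "format": "%s"}

def insert_sql(driver, table, column):
    """Build a one-column INSERT using the driver's own placeholder style.

    table and column are interpolated directly, so they must be trusted
    names from your own code, never user input.
    """
    try:
        marker = POSITIONAL_MARKERS[driver.paramstyle]
    except KeyError:
        raise NotImplementedError("unsupported paramstyle: %s" % driver.paramstyle)
    return "INSERT INTO %s (%s) VALUES (%s)" % (table, column, marker)

# sqlite3 declares the 'qmark' style, so the marker is a question mark.
sql = insert_sql(sqlite3, "output", "line")
```

This only papers over the placeholder syntax; drivers can still differ in other ways, which is the gap the post above is pointing at.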

David

Jul 18 '05 #8

Lonnie Princehouse

Welcome to Python =)

Somebody else already mentioned checking out the DB-API's way of
escaping data; this is a good idea. Besides that, here are some
general tips:

1. Consider using out.xreadlines() if you only need one line at a
time:

for line in out.xreadlines():
    ...

If you need all of the data at once, try out.read()

2. You can use negative numbers to index relative to the end of a
sequence:

line[0:-1] is equivalent to line[0:len(line)-1]
(i.e. cut off the last character of a string)

You can also use line.strip() to remove trailing whitespace,
including newlines.

3. If you omit the index on either side of a slice, Python will
default to the beginning and end of a sequence:

line[:] is equivalent to line[0:len(line)]

4. Check out the regular expression module. Here's how to read all of
your output and make multiple escape substitutions. Smashing this
into one regular expression means you only need one pass over the
data. It also avoids string concatenation.

import re, os

out = os.popen('some command')
data = out.read()

# Map each literal piece of text to its replacement.
substitution_map = {
    "'" : r"\'",
    "\n": "$___n",
}

def sub_func(match_object, smap=substitution_map):
    return smap[match_object.group(0)]

# re.escape keeps keys with special characters from being read as regex syntax.
escape_expr = re.compile('|'.join([re.escape(k) for k in substitution_map.keys()]))

escaped_data = escape_expr.sub(sub_func, data)

# et voila... now you've got a big escaped string without even
# writing a single for loop. Tastes great, less filling.

(caveat: I didn't run this code. It might have typos.)
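
A runnable variant of the single-pass idea, assuming Python 3. As a guess at the original poster's intent, it covers all three substitutions from the question: the quote, the literal backslash-n, and the real line ending (both CRLF and LF). The names `SUBSTITUTIONS` and `escape_output` and the sample string are made up for illustration.

```python
import re

# Substitutions taken from the original loop; keys are literal text, not patterns.
SUBSTITUTIONS = {
    "'": "\\'",        # single quote -> backslash + quote
    "\\n": "$___n",    # literal backslash-n (two characters) -> $___n
    "\r\n": "@___n",   # CRLF line ending -> @___n
    "\n": "@___n",     # bare LF line ending -> @___n
}

# Longest keys first so "\r\n" wins over "\n"; re.escape protects the backslash.
_pattern = re.compile(
    "|".join(re.escape(key) for key in sorted(SUBSTITUTIONS, key=len, reverse=True))
)

def escape_output(data):
    """Apply every substitution in a single pass over data."""
    return _pattern.sub(lambda match: SUBSTITUTIONS[match.group(0)], data)

sample = "printf('a\\n');\r\nend\n"
escaped = escape_output(sample)
```

Because the text produced by one replacement is never rescanned, the backslash added in front of a quote cannot be picked up by a later rule, which is a risk with chained replace() calls applied in the wrong order.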

Jul 18 '05 #9

Bart Nessux

george young wrote:

How about:

lines = []
out = os.popen('some command')
for l in out:
    lines.append(l.strip())
script = ''.join(lines)
out.close()

The "strip" actually removes white space from front and back of the string;
you could say l.strip('\n') if you only want the newlines removed (or '\r'
if they're really carriage return characters.)

The above is a great solution... should make for a good speedup. It's
how I might approach it.

Or if you want a clever (and most CPU efficient!) one-liner:

script = [l.strip() for l in os.popen('some command')]

Clever programmers should be shot! I've had to work behind them... they
are too smart for their own good. They think everyone else in the world
is as clever as they are... this is where they are wrong ;)

Jul 18 '05 #10

george young

On Wed, 14 Jul 2004 15:40:15 -0400
Bart Nessux <ba*********@hotmail.com> threw this fish to the penguins:

george young wrote:
How about:

lines = []
out = os.popen('some command')
for l in out:
    lines.append(l.strip())
script = ''.join(lines)
out.close()

The "strip" actually removes white space from front and back of the string;
you could say l.strip('\n') if you only want the newlines removed (or '\r'
if they're really carriage return characters.)

The above is a great solution... should make for a good speedup. It's
how I might approach it.

Or if you want a clever (and most CPU efficient!) one-liner:

script = ''.join([l.strip() for l in os.popen('some command')])

[fixed up a bit...I forgot about the join]

Clever programmers should be shot! I've had to work behind them... they
are too smart for their own good. They think everyone else in the world
is as clever as they are... this is where they are wrong ;)

Oh, all right. How about:

out = os.popen('some command')
temp_script = [l.strip() for l in out]
script = ''.join(temp_script)

That's clear enough, and still takes advantage of the efficiency of
the list comprehension! (I admit that, for reading the whole file,
the other postings of regexp substitution on the total string are
certainly faster, given enough RAM, but still not as clear and concise
and elegant ... blah blah blah as mine...)

-- George Young

--
"Are the gods not just?" "Oh no, child.
What would become of us if they were?" (CSL)

Jul 18 '05 #11

Steve

george young wrote:

On Wed, 14 Jul 2004 15:40:15 -0400
Bart Nessux <ba*********@hotmail.com> threw this fish to the penguins:

george young wrote:
How about:

lines = []
out = os.popen('some command')
for l in out:
    lines.append(l.strip())
script = ''.join(lines)
out.close()

The "strip" actually removes white space from front and back of the string;
you could say l.strip('\n') if you only want the newlines removed (or '\r'
if they're really carriage return characters.)

The above is a great solution... should make for a good speedup. It's
how I might approach it.

Or if you want a clever (and most CPU efficient!) one-liner:

script = ''.join([l.strip() for l in os.popen('some command')])

[fixed up a bit...I forgot about the join]

Clever programmers should be shot! I've had to work behind them... they
are too smart for their own good. They think everyone else in the world
is as clever as they are... this is where they are wrong ;)

Oh, all right. How about:

out = os.popen('some command')
temp_script = [l.strip() for l in out]
script = ''.join(temp_script)

That looks good, but the problem is that I don't want to strip off the
end-of-line characters etc., because I need to reproduce/print the
output exactly as it was at a later stage. What's more, I need to
print it out on an HTML page, so if I know the difference between \n
(in code) and the implicit end-of-line character, I can interpret that
in HTML accordingly. For example, the output can contain something like:

printf("Hey there\n");

so there's a \n embedded inside the text as well as the end-of-line
character, which isn't visible. Although escaping characters using a
DB-API function might do the trick, this still won't help me much in
the end, where I need to print a "<br>" for each end-of-line character.

--
Steve
That's clear enough, and still takes advantage of the efficiency of
the list comprehension! (I admit that, for reading the whole file,
the other postings of regexp substitution on the total string are
certainly faster, given enough RAM, but still not as clear and concise
and elegant ... blah blah blah as mine...)

-- George Young

Jul 18 '05 #12

Steve

Jean Brouwers wrote:

What about handling all output as one string?

script = os.popen('some command').read()
script = script.replace("'", "\\'") # replace ' with \'
script = script.replace("\r", "") # remove cr
script = script.replace("\\n", "$___n") # replace \n
script = script.replace("\n", "@___n") # replace nl

This won't do any better than what I was already doing. I need the code
to be very fast, and this will only end up creating a new copy
every time the string is modified (strings are immutable). I
really like the idea of using regex for this (proposed by Lonnie), but I
still need to get the hang of it.

Cheers,
Steve

Jul 18 '05 #13
