
What is a perl hash in python

I am new to Python. I am trying to modify and understand a script someone
else wrote. I am trying to make sense of the following code snippet. I know
line 7 would be best coded with regex. I first would like to understand
what was coded originally. thelistOut looks like a hash to me (I'm more
familiar with perl). Perhaps someone could translate from perl to python
for me - not in code but just in concept.
Here is the code. This script is reading the list thelistOut and then
removing any items in RSMlist and taking the remainder and putting them in
graphAddressOut with the formatting.

This is a SAMPLE of what is in the lists referenced below in the loop:
thelistOut = [(632,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']), (145,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']), (0,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log'])]

RSMList = ['172.16.0.1_1', '172.16.0.1_2', '172.16.0.1_3', '172.16.0.1_4',
'172.16.0.1_5']

#--------------------------Loop 1 -------------------------

w = 0
while w < 45:

    fileOut = string.split(thelistOut[w][1][0],".log")
    fileOutSplitedCommon = string.split(fileOut[0], "main/")
    fileOut2D = string.split(fileOutSplitedCommon[1], "/")
    fileOut = string.split(fileOut[0],"data-dist")

    if fileOut2D[1] in RSMList:
        w = w + 1
        continue
    graphAddressOut = (tag1 + logUrl + fileOut[1] + extention1 + tag2 +
        "<b>SWITCH: " + string.swapcase(fileOut2D[0]) +
        "&nbsp;&nbsp;&nbsp;PORT ID: " + fileOut2D[1] + "</b><br>" +
        imgTitleTag + imgTag1 + logUrl + fileOut[1] + extention2 +
        imgTag2 + tag3 + tag5)
    outputOut.append(graphAddressOut)
    strOut = strOut + graphAddressOut

    w = w + 1

#--------------------------Loop 1 -------------------------

--

Karyn Williams
Network Services Manager
California Institute of the Arts
ka***@calarts.edu
http://www.calarts.edu/network
Jan 12 '07 #1


In <ma***************************************@python.org>, Karyn Williams
wrote:
> I am new to Python. I am trying to modify and understand a script someone
> else wrote. I am trying to make sense of the following code snippet. I know
> line 7 would be best coded with regex.

What is line 7 in the snippet?

> I first would like to understand what was coded originally. thelistOut
> looks like a hash to me (I'm more familiar with perl).

It's a list which contains tuples. Each tuple contains an integer and a
list with one string that looks like a pathname.

> Perhaps someone could translate from perl to python for me - not in code
> but just in concept.

Which Perl? You gave us Python!?

> Here is the code. This script is reading the list thelistOut and then
> removing any items in RSMlist and taking the remainder and putting them
> in graphAddressOut with the formatting.

There's nothing removed from `thelistOut`. Entries whose file basename
(without the extension) is in `RSMList` are simply skipped and never added
to `outputOut`.

> This is a SAMPLE of what is in the lists referenced below in the loop:
> thelistOut = [(632,
> ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']), (145,
> ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']), (0,
> ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log'])]
>
> RSMList = ['172.16.0.1_1', '172.16.0.1_2', '172.16.0.1_3',
> '172.16.0.1_4', '172.16.0.1_5']
>
> #--------------------------Loop 1 -------------------------
>
> w = 0
> while w < 45:

The loop looks odd. Is it really a literal 45 here, or are all elements of
`thelistOut` processed? If you don't need `w` for anything other than
indexing into the list, a for loop over the list is much cleaner than a
while loop that updates the counter manually; if you do need the index,
loop over an `xrange()` object instead. That the second element of each
tuple always seems to be a list with one item looks odd too.
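
A minimal sketch of the for-loop suggestion, written for Python 3 and using the sample data from the question. The names `count`, `path_list`, `paths` and `indexed` are illustrative and do not come from the original script.

```python
# Sample data copied from the question.
thelistOut = [
    (632, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']),
    (145, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']),
    (0, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log']),
]

# Iterate over the elements directly: no counter and no hard-coded 45.
paths = []
for count, path_list in thelistOut:
    paths.append(path_list[0])

# If the position is needed as well, enumerate() supplies it.
indexed = []
for w, (count, path_list) in enumerate(thelistOut):
    indexed.append((w, path_list[0]))
```

The loop stops by itself at the end of the list, so it cannot run past the last element the way a fixed `while w < 45` can.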

>     fileOut = string.split(thelistOut[w][1][0],".log")
>     fileOutSplitedCommon = string.split(fileOut[0], "main/")
>     fileOut2D = string.split(fileOutSplitedCommon[1], "/")
>     fileOut = string.split(fileOut[0],"data-dist")

This might be more readable and understandable if `os.path.splitext()` and
`os.path.split()` were used.
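
A small sketch of what the `os.path` version might look like for one of the sample paths. The variable names `root`, `ext`, `directory`, `port_id` and `switch` are assumptions chosen for illustration; only the path itself comes from the question.

```python
import os.path

path = '/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log'

# splitext() separates the extension from the rest of the path.
root, ext = os.path.splitext(path)

# split() separates the last path component from its directory.
directory, port_id = os.path.split(root)

# The directory's own last component plays the role of fileOut2D[0].
switch = os.path.basename(directory)
```

Here `port_id` corresponds to `fileOut2D[1]` in the original loop and `switch` to `fileOut2D[0]`, without relying on the literal strings ".log" and "main/".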

>     if fileOut2D[1] in RSMList:
>         w = w + 1
>         continue

It might be cleaner to negate the test and use the remaining code as the
body of that ``if`` statement.
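
A sketch of the negated test, assuming the port ids have already been extracted from the paths. `port_ids` is made-up input for illustration (its second entry is deliberately one that appears in `RSMList`); `RSMList` and `outputOut` are the names used in the question.

```python
RSMList = ['172.16.0.1_1', '172.16.0.1_2', '172.16.0.1_3', '172.16.0.1_4',
           '172.16.0.1_5']

# Illustrative input, not taken from the original script.
port_ids = ['172.16.0.23_9', '172.16.0.1_2', '172.16.0.23_5']

outputOut = []
for port_id in port_ids:
    if port_id not in RSMList:
        # Everything that used to follow the "continue" goes here.
        outputOut.append(port_id)
```

With a for loop and `not in`, neither the `continue` nor the duplicated `w = w + 1` is needed.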

>     graphAddressOut = tag1 + logUrl + fileOut[1] + extention1 + tag2 +
>     "<b>SWITCH: " + string.swapcase(fileOut2D[0]) + "&nbsp;&nbsp;&nbsp;PORT
>     ID: " + fileOut2D[1] + "</b><br>" + imgTitleTag + imgTag1 + logUrl +
>     fileOut[1] + extention2 + imgTag2 + tag3 + tag5
>     outputOut.append(graphAddressOut)
>     strOut = strOut + graphAddressOut

That's an unreadable mess. Better use string formatting.
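
One way string formatting could look here, as a sketch only. The thread never shows the values of `tag1`, `logUrl`, `extention1` and the other tag variables, so the template text, the URL, and the ".html" and "-day.png" suffixes below are hypothetical stand-ins, not the script's real markup.

```python
# Hypothetical stand-ins: the real tag and URL values are not in the thread.
log_url = 'http://example.invalid'
rel_path = '/mrtg/main/test/172.16.0.23_9'
switch = 'test'
port_id = '172.16.0.23_9'

# One template with named placeholders replaces the long chain of "+".
TEMPLATE = (
    '<a href="%(url)s%(path)s.html">'
    '<b>SWITCH: %(switch)s&nbsp;&nbsp;&nbsp;PORT ID: %(port)s</b><br>'
    '<img src="%(url)s%(path)s-day.png"></a>'
)

graph_address_out = TEMPLATE % {
    'url': log_url,
    'path': rel_path,
    'switch': switch.swapcase(),
    'port': port_id,
}
```

The markup now lives in one place, and the values that vary per entry are visible as a short mapping.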

And last but not least: a hash is called a dictionary in Python.

Ciao,
Marc 'BlackJack' Rintsch
Jan 12 '07 #2

At 06:54 PM 1/12/07 GMT, you wrote:
> On Fri, 12 Jan 2007 09:15:44 -0800, Karyn Williams <ka***@calarts.edu>
> declaimed the following in comp.lang.python:
> > I am new to Python. I am trying to modify and understand a script someone
> > else wrote. I am trying to make sense of the following code snippet. I know
>
> "someone else" didn't write Python either, looking at that mishmash
> <G>

Thanks, Marc and Dennis.

Actually, as I think about it, this operation should be able to be done in
one loop, not the ten or so that it is currently taking.
Read in each "*.log" file (excluding certain named files such as "1.log"),
total up x rows of the 2nd and 3rd columns, push (filename, total col 2)
and (filename, total col 3) onto two lists, sort each in reverse (sort -r)
and generate one web page per list with the top ten.

That is what this script is supposed to be doing.
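
A rough sketch of that single pass in Python 3, under several assumptions that the thread does not confirm: each log line holds whitespace-separated integer columns, "x rows" means the first `rows` lines of each file, and the function name `top_totals` and all of its parameters are hypothetical. Generating the web pages is left out. The pairs are stored as (total, filename), the same order seen in `thelistOut`, so that a plain sort orders them by total.

```python
import glob
import os.path


def top_totals(log_dir, excluded_names, rows, top_n=10):
    """Return two lists of (total, filename): top by column 2 and by column 3."""
    totals_col2 = []
    totals_col3 = []
    for path in glob.glob(os.path.join(log_dir, '*.log')):
        name = os.path.basename(path)
        if name in excluded_names:
            continue  # skip the files that must not be reported
        sum2 = 0
        sum3 = 0
        with open(path) as handle:
            for line_no, line in enumerate(handle):
                if line_no >= rows:
                    break  # only the first `rows` lines are totalled
                fields = line.split()
                if len(fields) < 3:
                    continue  # ignore blank or short lines
                sum2 += int(fields[1])
                sum3 += int(fields[2])
        totals_col2.append((sum2, name))
        totals_col3.append((sum3, name))
    # Largest totals first, like "sort -r".
    totals_col2.sort(reverse=True)
    totals_col3.sort(reverse=True)
    return totals_col2[:top_n], totals_col3[:top_n]
```

A call such as `top_totals('/path/to/logs', {'1.log'}, rows=100)` would then hand back the two top-ten lists for whatever code builds the pages; the directory and row count in that call are placeholders.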
> > line 7 would be best coded with regex. I first would like to understand
> > what was coded originally. thelistOut looks like a hash to me (I'm more
> > familiar with perl). Perhaps someone could translate from perl to python
> > for me - not in code but just in concept.
> > Here is the code. This script is reading the list thelistOut and then
> > removing any items in RSMlist and taking the remainder and putting them in
> > graphAddressOut with the formatting.
> >
> > This is a SAMPLE of what is in the lists referenced below in the loop:
> > thelistOut = [(632,
> > ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']), (145,
> > ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']), (0,
> > ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log'])]
>
> This is a list containing three elements. Each element is a tuple
> containing two sub-elements. The first sub-element appears to be an
> integer (I have no idea of the significance of the value at this time).
> The second sub-element is another list containing a single
> sub-sub-element -- that sub-sub-element is a string (file path name).
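
The nesting described above can be checked by indexing one level at a time. This is a small Python 3 sketch over the sample data; the names `first`, `number`, `path_list`, `path` and `kinds` are illustrative.

```python
thelistOut = [
    (632, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']),
    (145, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']),
    (0, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log']),
]

first = thelistOut[0]    # the first element of the list: a tuple
number = first[0]        # the tuple's first sub-element: an integer
path_list = first[1]     # the tuple's second sub-element: a list
path = path_list[0]      # the only item in that list: a string

# The type names at each level, outermost first.
kinds = [type(first).__name__, type(number).__name__,
         type(path_list).__name__, type(path).__name__]
```

So `thelistOut[w][1][0]` in the original loop reads as "element w, second sub-element, first item", which is the path string.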

> > RSMList = ['172.16.0.1_1', '172.16.0.1_2', '172.16.0.1_3', '172.16.0.1_4',
> > '172.16.0.1_5']
> >
> > #--------------------------Loop 1 -------------------------
> >
> > w = 0
> > while w < 45:
>
> for w in xrange(45):
>
> >     fileOut = string.split(thelistOut[w][1][0],".log")
> >     fileOutSplitedCommon = string.split(fileOut[0], "main/")
> >     fileOut2D = string.split(fileOutSplitedCommon[1], "/")
> >     fileOut = string.split(fileOut[0],"data-dist")
>
> Direct use of the string module is now frowned upon.

For future reference, why is direct use of the string module frowned upon,
and what does one use instead ?

> Also, since these are file path names, using operations in the os.path
> module would be more appropriate...

I'll look into os.path, but what this loop should be doing is matching and
removing the entries from thelistOut (and thelistIn) that are listed in
RSMList. Or, as is being done, not writing them to the new list, outputOut
(graphAddressOut).
It's just a matching operation, not really a path/filename op. This is why I
will be changing this to a regex.
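
A sketch of what the regex-based match might look like in Python 3. The pattern and the helper names `LOG_PATH_RE`, `split_log_path` and `is_excluded` are assumptions for illustration; the pattern simply treats the directory after "main/" as the switch name and the file name without ".log" as the port id, which matches the sample paths in the question.

```python
import re

RSMList = ['172.16.0.1_1', '172.16.0.1_2', '172.16.0.1_3', '172.16.0.1_4',
           '172.16.0.1_5']

# Hypothetical pattern: group 1 is the directory after "main/",
# group 2 is the file name without its ".log" extension.
LOG_PATH_RE = re.compile(r'main/([^/]+)/([^/]+)\.log$')


def split_log_path(path):
    """Return (switch, port_id), or None if the path does not match."""
    match = LOG_PATH_RE.search(path)
    if match is None:
        return None
    return match.group(1), match.group(2)


def is_excluded(path, excluded=frozenset(RSMList)):
    """True if the port id taken from the path is listed in RSMList."""
    parts = split_log_path(path)
    return parts is not None and parts[1] in excluded
```

Filtering then becomes a one-liner such as `[p for p in paths if not is_excluded(p)]`, where `paths` stands for whatever list of path strings the script has collected.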

> >     if fileOut2D[1] in RSMList:
> >         w = w + 1
> >         continue
>
> Confusing logic, having two places where "w" is incremented. Using a
> "for" loop would mean neither increment statement is needed. Actually,
> "w" isn't even needed, replace the while/for with
>
>     for fid in thelistOut:
>         fileOut = fid[1][0]  # that [1] is getting the second element of
>                              # the tuple, and the [0] gets the string out
>                              # of that list (why a list of one element
>                              # string data?)
>
> >     graphAddressOut = tag1 + logUrl + fileOut[1] + extention1 + tag2 +
> >     "<b>SWITCH: " + string.swapcase(fileOut2D[0]) + "&nbsp;&nbsp;&nbsp;PORT
> >     ID: " + fileOut2D[1] + "</b><br>" + imgTitleTag + imgTag1 +
> >     logUrl + fileOut[1] + extention2 + imgTag2 + tag3 + tag5
>
> This could be cleaned up too, but I'll ignore it at the moment.
>
> >     outputOut.append(graphAddressOut)
> >     strOut = strOut + graphAddressOut
> >
> >     w = w + 1
> >
> > #--------------------------Loop 1 -------------------------


> I think what you call a "hash" in PERL is a dictionary in Python:
>
>     dct = { key1 : value1, ... , keyn : valuen }
>
>     aval = dct[keyx]
>
> Nothing of the sort used in the code you show above.


--

Karyn Williams
Network Services Manager
California Institute of the Arts
ka***@calarts.edu
http://www.calarts.edu/network
Jan 12 '07 #3

Karyn Williams wrote:
> For future reference, why is direct use of the string module frowned upon,
> and what does one use instead ?

Karyn,

http://docs.python.org/lib/node42.html

rd
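
The linked page is not reproduced in this thread. As a sketch of the usual replacement, the functions used in the script have equivalent methods on string objects, and in Python 3 `string.split` and `string.swapcase` no longer exist as module functions. The variable `switch_label` is an illustrative name; the other names mirror the script.

```python
path = '/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log'

# string.split(path, ".log")  becomes  path.split(".log")
fileOut = path.split('.log')
fileOutSplitedCommon = fileOut[0].split('main/')
fileOut2D = fileOutSplitedCommon[1].split('/')

# string.swapcase(s)  becomes  s.swapcase()
switch_label = fileOut2D[0].swapcase()
```

The method form needs no `import string` and reads left to right: the object being split comes first, the separator second.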

Jan 12 '07 #4

Karyn Williams wrote:
> I am new to Python. I am trying to modify and understand a script someone
> else wrote. I am trying to make sense of the following code snippet. I know
> line 7 would be best coded with regex. I first would like to understand
> what was coded originally. thelistOut looks like a hash to me (I'm more
> familiar with perl).

It's not a hash (the Python type for hashtables is 'dict'), it's a list
of 2-tuples. FWIW, the dict type can accept such a list as an argument
to its constructor - but then you lose the ordering.
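
A short sketch of passing such a list to `dict()`, using the sample data. The names `by_count`, `first_paths` and `collapsed` are illustrative. Two caveats: since Python 3.7 a dict keeps insertion order, so the remark about ordering applies to older versions such as the one current when this thread was written; and tuples that share the same first item collapse into one key, with the last value winning.

```python
thelistOut = [
    (632, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']),
    (145, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']),
    (0, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log']),
]

# Each 2-tuple becomes one key/value pair: integer -> one-element list.
by_count = dict(thelistOut)
first_paths = by_count[632]

# Duplicate keys are not kept separately; the later pair overwrites.
collapsed = dict([(0, ['a.log']), (0, ['b.log'])])
```

Because several log files could easily share the same integer, a list of tuples may well be the safer structure for this data.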

Also, the data structure is somewhat weird, since the second item of
each tuple is always a one-element list.

> Perhaps someone could translate from perl to python

Do you mean "from Python to Perl" ?

> for me - not in code but just in concept.
> Here is the code. This script is reading the list thelistOut and then
> removing any items in RSMlist and taking the remainder and putting them in
> graphAddressOut with the formatting.
>
> This is a SAMPLE of what is in the lists referenced below in the loop:
> thelistOut = [(632,
> ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']), (145,
> ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']), (0,
> ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log'])]
>
> RSMList = ['172.16.0.1_1', '172.16.0.1_2', '172.16.0.1_3', '172.16.0.1_4',
> '172.16.0.1_5']
>
> #--------------------------Loop 1 -------------------------
>
> w = 0
> while w < 45:
>
>     fileOut = string.split(thelistOut[w][1][0],".log")
>     fileOutSplitedCommon = string.split(fileOut[0], "main/")
>     fileOut2D = string.split(fileOutSplitedCommon[1], "/")
>     fileOut = string.split(fileOut[0],"data-dist")
>
>     if fileOut2D[1] in RSMList:
>         w = w + 1
>         continue
>     graphAddressOut = tag1 + logUrl + fileOut[1] + extention1 + tag2 +
>     "<b>SWITCH: " + string.swapcase(fileOut2D[0]) + "&nbsp;&nbsp;&nbsp;PORT
>     ID: " + fileOut2D[1] + "</b><br>" + imgTitleTag + imgTag1 +
>     logUrl + fileOut[1] + extention2 + imgTag2 + tag3 + tag5
>     outputOut.append(graphAddressOut)
>     strOut = strOut + graphAddressOut
>
>     w = w + 1
>
> #--------------------------Loop 1 -------------------------

Yuck. This code stinks. Whoever wrote this ought to be shot down. I
refuse to try & clean this mess unless I get paid (and well paid).

Jan 12 '07 #5


Karyn Williams wrote:
> I am new to Python. I am trying to modify and understand a script someone
> else wrote. I am trying to make sense of the following code snippet. I know
> line 7 would be best coded with regex. I first would like to understand
> what was coded originally. thelistOut looks like a hash to me (I'm more
> familiar with perl).

thelistOut seems to be a list of tuples. It also seems that one of the
tuple elements is a list containing a single string. To be honest,
this is one of the ugliest examples of Python code I have ever seen.
I am not sure I would trust code written like this at all. One can very
often tell the competence of the programmer from the looks of the code.

To answer the subject: An associative container in Python is called a
'dictionary'. CPython dictionaries are implemented using hash tables
(and one of the fastest hashing algorithms known to man). There is
nothing in the Python semantics that mandates this particular
implementation of dictionaries, though. Balanced binary trees could
have been used instead of hashes, as they usually are in STL's
associative containers, but in CPython a dictionary is implemented with
a hash table under the hood.

Dictionaries work like this:

mydict = { key1 : val1, key2 : val2, key3 : val3 }
oldval3 = mydict[key3]
mydict[key3] = newval3
mydict[key4] = val4
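
The same operations as a runnable Python 3 sketch. The data is illustrative: it pairs the port ids from the question's sample paths with the integers that accompany them in `thelistOut`, and the key `'172.16.0.23_7'` and the new values are made up.

```python
# Illustrative data: port id -> the integer stored with it in thelistOut.
totals = {'172.16.0.23_9': 632, '172.16.0.23_13': 145, '172.16.0.23_5': 0}

old_value = totals['172.16.0.23_5']   # look up a value by its key
totals['172.16.0.23_5'] = 12          # replace the value of an existing key
totals['172.16.0.23_7'] = 88          # add a new key (hypothetical port)

has_port = '172.16.0.23_9' in totals       # membership is tested on keys
missing = totals.get('172.16.0.1_1', 0)    # default instead of a KeyError

# Key/value pairs ordered by value, largest first.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

For someone coming from Perl, `totals[key]` plays the role of `$hash{key}`, and `in` does the job of `exists`.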

Jan 13 '07 #6
