What is a perl hash in python

I am new to Python. I am trying to modify and understand a script someone
else wrote. I am trying to make sense of the following code snippet. I know
line 7 would be best coded with regex. I first would like to understand
what was coded originally. thelistOut looks like a hash to me (I'm more
familiar with perl). Perhaps someone could translate from perl to python
for me - not in code but just in concept.
Here is the code. This script is reading the list thelistOut and then
removing any items in RSMlist and taking the remainder and putting them in
graphAddressOut with the formatting.

This is a SAMPLE of what is in the lists referenced below in the loop:
thelistOut = [(632,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']), (145,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']), (0,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log'])]

RSMList = ['172.16.0.1_1', '172.16.0.1_2', '172.16.0.1_3', '172.16.0.1_4',
'172.16.0.1_5']

#--------------------------Loop 1 -------------------------

w = 0
while w < 45:

    fileOut = string.split(thelistOut[w][1][0],".log")
    fileOutSplitedCommon = string.split(fileOut[0], "main/")
    fileOut2D = string.split(fileOutSplitedCommon[1], "/")
    fileOut = string.split(fileOut[0],"data-dist")

    if fileOut2D[1] in RSMList:
        w = w + 1
        continue
    graphAddressOut = tag1 + logUrl + fileOut[1] + extention1 + tag2 + \
        "<b>SWITCH: " + string.swapcase(fileOut2D[0]) + \
        "&nbsp;&nbsp;&nbsp;PORT ID: " + fileOut2D[1] + "</b><br>" + \
        imgTitleTag + imgTag1 + logUrl + fileOut[1] + extention2 + \
        imgTag2 + tag3 + tag5
    outputOut.append(graphAddressOut)
    strOut = strOut + graphAddressOut

    w = w + 1

#--------------------------Loop 1 -------------------------

--

Karyn Williams
Network Services Manager
California Institute of the Arts
ka***@calarts.edu
http://www.calarts.edu/network
Jan 12 '07 #1
In <ma***************************************@python. org>, Karyn Williams
wrote:
I am new to Pyton. I am trying to modify and understand a script someone
else wrote. I am trying to make sense of the following code snippet. I know
line 7 would be best coded with regex.
What is line 7 in the snippet?
I first would like to understand what was coded originally. thelistOut
looks like a hash to me (I'm more familiar with perl).
It's a list which contains tuples. Each tuple contains an integer and a
list with one string that looks like a pathname.
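To make that structure concrete, here is a small sketch in present-day Python 3 that takes the sample data from the post apart one layer at a time. The names `count` and `paths` are made up for illustration; the thread does not say what the integer means.

```python
# Sample data copied from the original post.
thelistOut = [
    (632, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']),
    (145, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']),
    (0, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log']),
]

first = thelistOut[0]   # a tuple: (integer, list)
count, paths = first    # tuple unpacking; the names are illustrative
path = paths[0]         # the single string inside the inner list

# thelistOut[w][1][0] in the script is the same lookup in one expression.
assert path == thelistOut[0][1][0]
print(type(thelistOut).__name__, type(first).__name__, type(paths).__name__)
```

Nothing here is looked up by key, which is why the structure is a list and not a hash.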
Perhaps someone could translate from perl to python for me - not in code
but just in concept.
Which Perl? You gave us Python!?
Here is the code. This script is reading the list thelistOut and then
removing any items in RSMlist and taking the remainder and putting them
in graphAddressOut with the formatting.
There's nothing removed from `thelistOut`. Names where the
filename/basename without the extension is in `RSMList` are not processed
and added to `outputOut`.
This is a SAMPLE of what is in the lists referenced below in the loop:
thelistOut = [(632,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']), (145,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']), (0,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log'])]

RSMList = ['172.16.0.1_1', '172.16.0.1_2', '172.16.0.1_3',
'172.16.0.1_4', '172.16.0.1_5']

#--------------------------Loop 1 -------------------------

w = 0
while w < 45:
The loop looks odd. Is it really a literal 45 here, or are all elements of
`thelistOut` processed? If you don't need `w` for anything other than
indexing into the list, a for loop over the list (or over an `xrange()`
object) is much cleaner than a while loop that updates the counter
manually. It also looks odd that the second element of each tuple always
seems to be a list with one item.
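A minimal sketch of the for-loop suggestion, written for Python 3 (where `xrange()` has become `range()`) and using the sample data from the post. The list `collected` is a hypothetical stand-in for whatever the loop body does with each path.

```python
thelistOut = [
    (632, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']),
    (145, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']),
    (0, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log']),
]

# Iterate over the elements directly: no counter to initialise or increment,
# and no hard-coded length such as 45.
collected = []
for count, names in thelistOut:
    collected.append(names[0])

# If the index is needed as well, enumerate() supplies it.
for w, (count, names) in enumerate(thelistOut):
    assert thelistOut[w][1][0] == names[0]
```

With this form a `continue` needs no extra `w = w + 1` in front of it.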
fileOut = string.split(thelistOut[w][1][0],".log")
fileOutSplitedCommon = string.split(fileOut[0], "main/")
fileOut2D = string.split(fileOutSplitedCommon[1], "/")
fileOut = string.split(fileOut[0],"data-dist")
This might be more readable and understandable if `os.path.splitext()` and
`os.path.split()` were used.
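As a sketch of what that could look like, the following takes one sample path from the post and recovers the two pieces the script calls `fileOut2D[0]` and `fileOut2D[1]`. The names `switch` and `port_id` are guesses based on the "SWITCH" and "PORT ID" labels in the output string.

```python
import os.path

path = '/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log'

# splitext() cuts at the last dot only, so the dots in the IP address survive.
root, ext = os.path.splitext(path)

# split() separates the final path component from its directory.
directory, port_id = os.path.split(root)

# The last directory name is what the script swaps the case of.
switch = os.path.basename(directory)

print(switch, port_id, ext)
```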
if fileOut2D[1] in RSMList:
w = w + 1
continue
Might be cleaner to negate the test and use the remaining code as body of
that ``if`` statement.
graphAddressOut = tag1 + logUrl + fileOut[1] + extention1 + tag2 +
"<b>SWITCH: " + string.swapcase(fileOut2D[0]) + "&nbsp;&nbsp;&nbsp;PORT
ID: " + fileOut2D[1] + "</b><br>" + imgTitleTag + imgTag1 + logUrl +
fileOut[1] + extention2 + imgTag2 + tag3 + tag5
outputOut.append(graphAddressOut)
strOut = strOut + graphAddressOut
That's an unreadable mess. Better use string formatting.
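A hedged sketch of the string-formatting idea, limited to the middle part of the line because the thread never shows what `tag1`, `logUrl`, `imgTag1` and the other variables contain. The values below are placeholders standing in for `fileOut2D[0]` and `fileOut2D[1]`.

```python
# Placeholder values; the real ones come from splitting the log path.
switch = "test"             # plays the role of fileOut2D[0]
port_id = "172.16.0.23_9"   # plays the role of fileOut2D[1]

# One template with named slots instead of a long chain of "+".
template = "<b>SWITCH: %(switch)s&nbsp;&nbsp;&nbsp;PORT ID: %(port)s</b><br>"
label = template % {"switch": switch.swapcase(), "port": port_id}

# The same result with an f-string (Python 3.6 and later).
label_f = f"<b>SWITCH: {switch.swapcase()}&nbsp;&nbsp;&nbsp;PORT ID: {port_id}</b><br>"

assert label == label_f
print(label)
```

The remaining tags could be added to the template in the same way once their contents are known.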

And last but not least: a hash is called a dictionary in Python.

Ciao,
Marc 'BlackJack' Rintsch
Jan 12 '07 #2
At 06:54 PM 1/12/07 GMT, you wrote:
>On Fri, 12 Jan 2007 09:15:44 -0800, Karyn Williams <ka***@calarts.edu>
declaimed the following in comp.lang.python:
>I am new to Pyton. I am trying to modify and understand a script someone
else wrote. I am trying to make sense of the following code snippet. I know

"someone else" didn't write Python either, looking at that mishmash
<G>
Thanks, Marc and Dennis.

Actually as I think about it, this operation should be able to be done in
one loop, not the ten or so that it is currently taking.
Read in a file "*.log" (excluding certain named files "1.log"), total up x
number of rows of the 2nd and third columns, push (filename, total col2)
(filename, total col 3) to two lists, sort -r and generate one web page
each with the top ten.
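The one-pass idea described above could be sketched roughly as follows in Python 3. This is not the original script: the function name `top_ten`, its parameters, and the assumption that each row holds whitespace-separated integers are all made up for illustration, and generating the web pages is left out.

```python
import glob
import os


def top_ten(log_dir, excluded, max_rows):
    """Return two top-ten lists of (total, path), one per column."""
    totals_col2 = []
    totals_col3 = []
    for path in glob.glob(os.path.join(log_dir, "*.log")):
        name = os.path.splitext(os.path.basename(path))[0]
        if name in excluded:
            continue  # the equivalent of the RSMList check
        sum2 = sum3 = 0
        with open(path) as handle:
            for row_number, line in enumerate(handle):
                if row_number >= max_rows:
                    break  # only the first max_rows lines are read
                fields = line.split()
                if len(fields) < 3:
                    continue  # skip blank or short rows
                sum2 += int(fields[1])  # 2nd column
                sum3 += int(fields[2])  # 3rd column
        totals_col2.append((sum2, path))
        totals_col3.append((sum3, path))
    # Tuples sort by their first item, so this is "sort -r" on the totals.
    totals_col2.sort(reverse=True)
    totals_col3.sort(reverse=True)
    return totals_col2[:10], totals_col3[:10]
```

A call such as `top_ten("/some/log/dir", {"172.16.0.1_1"}, 100)` would return the two sorted lists; the directory and row limit here are hypothetical.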

That is what this script is supposed to be doing.
>line 7 would be best coded with regex. I first would like to understand
what was coded originally. thelistOut looks like a hash to me (I'm more
familiar with perl). Perhaps someone could translate from perl to python
for me - not in code but just in concept.
Here is the code. This script is reading the list thelistOut and then
removing any items in RSMlist and taking the remainder and putting them in
graphAddressOut with the formatting.

This is a SAMPLE of what is in the lists referenced below in the loop:
thelistOut = [(632,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']), (145,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']), (0,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log'])]

This is a list containing three elements. Each element is a tuple
containing two sub-elements. The first sub-element appears to be an
integer (I have no idea of the significance of the value at this time).
The second sub-element is another list containing a single
sub-sub-element -- that sub-sub-element is a string (file path name).
>RSMList = ['172.16.0.1_1', '172.16.0.1_2', '172.16.0.1_3', '172.16.0.1_4',
'172.16.0.1_5']

#--------------------------Loop 1 -------------------------

w = 0
while w < 45:
for w in xrange(45):
> fileOut = string.split(thelistOut[w][1][0],".log")
fileOutSplitedCommon = string.split(fileOut[0], "main/")
fileOut2D = string.split(fileOutSplitedCommon[1], "/")
fileOut = string.split(fileOut[0],"data-dist")
Direct use of the string module is now frowned upon.

For future reference, why is direct use of the string module frowned upon,
and what does one use instead ?

>Also, since these are file path names, using operations in the os.path
module would
>be more appropriate...

I'll look into os.path, but what this loop should be doing is matching and
removing the entries from thelistOut ( and thelistIn )
that are listed in RSMlist. Or as is being done, not writing them to the
new list, outputOut (graphAddressOut).
It's just a matching operation, not really a path/filename op. This is why I
will be changing this to a regex.

> if fileOut2D[1] in RSMList:
w = w + 1
continue

Confusing logic, having two places where "w" is incremented. Using a
"for" loop would mean neither increment statement is needed. Actually,
"w" isn't even needed, replace the while/for with

for fid in thelistOut:
    fileOut = fid[1][0]  # that [1] is getting the second element of the
                         # tuple, and the [0] gets the string out of that
                         # list (why a list of one element string data?)
> graphAddressOut = tag1 + logUrl + fileOut[1] + extention1 + tag2 +
"<b>SWITCH: " + string.swapcase(fileOut2D[0]) + "&nbsp;&nbsp;&
nbsp;PORT ID: " + fileOut2D[1] + "</b><br>" + imgTitleTag + imgTag1 +
logUrl + fileOut[1] + extention2 + imgTag2 + tag3 + tag5

This could be cleaned up too, but I'll ignore it at the moment.
> outputOut.append(graphAddressOut)
strOut = strOut + graphAddressOut

w = w + 1

#--------------------------Loop 1 -------------------------


I think what you call a "hash" in Perl is a dictionary in Python:

dct = { key1 : value1, ... , keyn : valuen }

aval = dct[keyx]

Nothing of the sort used in the code you show above.


--

Karyn Williams
Network Services Manager
California Institute of the Arts
ka***@calarts.edu
http://www.calarts.edu/network
Jan 12 '07 #3
Karyn Williams wrote:
>
For future reference, why is direct use of the string module frowned upon,
and what does one use instead ?
Karyn,

http://docs.python.org/lib/node42.html
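In practice the replacement is to call the same operations as methods on the string itself. A small sketch using one sample path from the post; the method forms work in Python 2 and Python 3, whereas `string.split()` and `string.swapcase()` no longer exist as module functions in Python 3.

```python
path = '/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log'

# Module style, as in the script:  string.split(path, ".log")
# Method style:
file_out = path.split(".log")
assert file_out[0] == '/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9'

# string.split(file_out[0], "main/") becomes:
common = file_out[0].split("main/")
assert common[1] == 'test/172.16.0.23_9'

# string.swapcase("test") becomes:
assert "test".swapcase() == "TEST"
```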

rd

Jan 12 '07 #4
Karyn Williams a écrit :
I am new to Pyton. I am trying to modify and understand a script someone
else wrote. I am trying to make sense of the following code snippet. I know
line 7 would be best coded with regex. I first would like to understand
what was coded originally. thelistOut looks like a hash to me (I'm more
familiar with perl).
It's not a hash (the Python type for hashtables is 'dict'), it's a list
of 2-tuples. FWIW, the dict type can accept such a list as an argument
to its constructor - but then you lose the ordering.
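A short sketch of that conversion, using the sample data from the post and run under Python 3. One thing has changed since this thread was written: from Python 3.7 on, a dict keeps insertion order, so the ordering caveat applies to the Python of 2007 rather than to current versions. Duplicate keys still collapse, which a list of tuples does not do.

```python
thelistOut = [
    (632, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']),
    (145, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']),
    (0, ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log']),
]

# Each 2-tuple becomes one key -> value entry.
by_count = dict(thelistOut)
assert by_count[145] == ['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']

# Duplicate keys collapse: the later pair overwrites the earlier one.
collapsed = dict([(0, 'first'), (0, 'second')])
assert collapsed == {0: 'second'}
```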

Also, the data structure is somewhat weird, since the second item of
each tuple is always a one-element list.
Perhaps someone could translate from perl to python
Do you mean "from Python to Perl" ?
for me - not in code but just in concept.
Here is the code. This script is reading the list thelistOut and then
removing any items in RSMlist and taking the remainder and putting them in
graphAddressOut with the formatting.

This is a SAMPLE of what is in the lists referenced below in the loop:
thelistOut = [(632,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_9.log']), (145,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_13.log']), (0,
['/usr/local/www/data-dist/mrtg/main/test/172.16.0.23_5.log'])]

RSMList = ['172.16.0.1_1', '172.16.0.1_2', '172.16.0.1_3', '172.16.0.1_4',
'172.16.0.1_5']

#--------------------------Loop 1 -------------------------

w = 0
while w < 45:

fileOut = string.split(thelistOut[w][1][0],".log")
fileOutSplitedCommon = string.split(fileOut[0], "main/")
fileOut2D = string.split(fileOutSplitedCommon[1], "/")
fileOut = string.split(fileOut[0],"data-dist")

if fileOut2D[1] in RSMList:
w = w + 1
continue
graphAddressOut = tag1 + logUrl + fileOut[1] + extention1 + tag2 +
"<b>SWITCH: " + string.swapcase(fileOut2D[0]) + "&nbsp;&nbsp;&
nbsp;PORT ID: " + fileOut2D[1] + "</b><br>" + imgTitleTag + imgTag1 +
logUrl + fileOut[1] + extention2 + imgTag2 + tag3 + tag5
outputOut.append(graphAddressOut)
strOut = strOut + graphAddressOut

w = w + 1

#--------------------------Loop 1 -------------------------
Yuck. This code stinks. Whoever wrote this ought to be shot down. I
refuse to try to clean up this mess unless I get paid (and well paid).

Jan 12 '07 #5

Karyn Williams wrote:
I am new to Pyton. I am trying to modify and understand a script someone
else wrote. I am trying to make sense of the following code snippet. I know
line 7 would be best coded with regex. I first would like to understand
what was coded originally. thelistOut looks like a hash to me (I'm more
familiar with perl).
thelistOut seems to be a list of tuples. It also seems that one of the
tuple elements is a list containing a single string. To be honest,
this is one of the ugliest examples of Python code I have ever seen.
I am not sure I would trust code written like this at all. One can very
often tell the competence of the programmer from the looks of the code.

To answer the subject: An associative container in Python is called a
'dictionary'. CPython dictionaries are implemented using hash tables
(and one of the fastest hashing algorithms known to man). There is
nothing in the Python semantics that mandates this particular
implementation of dictionaries, though. Balanced binary trees could
have been used instead of hashes, as they usually are in STL's
associative containers, but in CPython a dictionary is implemented with
a hash table under the hood.

Dictionaries work like this:

mydict = { key1 : val1, key2 : val2, key3 : val3 }
oldval3 = mydict[key3]
mydict[key3] = newval3
mydict[key4] = val4
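For readers coming from Perl, here is a runnable Python 3 version of the same operations with the rough Perl equivalent in the comments. Pairing the port ids with the integers from `thelistOut` is only an illustration; the thread does not say the data is meant to be used this way.

```python
# Perl: my %ports = ("172.16.0.23_9" => 632, "172.16.0.23_13" => 145);
ports = {"172.16.0.23_9": 632, "172.16.0.23_13": 145}

old = ports["172.16.0.23_9"]    # Perl: $ports{"172.16.0.23_9"}
ports["172.16.0.23_9"] = 700    # replace an existing value
ports["172.16.0.23_5"] = 0      # add a new key

# Perl: exists $ports{"172.16.0.1_1"}
print("172.16.0.1_1" in ports)

# Perl: while (my ($name, $total) = each %ports) { ... }
for name, total in ports.items():
    print(name, total)

# A missing key raises KeyError; get() returns a default instead.
missing = ports.get("no-such-port", None)
```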

Jan 13 '07 #6
