Question regarding lists and regex

Here is a simple program, which queries /var/log/daemon on my OpenBSD box and
gets the list of valid ntp peers.

Questions:
What is the easiest way for me to create lists on the fly? By that I mean
something like Perl's

push my @foo, something_from_say_stderr. As you can see, there is an ip = [""]
statement before the for loop. I want to avoid that and create the list at the
point where I extract the IP address. Am I being confusing?

Regex: I presume this is a rather dumb question, but here it comes anyway. As you
can see from my program, pattIp = r\d{1,3}\.... etc. Is there an easier way
to group the repetitions, instead of typing the same regex 4 times?

TIA
Prabhu
-

amazon: [~/working/programs/python/regex]
ttyp4: [109]$ cat syslog.py
#!/usr/bin/env python
# $Id: syslog.py,v 1.6 2006/11/09 06:24:03 pgurumur Exp $

import getopt, re, os, string, sys, time
(dirname, program) = os.path.split(sys.argv[0])
argc = len(sys.argv)

def usage():
    print program + ": options"
    print "options: "
    print " --filename | -f [ name of the file ]"
    print " --help | -h [ prints this help ]"
    sys.exit(1)

if __name__ == "__main__":
    if (argc <= 1):
        usage()
    else:
        try:
            opts, args = getopt.getopt(sys.argv[1:], "f:h", ["help", "filename="])
        except getopt.GetoptError:
            usage()
        else:
            filename = ""
            for optind, optarg in opts:
                if optind in ("-f", "--filename"):
                    filename = optarg
                elif optind in ("-h", "--help"):
                    usage()

            if len(filename):
                fh = 0
                try:
                    fh = open(filename, "r")
                except IOError, (error, message):
                    print program + ": cannot open " + filename + ": " + message
                    sys.exit(1)

                pattNtp = r'.*ntpd(?=.*now\s+valid)'
                count = 0
                ip = [""]
                pid = 0
                for line in fh.readlines():
                    if re.match(pattNtp, line.strip(), re.IGNORECASE):
                        string = line.strip()
                        pattPid = r'\[\d{1,5}\]'
                        pidMatch = re.search(pattPid, string, re.IGNORECASE)
                        if pidMatch is not None:
                            pid = int(re.sub(r'\[|\]', "", pidMatch.group()))

                        pattIp = r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'
                        match = re.search(pattIp, string, re.IGNORECASE)
                        if match is not None:
                            ip.append(match.group())
                            count += 1

                print "NTP program started with pid:", pid
                print "Number of valid peers:", count
                for x in ip:
                    if len(x):
                        print x

                fh.close()
Nov 9 '06 #1
"Prabhu Gurumurthy" <pg******@gmail.com> wrote in message
news:ma***************************************@python.org...
> Here is a simple program, which queries /var/log/daemon on my OpenBSD box
> and gets the list of valid ntp peers.
>
> Questions:
> What is the easiest way for me to create lists on the fly? By that I mean
> something like Perl's
>
> push my @foo, something_from_say_stderr. As you can see, there is an ip =
> [""] statement before the for loop. I want to avoid that and create the list
> at the point where I extract the IP address. Am I being confusing?

Typically, one initializes a list to be empty, that is [], not [""]. Python
will not read your mind at append time and think "oh! we're appending to a
list and we forgot to create one in the first place, let's make one now." I
guess Perl allows this, but the clarity of including the initialization
statement overrules the convenience of leaving it out.
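
As a rough sketch of the empty-list initialization described above: the code below is written for Python 3 (the program in the question uses Python 2 print statements), and the function name collect_ips and the sample log lines are invented for illustration only.

```python
import re

IP_PATTERN = re.compile(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')

def collect_ips(lines):
    """Return the first IPv4-looking string found on each matching line."""
    ips = []  # start empty, not [""], so no placeholder has to be skipped later
    for line in lines:
        match = IP_PATTERN.search(line)
        if match is not None:
            ips.append(match.group())
    return ips

# Hypothetical log lines, made up for this example.
sample = [
    "ntpd[1234]: peer 192.0.2.10 now valid",
    "ntpd[1234]: no address on this line",
    "ntpd[1234]: peer 198.51.100.7 now valid",
]
print(collect_ips(sample))  # ['192.0.2.10', '198.51.100.7']
```

The list still has to exist before the first append; the only change from the original program is that it starts out truly empty.
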

> Regex: I presume this is a rather dumb question, but here it comes anyway.
> As you can see from my program, pattIp = r\d{1,3}\.... etc. Is there an
> easier way to group the repetitions, instead of typing the same regex 4
> times?

Here's one way, tested at the Python command line:
>>> print r'\.'.join( [r'\d{1,3}']*4 )
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}

This avoids the pattern duplication, but I think using join is much less
easily recognized as a pattern for an IP address.
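
For anyone who wants to try the join approach in a script rather than at the prompt, here is a small sketch in Python 3 syntax. The variable names and the sample line are made up for the example.

```python
import re

octet = r'\d{1,3}'
# Join four copies of the octet pattern with an escaped dot between them.
ip_pattern = r'\.'.join([octet] * 4)

# The generated string is identical to the hand-typed pattern from the question.
same_as_original = (ip_pattern == r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}')
print(same_as_original)  # True

match = re.search(ip_pattern, "peer 192.0.2.10 now valid")  # invented sample line
print(match.group())  # 192.0.2.10
```
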

> TIA
> Prabhu

Some other comments/free advice:
1. I was curious about this line:
pid = int(re.sub(r'\[|\]', "", pidMatch.group()))
You already know pidMatch.group() is going to start with a '[', followed by
an integer string, and end with a ']', otherwise it wouldn't have matched
pattPid. Instead of whacking this with another re-type call, how about just
some simple string slicing (keeping the int() conversion from the original):
pid = int(pidMatch.group()[1:-1])
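
A short sketch of the slicing idea in Python 3 syntax, next to one possible alternative that uses a capturing group so no post-processing is needed. The function names and the sample line are invented for illustration; the group-based variant is not from the thread.

```python
import re

def pid_by_slicing(line):
    """Match '[digits]' and strip the brackets with a slice."""
    m = re.search(r'\[\d{1,5}\]', line)
    if m is None:
        return None
    return int(m.group()[1:-1])

def pid_by_group(line):
    """Same result, letting a capturing group pick out the digits."""
    m = re.search(r'\[(\d{1,5})\]', line)
    if m is None:
        return None
    return int(m.group(1))

# Hypothetical syslog line, made up for this example.
sample = "Nov  9 06:24:03 host ntpd[12345]: peer 192.0.2.10 now valid"
print(pid_by_slicing(sample), pid_by_group(sample))  # 12345 12345
```
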

2. No real need to keep count of the found IPs; just use len(ip) to tell
you how many entries there are in the list (especially once you convert to
initializing with an empty list).

3. Similarly, you'll be able to remove the 'if len(x)' test when printing
out the contents of the ip list if you init with [] instead of [""]. Also,
the usual Python idiom for testing that x is a non-empty string is just 'if
x', not 'if len(x)'.
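
Points 2 and 3 might look like this in practice. This is only a sketch in Python 3 syntax; the list contents and the helper name describe are invented for the example.

```python
ips = ["192.0.2.10", "198.51.100.7"]  # invented sample, built from an empty list

# len() replaces the separate counter.
print("Number of valid peers:", len(ips))

# No 'if len(x)' guard is needed once the list never holds a "" placeholder.
for x in ips:
    print(x)

def describe(x):
    """Show the truthiness idiom: an empty string is false, any other string is true."""
    return "non-empty" if x else "empty"

print(describe(""), describe("192.0.2.10"))  # empty non-empty
```
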

-- Paul
Nov 9 '06 #2
Ant


On Nov 9, 6:29 am, Prabhu Gurumurthy <pguru...@gmail.com> wrote:
> ....
> regex: I presume this is rather a dumb question, anyways here it comes! as you
> can see from my program, pattIp = r\d{1,3}\.... etc, is there any other easy way
> to group the repetitions, instead of typing the same regex 4 times.
> ....
> pattIp = r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'

pattIp = r"\d{1,3}(\.\d{1,3}){3}"

Is the best you can get using pure regexes (rather than something like
Paul's solution).
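
One thing worth knowing about the quantified-group form, sketched here in Python 3 syntax: the parentheses create a capturing group, which changes what re.findall returns. The non-capturing variant (?:...) is an addition for illustration, not something proposed in the thread, and the sample line is made up.

```python
import re

line = "peer 192.0.2.10 now valid"  # invented sample line

capturing = r"\d{1,3}(\.\d{1,3}){3}"
non_capturing = r"\d{1,3}(?:\.\d{1,3}){3}"

# search().group() returns the whole match either way.
print(re.search(capturing, line).group())  # 192.0.2.10

# findall returns the group's last repetition when the pattern has one group.
print(re.findall(capturing, line))         # ['.10']
print(re.findall(non_capturing, line))     # ['192.0.2.10']
```

Like the original pattern, neither form checks that each octet is in the range 0 to 255; both only match the digits-and-dots shape.
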

Nov 9 '06 #3
