Bytes IT Community

Question on regex

Hello all -

I have a file containing IP addresses and subnets, and I use regexes to extract
the IPs separately from the subnets.

pattern used for IP: \d{1,3}(\.\d{1,3}){3}
pattern used for subnet: ((\d{1,3})|(\d{1,3}(\.\d{1,3}){1,3}))/(\d{1,2})

So I have a list of IPs/subnets strewn around like this:

10.200.0.34
10.200.4.5
10.178.9.45
10.200/22
10.178/16
10.100.4.64/26,
10.150.100.0/28
10/8

With the above examples:
the IP regex pattern matches all IP addresses
the subnet regex pattern matches all subnets

The problem is that the IP pattern also matches two of the subnets
(10.100.4.64/26 and 10.150.100.0/28), because their address part has four
octets and so fits the IP regex.

To fix this problem, I used a negative lookahead in the IP pattern,
so the IP pattern now changes to:
\d{1,3}(\.\d{1,3}){3}(?!/\d+)

Now 10.150.100.0 works fine (it is no longer matched as an IP), but the
10.100.4.64 subnet still gets matched by the IP pattern, with the following
result:

10.100.4.6

Is there a workaround for this, or what should change in the IP regex pattern?

Python script:
#!/usr/bin/env python

import re, sys

fh = 0
try:
    fh = open(sys.argv[1], "r")
except IOError, message:
    print "cannot open file: %s" %message
else:

    for lines in fh.readlines():
        lines = lines.strip()

        pattIp = re.compile("(\d{1,3}(\.\d{1,3}){3})(?!/\d+)")
        pattNet = re.compile("((\d{1,3})|(\d{1,3}(\.\d{1,3}){1,3}))/(\d{1,2})")

        match = pattIp.search(lines)
        if match is not None:
            print "ipmatch: %s" %match.groups()[0]

        match = pattNet.search(lines)
        if match is not None:
            print "subnet: %s" %match.groups()[0]

    fh.close()

Output with the above IPs/subnets in a file:

ipmatch: 10.200.0.34
ipmatch: 10.200.4.5
ipmatch: 10.178.9.45
subnet: 10.200
subnet: 10.178
ipmatch: 10.100.4.6
subnet: 10.100.4.64
subnet: 10.150.100.0
subnet: 10

TIA
Prabhu

Dec 23 '06 #1


Prabhu Gurumurthy wrote:
> to fix this problem, i used negative lookahead with ip pattern:
> so the ip pattern now changes to:
> \d{1,3}(\.\d{1,3}){3}(?!/\d+)
>
> now the problem is 10.150.100.0 works fine, 10.100.4.64 subnet gets
> matched with ip pattern with the following result:
>
> 10.100.4.6
>
> Is there a workaround for this or what should change in ip regex pattern.

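The lookahead only rejects the full match, so the regex engine backtracks,
shortens the last octet from 64 to 6, and then the lookahead succeeds because
the next character is a 4 rather than a slash. I think what you want is that
neither /\d+ nor another digit nor a . follows:
\d{1,3}(\.\d{1,3}){3}(?!(/\d)|\d|\.)
This way 10.0.0.1234 won't be recognized as an IP. Neither will anything like
23.12. where a dot directly follows the address, which could be a problem if
an IP is at the end of a sentence, so you might want to omit the \. alternative.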
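
To see the difference between the two lookaheads, here is a small sketch in Python 3 (the script in the question is Python 2). The two patterns are copied from the thread; the helper find_ip and the variable names ip_slash_only and ip_strict are made up for this illustration, and the last two sample strings are extra test inputs, not lines from the original file.

```python
import re

# Pattern from the question: only a following "/digits" is forbidden.
ip_slash_only = re.compile(r"\d{1,3}(\.\d{1,3}){3}(?!/\d+)")

# Pattern from the reply: a following "/digit", digit, or dot is forbidden.
ip_strict = re.compile(r"\d{1,3}(\.\d{1,3}){3}(?!(/\d)|\d|\.)")


def find_ip(pattern, text):
    """Return the first substring matched by pattern, or None (hypothetical helper)."""
    match = pattern.search(text)
    return match.group(0) if match else None


samples = [
    "10.200.0.34",      # plain IP from the question
    "10.100.4.64/26,",  # subnet that was truncated to 10.100.4.6
    "10.150.100.0/28",  # subnet that was already rejected
    "10.0.0.1234",      # extra input: last octet too long
    "10.0.0.1.",        # extra input: IP followed by a dot
]

for text in samples:
    print(text, "->", find_ip(ip_slash_only, text), "|", find_ip(ip_strict, text))
```

With the first pattern the engine can still backtrack to 10.100.4.6; with the second one every shortened candidate is followed by a digit, so the subnet line yields no IP match at all. If the \. alternative is dropped, as suggested above, an IP at the end of a sentence is matched again.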
Dec 23 '06 #2
