Can anyone explain how to use a backreference (\1) in a regex?
Test String: <font size=10><span>label<span></font>
Pattern: <font[^>]*>(.*?)\s*\w+\s*(\1)</font>
This works.
But if I change the latter <span> to </span> and try to write the pattern with (/1), I can't get it to match.
Please suggest a solution.
Thank you
When part of a pattern is enclosed in parentheses, the substring it matches is saved and can be accessed with the group(group number) method of match objects. The '\1' in your expression matches the same text that was matched by group number 1. Since the '\1' is itself in parentheses, that text is saved again as a separate group. This may give you some ideas:
import re

p = re.compile(r'<font[^>]*><?(.*?)>?\s*(\w+)\s*<?/?(\1)>?</font>')

s = '<font size=10><span>label</span></font>'

m = p.match(s)
print(m.group(0))
print(m.groups())

>>> <font size=10><span>label</span></font>
('span', 'label', 'span')
>>> m.group(1)
'span'
>>> m.group(2)
'label'
>>> m.group(3)
'span'
>>>
Same, but add another group to the expression:
p1 = re.compile(r'<font\s*([^>]*)><?(.*?)>?\s*(\w+)\s*<?/?(\2)>?</font>')
s1 = '<font size=10><span>label</span></font>'
m1 = p1.match(s1)
print(m1.group(0))
print(m1.groups())

'''
>>> <font size=10><span>label</span></font>
('size=10', 'span', 'label', 'span')
>>>
'''

s2 = '<font><span>label</span></font>'
m2 = p1.match(s2)
print(m2.group(0))
print(m2.groups())

'''
>>> <font><span>label</span></font>
('', 'span', 'label', 'span')
>>>
'''
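If the goal is only to make sure the closing tag has the same name as the opening tag, a stricter variant may be easier to read: capture just the tag name and put the backreference inside the closing tag as </\1>. This is a minimal sketch, not the pattern from the posts above. It assumes the inner tag has no attributes and the label is a single word, and the variable names are made up for illustration.

```python
import re

# Capture the tag name once, then require the same name in the closing tag.
# Assumption: the inner tag has no attributes and the label is a single word.
pattern = re.compile(r'<font[^>]*><(\w+)>\s*(\w+)\s*</\1></font>')

matched = pattern.match('<font size=10><span>label</span></font>')
print(matched.groups())   # ('span', 'label')

# A different closing tag does not satisfy \1, so there is no match.
mismatched = pattern.match('<font size=10><span>label</b></font>')
print(mismatched)         # None

# Same idea with named groups; (?P=tag) is the named form of a backreference.
named = re.compile(
    r'<font[^>]*><(?P<tag>\w+)>\s*(?P<label>\w+)\s*</(?P=tag)></font>'
)
named_match = named.match('<font><span>label</span></font>')
print(named_match.group('tag'), named_match.group('label'))  # span label
```

Because the angle brackets and the slash are literal here rather than optional, this version rejects input such as <span>label<span>, which the looser patterns above would accept.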