Bytes IT Community

Question on re.IGNORECASE

Hi,

I'm having some problems with basic RE in python. I was wondering whether
somebody could provide a hint on what's going wrong with the following
script. Comments are included.

TIA.
-myself
python2.3

Python 2.3.4 (#1, Nov 18 2004, 13:39:30)
[GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-39)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> pattern = re.compile('.*HTTP/(\d\.\d) *(\d*) *(.*)$')
>>> pattern.search("GGHTTP/1.1 200 OK\r\n", re.IGNORECASE)
<_sre.SRE_Match object at 0xb75c6ed0>
>>> pattern.search("GHTTP/1.1 200 OK\r\n", re.IGNORECASE)
>>> # this makes no sense to me. Why is the previous line matched
>>> # and this not?
>>> pattern.search("GHTTP/1.1 200 OK\r\n")
<_sre.SRE_Match object at 0xb758d020>

Oct 20 '05 #1
3 Replies


ch****@gmail.com wrote:
Hi,

I'm having some problems with basic RE in python. I was wondering whether
somebody could provide a hint on what's going wrong with the following
script. Comments are included.

TIA.
-myself

python2.3
Python 2.3.4 (#1, Nov 18 2004, 13:39:30)
[GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-39)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> pattern = re.compile('.*HTTP/(\d\.\d) *(\d*) *(.*)$')
>>> pattern.search("GGHTTP/1.1 200 OK\r\n", re.IGNORECASE)
<_sre.SRE_Match object at 0xb75c6ed0>
>>> pattern.search("GHTTP/1.1 200 OK\r\n", re.IGNORECASE)
>>> # this makes no sense to me. Why is the previous line matched


Because the second argument to pattern.search() is the position where the search starts, not a flag. re.IGNORECASE == 2 so your search is skipping the first two chars.

Try giving the flags as a second argument to re.compile():
>>> import re
>>> re.IGNORECASE
2
>>> pattern = re.compile('.*HTTP/(\d\.\d) *(\d*) *(.*)$', re.IGNORECASE)
>>> pattern.search("GGHTTP/1.1 200 OK\r\n")
<_sre.SRE_Match object at 0x00965980>
>>> pattern.search("GHTTP/1.1 200 OK\r\n")
<_sre.SRE_Match object at 0x009659D0>
>>> pattern.search("Ghttp/1.1 200 OK\r\n")
<_sre.SRE_Match object at 0x009651B0>
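
To make the "skipping the first two chars" point concrete, here is a minimal sketch that passes the start position explicitly. It is written for a current Python 3 rather than the 2.3.4 session in this thread, the variable names are only illustrative, and it assumes re.IGNORECASE still has the integer value 2.

```python
import re

# Same pattern as in the thread, compiled WITHOUT any flags.
pattern = re.compile(r'.*HTTP/(\d\.\d) *(\d*) *(.*)$')

# Assumption: re.IGNORECASE is numerically 2, as reported in the thread.
pos = int(re.IGNORECASE)

two_g = "GGHTTP/1.1 200 OK\r\n"
one_g = "GHTTP/1.1 200 OK\r\n"

# What is left of each string once the search starts at index `pos`.
seen_two_g = two_g[pos:]  # begins with "HTTP/", so the pattern can match
seen_one_g = one_g[pos:]  # begins with "TTP/", so "HTTP/" is no longer there

# Passing the flag as the second argument behaves like passing pos=2.
match_two_g = pattern.search(two_g, pos)
match_one_g = pattern.search(one_g, pos)

print(repr(seen_two_g), match_two_g is not None)
print(repr(seen_one_g), match_one_g is not None)
```

With one leading "G" the "H" of "HTTP/" falls before index 2, which is why only that call fails; no case folding is involved in either call.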

Kent

>>> # and this not?
>>> pattern.search("GHTTP/1.1 200 OK\r\n")
<_sre.SRE_Match object at 0xb758d020>

Oct 20 '05 #2

OK, I got it.

- The re module search function syntax is:

search( pattern, string[, flags])

where re.IGNORECASE is a valid flag.

- The RE Object search method syntax is:

search( string[, pos[, endpos]])

where "The optional second parameter pos gives an index in the string
where the search is to start; it defaults to 0"

It turns out that re.IGNORECASE has the value 2.
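
For anyone comparing the two signatures side by side, here is a small sketch of both call styles. It assumes a current Python 3 (not the 2.3 documentation linked in this post), and the names text, regex, and compiled are made up for the example.

```python
import re

text = "Ghttp/1.1 200 OK\r\n"
regex = r'.*HTTP/(\d\.\d) *(\d*) *(.*)$'

# Module-level function: search(pattern, string[, flags])
# Here the third argument really is the flags value.
m_module = re.search(regex, text, re.IGNORECASE)

# Compiled pattern object: flags go to re.compile(),
# and search(string[, pos[, endpos]]) only takes positions.
compiled = re.compile(regex, re.IGNORECASE)
m_compiled = compiled.search(text)

# Spelling pos as a keyword makes it harder to pass a flag by accident.
m_keyword = compiled.search(text, pos=0)

# Without the flag, the lower-case "http/" does not match.
m_strict = re.compile(regex).search(text)

print(m_module.group(1), m_compiled.group(2), m_strict)
```

Passing pos by keyword is just one possible habit for avoiding the mix-up; keeping the flags in the re.compile() call is what actually changes the matching.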

Am I the only person bitten by this?

http://docs.python.org/lib/node115.html
http://docs.python.org/lib/re-objects.html

-myself

Oct 20 '05 #3

Thanks for your help.

-myself

Oct 20 '05 #4
