Bytes IT Community

quick regex question

Xx r3negade
P: 39
Hi, I am very bad with regexes.
I need a regular expression that will reduce a URL like this:

hxxp://example.com/somefolder/something/whatever

to its base:

hxxp://example.com

The two slashes after the http: complicate things. I know I could just remove the "http://" and add it again later, but as a learning experience, I would like to do this regex in a single line.

Thanks in advance
Apr 1 '09 #1
2 Replies


P: 2
erm..

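import re
p = re.compile(r'http://.*?\.(com|co\.uk)')
m = p.match('http://example.com/somefolder/something/whatever')
print(m.group(0))  # group(0) is the whole match: http://example.com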

You can obviously extend the alternation at the end to cover as many different top-level domains as you can think of.
Apr 1 '09 #2
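
Editor's sketch, not part of the original reply: one way to "adjust the end" without hand-editing the pattern is to build the alternation from a list. The names SUFFIXES, build_base_pattern and base_of are made up for illustration. Two things differ from the pattern above and are assumptions on my part: the host part uses [^/]*? instead of .*? so the match cannot run past the first slash, and a lookahead makes sure the suffix really ends the host (otherwise "example.company.org" would be cut at "example.com").

```python
import re

# Hypothetical suffix list; extend it with whatever you need to accept.
SUFFIXES = ["com", "co.uk", "org", "net"]


def build_base_pattern(suffixes):
    """Compile a pattern matching scheme + host ending in a listed suffix."""
    # re.escape makes the dot in "co.uk" literal.
    alternation = "|".join(re.escape(s) for s in suffixes)
    # [^/]*? keeps the host match from crossing into the path.
    # (?=[/:?#]|$) requires the suffix to be followed by a delimiter or
    # the end of the string, so ".com" inside ".company" is rejected.
    return re.compile(r"https?://[^/]*?\.(?:" + alternation + r")(?=[/:?#]|$)")


BASE_PATTERN = build_base_pattern(SUFFIXES)


def base_of(url, pattern=BASE_PATTERN):
    """Return the base of url, or None if it does not match."""
    m = pattern.match(url)
    return m.group(0) if m else None


if __name__ == "__main__":
    print(base_of("http://example.com/somefolder/something/whatever"))
    print(base_of("http://example.co.uk/a/b"))
```

Any URL whose suffix is not in the list returns None, which is the main cost of this approach: the list has to be maintained by hand.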

bvdet
Expert Mod 2.5K+
P: 2,851
The right regex depends on which forms of URL you need to handle. Here is one possible solution:
import re

# Capture scheme, slashes, and a "name.tld" host; the trailing "/" is required.
patt = re.compile(r'([a-z]+?:/+?\w+?\.\w+?)/')

m = patt.match("hxxp://example.com/somefolder/something/whatever")
if m:
    print(m.group(1))
Apr 1 '09 #3
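
Editor's sketch, not from either reply: the two slashes stop being a problem once the pattern consumes "://" literally and then takes everything up to the next slash. The version below assumes you also want to accept subdomains, ports, and URLs with no trailing slash, which the pattern above does not match. BASE_RE, base_url and strip_path are illustrative names; the scheme character class follows the general URI syntax and is looser than strictly necessary for http.

```python
import re

# Scheme, then a literal "://", then everything up to the next "/", "?" or "#".
# Because "://" is matched explicitly, the first "/" that ends [^/?#]+ is the
# one that starts the path, not one of the two slashes after the scheme.
BASE_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*://[^/?#]+")


def base_url(url):
    """Return scheme://host[:port] for url, or None if it does not match."""
    m = BASE_RE.match(url)
    return m.group(0) if m else None


def strip_path(url):
    """Single-substitution variant: keep the base, drop whatever follows it."""
    return re.sub(r"^([a-zA-Z][a-zA-Z0-9+.-]*://[^/?#]+).*$", r"\1", url)


if __name__ == "__main__":
    print(base_url("hxxp://example.com/somefolder/something/whatever"))
    print(strip_path("hxxp://example.com/somefolder/something/whatever"))
```

Note that strip_path returns its input unchanged when the pattern does not match, while base_url returns None; pick whichever failure behaviour suits the caller. Outside of a regex exercise, the standard library's urllib.parse.urlsplit does the same job without a hand-written pattern.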