quick regex question

Xx r3negade
Hi, I am very bad with regexes.
I need a regular expression that will reduce a url like this:


to its base:


The two slashes after the http: complicate things. I know I could just remove the "http://" and add it again later, but as a learning experience, I would like to do this regex in a single line.

Thanks in advance
Apr 1 '09 #1

import re
p = re.compile(r'http://.*?\.(com|co\.uk)')

You can adjust the alternation at the end to cover as many domain endings as you need.
Apr 1 '09 #2
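A minimal sketch of how the compiled pattern above might be applied. The helper name `base_url` and the example.com URLs are placeholders for illustration, not part of the original post.

```python
import re

# Pattern from the post above: lazily match up to the first ".com" or ".co.uk".
p = re.compile(r'http://.*?\.(com|co\.uk)')

def base_url(url):
    """Return the part of url matched by the pattern, or None if there is no match."""
    m = p.match(url)
    # group(0) is the whole match; group(1) would be only "com" or "co.uk".
    return m.group(0) if m else None

print(base_url("http://www.example.com/somefolder/page.html"))
```

One caveat with this approach: the lazy match stops at the first ".com" it finds, even inside a longer host name, so "http://www.company.com/x" is cut down to "http://www.com".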
The right regex depends on the range of URLs you need to handle. Here is one possible solution:
import re

# Capture scheme, slashes, and a two-part host name, up to the next slash.
patt = re.compile(r'([a-z]+?:/+?\w+?\.\w+?)/')

m = patt.match("hxxp://example.com/somefolder/something/whatever")
if m:
    # group(1) is the captured base, without the trailing slash.
    print(m.group(1))
Apr 1 '09 #3
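Since the question asks for a single-line regex that copes with the two slashes, one possible sketch is to match the "://" literally and then take every character up to the next slash. The pattern and the function name `url_base` are assumptions for illustration, not something posted in this thread. Unlike the pattern in the reply above, it does not need a trailing slash or a host name with exactly one dot.

```python
import re

def url_base(url):
    """Return scheme plus host (e.g. "http://example.com"), or None if url has no scheme."""
    # A scheme name, the literal "://", then one or more characters that are not "/".
    m = re.match(r'[a-zA-Z][a-zA-Z0-9+.-]*://[^/]+', url)
    return m.group(0) if m else None

print(url_base("http://example.com/somefolder/something/whatever"))
```

The negated character class `[^/]+` does the work here: it cannot run past the first slash after the host, so no list of domain endings is needed.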
