Bytes | Software Development & Data Engineering Community

String Regex problem

Hello,

I have a string which has a url (Begins with a http://) somewhere in
it. I want to detect such a url and just spit out the url. Since I
am very poor in regex, can someone show me how to do it using a few
examples?

Thanks a lot!
Jul 18 '05 #1
djw
I would look here to improve your regex skills:

http://www.amk.ca/python/howto/regex/

Also, I find Kodos to be invaluable in developing and debugging regexes.
Highly recommended.

http://kodos.sourceforge.net

Of course, you could just use urlparse in the standard library...

Good luck,

Don

Jul 18 '05 #2
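
A minimal sketch of the urlparse suggestion above, for anyone who wants to see it in action. It assumes Python 3, where the old urlparse module lives in urllib.parse, and it assumes the URL is separated from the surrounding text by whitespace. The function name extract_http_urls and the sample string are made up for illustration. Note that urlparse only takes apart a single candidate string; it does not search text on its own, so the splitting is done by hand here.

```python
from urllib.parse import urlparse  # Python 3 location of the old `urlparse` module


def extract_http_urls(text):
    """Return whitespace-separated tokens that parse as http:// URLs with a host.

    Hypothetical helper for illustration; not from the original thread.
    """
    urls = []
    for token in text.split():
        try:
            parts = urlparse(token)
        except ValueError:
            # urlparse can reject malformed input (e.g. a broken IPv6 host)
            continue
        # Keep only tokens with the http scheme and a non-empty host part
        if parts.scheme == "http" and parts.netloc:
            urls.append(token)
    return urls


sample = "My page is at http://example.com/index.html if you want a look"
print(extract_http_urls(sample))
```

Trailing punctuation stuck to a URL (a full stop at the end of a sentence, say) is kept as part of the token, so this is only a starting point.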
Since I am very poor in regex, can someone show me how to do it using
a few examples?


Don> http://www.amk.ca/python/howto/regex/
...
Don> http://kodos.sourceforge.net

If you're a Mac Python person there's also Dinu Gherman's excellent
RegexPlor:

http://starship.python.net/crew/gherman/RegexPlor.html

Even if you're not, it's worth popping over there to watch the MPEG clip of
RegexPlor in action.

Skip

Jul 18 '05 #3
Skip Montanaro wrote on Mon, 24 Nov 2003 21:35:48 -0600:
>> Since I am very poor in regex, can someone show me how to do it using
>> a few examples?

<snip> Don> http://kodos.sourceforge.net

If you're a Mac Python person there's also Dinu Gherman's excellent
RegexPlor:

http://starship.python.net/crew/gherman/RegexPlor.html

<snip>

I'm biased here, but Kiki (http://project5.freezope.org/kiki) is
cross-platform and depends on wxPython rather than Qt, which is much
easier for Windows users.

Anyway, here's a regex I ripped out of my own code - you might want to
simplify it though:

"""Regex for finding URLs:
URLs start with http(s)/ftp/news ((http)|(ftp)|(news))
followed by ://
then any number of non-whitespace characters including
numbers, dots, forward slashes, commas, question marks,
ampersands, equals signs, dashes, underscores and pluses,
but ending in a non-dot and non-plus!

Result:

(?:http|https|ftp|news)://(?:[@a-zA-Z0-9,/%:\&+#\?=\-_~;]+\.*)+[a-zA-Z0-9,/%:\&#\?=\-_]

Tests:
Plain old link: http://www.mail.yahoo.com.
Containing numbers: ftp://bla.com/di~ng/co.rt,39,%93 or other
Go to news://bl_a.com/?ha-h+a&query=tb for more info.
A real link: <a href="http://x.com">http://x.com</a>.
ftp://verylong.org/url/must/be/chopp...itwontfit.html
(long one)
<IMG src="http://b.com/image.gif" /> (a plain image tag)
<a href=http://fixedlink.com/orginialinvalid.html>fixed</a> (original
invalid HTML)
Link containing an anchor
<b>"http://myhomepage.com/index.html#01"</b>.
"""

--
Yours,

Andrei

=====
Mail address in header catches spam. Real contact info (decode with rot13):
ce******@jnanqbb.ay. Fcnz-serr! Cyrnfr qb abg hfr va choyvp cbfgf. V ernq
gur yvfg, fb gurer'f ab arrq gb PP.
Jul 18 '05 #4
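
For reference, here is one way the regex from the post above might be tried out with Python's re module. The pattern is copied verbatim from the post; the name URL_PATTERN and the choice of re.findall are assumptions made for this sketch, and only three of the posted test lines are used.

```python
import re

# Pattern copied verbatim from the post above, split over three lines.
URL_PATTERN = re.compile(
    r"(?:http|https|ftp|news)://"
    r"(?:[@a-zA-Z0-9,/%:\&+#\?=\-_~;]+\.*)+"
    r"[a-zA-Z0-9,/%:\&#\?=\-_]"
)

# A few of the test lines from the post.
tests = [
    "Plain old link: http://www.mail.yahoo.com.",
    "Go to news://bl_a.com/?ha-h+a&query=tb for more info.",
    'A real link: <a href="http://x.com">http://x.com</a>.',
]

for line in tests:
    # The pattern has no capturing groups, so findall returns whole matches.
    print(URL_PATTERN.findall(line))
```

The first line should give the URL without the sentence-ending dot, which is what the "ending in a non-dot" rule is for. Because the pattern nests one repetition inside another, it may backtrack heavily on long inputs that almost match, so it is probably best kept to short, trusted strings.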
Wow, awesome! Thanks a lot for Kodos. I hope I find it useful. I
have actually found a better solution than using regex itself.

Here's my solution and I think it works well:
[x for x in moo.split(' ') if x.startswith('http://')]
Jul 18 '05 #5
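
A small variation on the one-liner above, as a sketch. The function name find_http_urls and the sample text are made up for illustration. The only change is calling split() with no argument, which splits on any run of whitespace (tabs and newlines too), whereas split(' ') splits on single spaces only and would leave "http://example.com/a\nand" glued together in the sample below.

```python
def find_http_urls(text):
    """Return the whitespace-separated words that start with http://."""
    # split() with no argument splits on any run of whitespace
    return [word for word in text.split() if word.startswith("http://")]


moo = "see http://example.com/a\nand http://example.org now"
print(find_http_urls(moo))
```

Like the original, this keeps any punctuation attached to the URL and misses URLs embedded in HTML attributes, where the regex approach does better.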
