Bytes | Software Development & Data Engineering Community
Wildcards for regexps?

If I have an expression like "bob marley" and I want to match
everything with one letter wrong, how would I do it?
so "bob narely" and "vob marley" should match etc.
Aug 11 '08 #1
On Aug 10, 11:10 pm, ssecorp <circularf...@gmail.com> wrote:
> If I have an expression like "bob marley" and I want to match
> everything with one letter wrong, how would I do it?
> so "bob narely" and "vob marley" should match etc.
At first, I was going to suggest the brute force solution:

".ob marley|b.b marley|bo. marley|bob.marley|bob .arley|bob m.rley|bob ma.ley|bob mar.ey|bob marl.y|bob marle."

But then I realized that after matching the initial 'b', later
alternative matches wouldn't need to keep retesting for a leading 'b',
so here is a recursive re that does not go back to match previously
matched characters:

".ob marley|b(.b marley|o(. marley|b(.marley| (.arley|m(.rley|a(.ley|r(.ey|l(.y|e.))))))))"

Here are some functions to generate these monstrosities:

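base = "bob marley"

# Brute force: one alternative per position, with that character
# replaced by the wildcard '.'.
def makeOffByOneMatchRE(s):
    return "|".join(s[:i] + '.' + s[i+1:] for i in range(len(s)))

re_string = makeOffByOneMatchRE(base)
print(re_string)

# Recursive version: factor out the already-matched prefix so it is
# not retested. Redefines the function above; needs len(s) >= 2.
def makeOffByOneMatchRE(s, i=0):
    if i == len(s) - 2:
        return '.' + s[-1] + '|' + s[-2] + '.'
    return '.' + s[i+1:] + '|' + s[i] + '(' + makeOffByOneMatchRE(s, i+1) + ')'

re_string = makeOffByOneMatchRE(base)
print(re_string)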
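
A hedged sketch of how the generated pattern might be used in practice. Two things are assumptions here and not part of the functions above: the literal parts are passed through re.escape so that characters like '.' or '(' in the input are not treated as regex syntax, and the whole alternation is wrapped in a group and anchored with \Z so that longer strings do not match. The name off_by_one_pattern is made up for illustration. Note that "bob narely" differs from "bob marley" in three positions, so a one-substitution pattern does not accept it.

```python
import re

def off_by_one_pattern(s):
    """Compile an anchored regex matching s with at most one character replaced."""
    # One alternative per position; literal parts are escaped (assumption:
    # the input should be treated as plain text, not as regex syntax).
    alternatives = (
        re.escape(s[:i]) + "." + re.escape(s[i + 1:]) for i in range(len(s))
    )
    # Group the alternation so the end anchor applies to every alternative.
    return re.compile("(?:" + "|".join(alternatives) + r")\Z")

pattern = off_by_one_pattern("bob marley")
for candidate in ["bob marley", "vob marley", "bob narley", "bob narely", "bob marleys"]:
    print(candidate, bool(pattern.match(candidate)))
```

This only covers substitutions; a missing or extra letter needs a different approach, such as an edit-distance check.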
-- Paul
Aug 11 '08 #2
ssecorp wrote:
> If I have an expression like "bob marley" and I want to match
> everything with one letter wrong, how would I do it?
> so "bob narely" and "vob marley" should match etc.
Fuzzy matching is better done using Levenshtein distance [1] or
n-gram matching [2].
Diez
[1] http://en.wikipedia.org/wiki/Levenshtein_distance
[2] http://en.wikipedia.org/wiki/Ngram#n...imate_matching
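
For reference, a minimal sketch of the Levenshtein distance suggested above, written as the textbook dynamic-programming version in plain Python. The function name levenshtein and the sample strings are chosen for illustration; this is not taken from any particular library.

```python
def levenshtein(a, b):
    """Return the number of single-character insertions, deletions and
    substitutions needed to turn a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]

target = "bob marley"
for candidate in ["vob marley", "bob narely", "bob marly"]:
    print(candidate, levenshtein(target, candidate))
```

"One letter wrong" then becomes a threshold check such as levenshtein(target, candidate) <= 1, which also accepts a single missing or extra letter.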
Aug 11 '08 #3
On Sun, Aug 10, 2008 at 9:10 PM, ssecorp <ci**********@gmail.com> wrote:
> If I have an expression like "bob marley" and I want to match
> everything with one letter wrong, how would I do it?
> so "bob narely" and "vob marley" should match etc.
At one point I needed something like this, so I did a straight port of
the double-metaphone code to Python.

It's horrible, it's ugly, it's non-pythonic in ways that make me
cringe, it has no unit tests, but it does work.

--
Stand Fast,
tjg. [Timothy Grant]
Aug 11 '08 #4
In article <7d**********************************@25g2000hsx.googlegroups.com>,
ssecorp <ci**********@gmail.com> wrote:
> If I have an expression like "bob marley" and I want to match
> everything with one letter wrong, how would I do it?
> so "bob narely" and "vob marley" should match etc.
difflib
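
A short sketch of what the difflib suggestion could look like, using SequenceMatcher.ratio() and get_close_matches() from the standard library. The candidate list and the 0.75 cutoff are assumptions picked for illustration; the right threshold depends on the data.

```python
import difflib

target = "bob marley"
candidates = ["bob narely", "vob marley", "bob dylan", "rob zombie"]

# ratio() returns a similarity score between 0.0 and 1.0.
for candidate in candidates:
    ratio = difflib.SequenceMatcher(None, target, candidate).ratio()
    print(candidate, round(ratio, 2))

# get_close_matches() returns the best candidates at or above the cutoff,
# most similar first.
matches = difflib.get_close_matches(target, candidates, n=3, cutoff=0.75)
print(matches)
```

Unlike the regex approach, this gives a similarity score and not a strict "exactly one letter wrong" test, so near misses with swapped letters can also be picked up.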
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

Adopt A Process -- stop killing all your children!
Aug 12 '08 #5
