spell checking

I was just curious whether there are any spell-checker modules for Python
that can guess what the user meant to type. I wrote a quick function that
splits a string into bigrams and then checks how many of them are
identical to the bigrams of a given word, which I think is how Google does
it. Support for trigrams etc. could be added, so I'm curious whether
anyone out there has done something more. Here's the script:

def StringsSimilarity(str1, str2):
    """Divides the two strings into bigrams and reports
    what percentage of them are equal"""
    str1 = str1.strip().lower()
    str2 = str2.strip().lower()
    bigramStr1 = []
    bigramStr2 = []
    currentList = bigramStr1
    i = 0
    j = 0

    # Empty versus non empty strings are never similar
    if not (str1 and str2):
        return 0

    # 100% match if equal
    if str1 == str2:
        return 1.0

    # Make strings equal length, simplifies things
    len1 = len(str1)
    len2 = len(str2)

    if len1 > len2:
        str2 = str2 + " "*(len1-len2)
    elif len2 > len1:
        str1 = str1 + " "*(len2-len1)

    len1 = len(str1)
    len2 = len(str2)

    currentString = str1

    # Generate bigrams
    while j < 2:
        i = 0
        while i < len1:
            if i+1 >= len1:
                currentList.append(currentString[i])
            else:
                currentList.append(currentString[i] + currentString[i+1])

            i += 2

        j += 1
        currentList = bigramStr2
        currentString = str2

    similarity = 0

    for i in range(len(bigramStr1)):
        if bigramStr1[i] == bigramStr2[i]:
            similarity += 1.0

    if similarity == 0:
        return 0

    return similarity/len(bigramStr1)

def StringsSimilar(str1, str2):
    """Using StringsSimilarity, decides if the two
    strings' score is good enough, 50%, to be
    considered similar"""
    return StringsSimilarity(str1, str2) >= 0.50
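
The script above cuts each string into non-overlapping character pairs and compares them position by position, so a single inserted or dropped letter shifts every later pair out of alignment. As a rough sketch of the "trigrams etc." idea, the version below uses overlapping n-grams and scores them with a Dice coefficient, which does not depend on position. This is a swapped-in technique, not the method of the script above, and the names `ngrams`, `ngram_similarity`, and `suggest` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def ngrams(text, n=2):
    """Return the overlapping character n-grams of text.

    The text is padded with n-1 spaces on each side so that the first
    and last letters take part in as many n-grams as the inner ones.
    """
    text = text.strip().lower()
    if not text:
        return []
    padded = " " * (n - 1) + text + " " * (n - 1)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def ngram_similarity(a, b, n=2):
    """Dice coefficient over n-gram multisets: 2 * shared / (count_a + count_b)."""
    grams_a = Counter(ngrams(a, n))
    grams_b = Counter(ngrams(b, n))
    # Empty versus anything is never similar, as in the original script
    if not grams_a or not grams_b:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram
    shared = sum((grams_a & grams_b).values())
    total = sum(grams_a.values()) + sum(grams_b.values())
    return 2.0 * shared / total


def suggest(word, candidates, n=2, cutoff=0.5):
    """Return the candidates scoring at least cutoff, best match first."""
    scored = [(ngram_similarity(word, c, n), c) for c in candidates]
    return [c for score, c in sorted(scored, reverse=True) if score >= cutoff]


if __name__ == "__main__":
    words = ["spelling", "spilling", "apple"]
    print(suggest("speling", words))        # bigrams
    print(suggest("speling", words, n=3))   # trigrams
```

Passing `n=3` switches to trigrams without any other change. The 0.5 cutoff simply mirrors the 50% threshold in `StringsSimilar`; it is an assumption, not a tuned value. For a ready-made alternative, the standard library's `difflib.get_close_matches` also returns close matches from a word list, although it is based on sequence matching rather than n-grams.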

Jul 18 '05 #1