spell checker for python

vino7
2 New Member
Hey guys, I'm having some trouble with this. I've got two text files: one is a paragraph containing some misspelled words, and the other is a dictionary file with one word per line. I'm not sure how to split the paragraph into one word per line so that I can compare it to the dictionary file. I've put the code up below; if anyone finds anything else wrong with it, please let me know. Thanks.


import sys

# setText holds the words from the sample text file
setText = set()
# setProperDic holds the words from the dictionary text file
setProperDic = set()
# setWrong holds the incorrect words after comparing each set
setWrong = set()

fText = open(sys.argv[1], 'r')
line = fText.readline()
while line != '':
    setText.add(line)
    line = fText.readline()
fText.close()

fProperDic = open(sys.argv[2], 'r')
line = fProperDic.readline()
while line != '':
    setProperDic.add(line)
    line = fProperDic.readline()
fProperDic.close()

# Find the values in setText
# that don't exist in setProperDic
setWrong = setText - setProperDic

# Output each entry separately
# in alphabetical order
for x in sorted(setWrong):
    print(x)
May 19 '10 #1
dwblas
626 Recognized Expert Contributor
You should strip the newline character(s) from the words read from each file. You can also read the file one record at a time and add each record to the set, as in the following code, although there is nothing wrong with the way you are doing it.
fText = open(sys.argv[1], 'r')
for rec in fText:
    rec = rec.strip()
    setText.add(rec)
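To see why the strip matters: every line read from a file keeps its trailing newline, so 'cat\n' from one file will never equal 'cat' from the other. Below is a minimal sketch of loading the dictionary this way. The helper name load_words and the sample contents are made up for illustration, and io.StringIO stands in for the real file.

```python
import io


def load_words(handle):
    """Return the set of non-empty lines from an open file, newlines stripped."""
    return {line.strip() for line in handle if line.strip()}


# io.StringIO stands in for open(sys.argv[2], 'r'); the contents are invented.
dictionary_file = io.StringIO("cat\nsat\nmat\n")
setProperDic = load_words(dictionary_file)

# Without strip() the entries would be 'cat\n', 'sat\n', 'mat\n'.
print(sorted(setProperDic))
```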
May 20 '10 #2
vino7
2 New Member
Ooh, thanks, but the problem I have is that the text I'm comparing with the dictionary is a paragraph, so I need to split the words so that it becomes one word per line. I think I should use split.lines but I don't know where that code should go.
May 20 '10 #3
Glenton
391 Recognized Expert Contributor
@vino7
Can you please use code tags when posting code? It's hard to read otherwise.

But it seems the issue is here:
fText = open(sys.argv[1], 'r')
line = fText.readline()
while line != '':
    setText.add(line)
    line = fText.readline()
fText.close()
Firstly, there's a far more efficient way to read files! And, assuming that each line has multiple words, you need to split each line. When debugging this kind of thing (which is a big part of programming), it's often helpful to include a print statement in your code so you can see what the variables are as the code is running, which will tell you straight away if it's doing what you want/expect.
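As a rough illustration of the print-debugging idea, the sketch below prints repr(line) inside the loop. The helper name and sample text are made up, and io.StringIO stands in for the real file. The output shows that each set entry is a whole line, newline included, rather than a single word.

```python
import io


def load_lines_debug(handle):
    """Add each raw line to a set, printing exactly what gets stored."""
    entries = set()
    for line in handle:
        print(repr(line))  # repr() makes the trailing '\n' visible
        entries.add(line)
    return entries


# io.StringIO stands in for open(sys.argv[1], 'r'); the text is invented.
sample = io.StringIO("teh cat sat\non the mat\n")
found = load_lines_debug(sample)
```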

For now I'll suppose that there's no punctuation, just words separated by spaces in each line.

fText = open(sys.argv[1], 'r')
for line in fText:  # note you can treat the file object as an iterable!
    setText.update(line.strip().split(" "))
fText.close()  # close once, after the loop has finished
The key line makes use of three useful methods:
- "update" is a set method and it adds all the elements of an iterable to a set (obviously, if there's a repetition only one gets added).
- "strip" is a string method that gets rid of white space (tabs, returns, spaces etc) at the beginning and end of a string
- "split" is a string method that returns a list of strings separated by the specified separator.

The below interactive session should help clarify:
In [22]: t="  hello hello hello mum how are you\n"

In [23]: t
Out[23]: '  hello hello hello mum how are you\n'

In [24]: t.strip()
Out[24]: 'hello hello hello mum how are you'

In [25]: t.split(" ")
Out[25]: ['', '', 'hello', 'hello', 'hello', 'mum', 'how', 'are', 'you\n']

In [26]: t.strip().split(" ")
Out[26]: ['hello', 'hello', 'hello', 'mum', 'how', 'are', 'you']

In [27]: s=set()

In [28]: s.update(t)

In [29]: s
Out[29]: set(['\n', ' ', 'a', 'e', 'h', 'l', 'm', 'o', 'r', 'u', 'w', 'y'])

In [30]: s=set()

In [31]: s.update(t.strip().split(" "))

In [32]: s
Out[32]: set(['are', 'hello', 'how', 'mum', 'you'])
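Putting the pieces together, here is one possible sketch of the whole checker. It goes a step beyond the no-punctuation assumption above: it uses split() with no argument (which copes with repeated spaces and tabs), strips punctuation from the ends of each token, and lowercases everything. Those choices, the helper names, and the sample data are assumptions made for illustration, not part of the original code.

```python
import io
import string


def words_in_text(handle):
    """Collect the distinct words in a paragraph-style file."""
    words = set()
    for line in handle:
        for token in line.split():  # no argument: split on any run of whitespace
            word = token.strip(string.punctuation).lower()
            if word:  # skip tokens that were only punctuation
                words.add(word)
    return words


def load_dictionary(handle):
    """Collect the words from a one-word-per-line dictionary file."""
    return {line.strip().lower() for line in handle if line.strip()}


# io.StringIO stands in for the two files; the contents are invented.
text_file = io.StringIO("The cat sat on teh mat.\nIt was  a goood day!\n")
dict_file = io.StringIO("the\ncat\nsat\non\nmat\nit\nwas\na\nday\n")

setWrong = words_in_text(text_file) - load_dictionary(dict_file)
for word in sorted(setWrong):
    print(word)
```

With real files, the two io.StringIO objects would be replaced by open(sys.argv[1], 'r') and open(sys.argv[2], 'r').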
May 21 '10 #4
