Hey guys, I'm having some trouble with this. I've got two text files: one containing a paragraph with some misspelled words, and a dictionary file with one word per line. I'm not sure how to separate the words in the paragraph into one word per line so that I can compare them to the dictionary file. I've put the code up, but if anyone finds anything else wrong with it, please let me know. Thanks.
import sys
# setText holds the words from the sample text file
setText = set()
# setProperDic holds the words from the dictionary text file
setProperDic = set()
# setWrong holds the incorrect words after comparing each set
setWrong = set()
fText = open(sys.argv[1], 'r')
line = fText.readline()
while line != '':
    setText.add(line)
    line = fText.readline()
fText.close()
fProperDic = open(sys.argv[2], 'r')
line = fProperDic.readline()
while line != '':
    setProperDic.add(line)
    line = fProperDic.readline()
fProperDic.close()
# Find the values in setText
# that don't exist in setProperDic
setWrong = setText - setProperDic
# Output each entry separately
# in alphabetical order
for x in sorted(setWrong):
    print(x)
dwblas, Recognized Expert Contributor
You should strip the newline character(s) from the words read from the file. You can also read the file one record at a time and add each record to the set, as in the following code, although there is nothing wrong with the way you are doing it.
fText = open(sys.argv[1], 'r')
for rec in fText:
    rec = rec.strip()
    setText.add(rec)
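To see what the stripping changes, here is a small sketch. It uses `io.StringIO` as a stand-in for the opened file and made-up words, both assumptions so that it runs without any files on disk.

```python
import io

# io.StringIO stands in for the opened file (an assumption for this sketch).
# Note that the last line has no trailing newline, as often happens.
fake_file = io.StringIO("apple\nbanana\ncherry")

raw = set()
stripped = set()
for rec in fake_file:
    raw.add(rec)               # keeps the trailing newline, if there is one
    stripped.add(rec.strip())  # newline and surrounding whitespace removed

print(sorted(raw))       # ['apple\n', 'banana\n', 'cherry']
print(sorted(stripped))  # ['apple', 'banana', 'cherry']
```

Without the strip, 'apple\n' and 'apple' count as different entries, so a word can be reported as wrong only because one copy of it still carries a newline.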
Ooh, thanks, but the problem I have is that the text I'm comparing with the dictionary is in a paragraph, so I need to split the words so that it becomes one word per line. I think I should use split.lines, but I don't know where that code should go.
Glenton, Recognized Expert Contributor
@vino7
Can you please use code tags when posting code? It's hard to read otherwise.
But it seems the issue is here:
fText = open(sys.argv[1], 'r')
line = fText.readline()
while line != '':
    setText.add(line)
    line = fText.readline()
fText.close()
Firstly, there's a far more efficient way to read files! And, assuming that each line has multiple words, you need to split each line. When debugging this kind of thing (which is a big part of programming), it's often helpful to include a print statement in your code so you can see what the variables are as the code is running, which will tell you straight away if it's doing what you want/expect.
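As a rough illustration of the print-statement tip, the sketch below prints what actually ends up in the set. The sample sentence is made up, and `io.StringIO` stands in for the real text file (both are assumptions).

```python
import io

# Hypothetical one-line "paragraph" file, simulated with io.StringIO.
sample = io.StringIO("the quick brwon fox\n")

words = set()
for line in sample:
    print(repr(line))  # repr() makes the trailing '\n' visible
    words.add(line)

# The set holds a single element: the whole line, not the individual words.
print(words)
```

Seeing one long string in the set, rather than four words, points straight at the missing split.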
For now I'll suppose that there's no punctuation, just words separated by spaces in each line.
fText = open(sys.argv[1], 'r')
for line in fText:  # note you can treat the file object as an iterable!
    setText.update(line.strip().split(" "))  # see interactive session below
fText.close()
The key line makes use of three useful methods:
- "update" is a set method and it adds all the elements of an iterable to a set (obviously, if there's a repetition only one gets added).
- "strip" is a string method that gets rid of white space (tabs, returns, spaces etc) at the beginning and end of a string
- "split" is a string method that returns a list of strings separated by the specified separator.
The below interactive session should help clarify:
In [22]: t="  hello hello hello mum how are you\n"

In [23]: t
Out[23]: '  hello hello hello mum how are you\n'

In [24]: t.strip()
Out[24]: 'hello hello hello mum how are you'

In [25]: t.split(" ")
Out[25]: ['', '', 'hello', 'hello', 'hello', 'mum', 'how', 'are', 'you\n']

In [26]: t.strip().split(" ")
Out[26]: ['hello', 'hello', 'hello', 'mum', 'how', 'are', 'you']

In [27]: s=set()

In [28]: s.update(t)

In [29]: s
Out[29]: set(['\n', ' ', 'a', 'e', 'h', 'l', 'm', 'o', 'r', 'u', 'w', 'y'])

In [30]: s=set()

In [31]: s.update(t.strip().split(" "))

In [32]: s
Out[32]: set(['are', 'hello', 'how', 'mum', 'you'])
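Putting the pieces together, a minimal sketch of the whole check might look like this. The function names `load_words` and `misspelled` and the sample data are made up for illustration, and it still assumes no punctuation and matching upper/lower case. It uses `split()` with no argument, which splits on any run of whitespace, so doubled spaces inside a line don't produce empty strings the way `split(" ")` does.

```python
def load_words(lines):
    """Collect whitespace-separated words from an iterable of lines."""
    words = set()
    for line in lines:
        # split() with no argument splits on any run of whitespace and
        # ignores leading/trailing whitespace, so no strip() is needed.
        words.update(line.split())
    return words


def misspelled(text_lines, dict_lines):
    """Return, sorted, the words in the text that are not in the dictionary."""
    return sorted(load_words(text_lines) - load_words(dict_lines))


# Made-up sample data standing in for the two files.
text = ["the quick brwon fox\n", "jumps  over the lazzy dog\n"]
dictionary = ["the\n", "quick\n", "brown\n", "fox\n",
              "jumps\n", "over\n", "lazy\n", "dog\n"]

result = misspelled(text, dictionary)
print(result)  # ['brwon', 'lazzy']
```

Because a file object is itself an iterable of lines, the same functions should work with the real files, for example by opening `sys.argv[1]` and `sys.argv[2]` and passing the two file objects to `misspelled`.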