I have a text file with words in it, and I had to collect the incorrect words into a set, which I have done.
Now I need to find which lines of the text file the incorrect words are on and print the result as a dictionary.
For example, the output should look like this:
togeher 3 4 # "togeher" being the incorrect word, and 3 and 4 the line numbers where it is located in the text file
I know I need to use a line counter, but I don't know how to use it.
words = [] # is my txt file
text1 # is my set of incorrect words
I have done this so far:
d = {} # an empty dictionary
key = text1
value = linecounter # don't know what to assign the value to
for
bvdet (Expert Mod)
Assume you have a function correct(word) that returns False if a word is incorrect. The following untested code would compile a dictionary of the incorrect words, where the words are keys and the line numbers are contained in a list associated with each key.
    f = open("words.txt")
    dd = {}
    # each line of the file is treated as one word; i starts at 0
    for i, word in enumerate(f):
        word = word.strip()  # drop the trailing newline before using the word as a key
        if not correct(word):
            dd.setdefault(word, []).append(i)
    f.close()
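A minimal sketch of the same idea in Python 3, assuming the file holds several words per line rather than one. The names `find_incorrect` and `sample` are hypothetical, the lambda is only a stand-in for whatever `correct` check you end up with, and `io.StringIO` replaces the real file so the example is self-contained.

```python
import io

def find_incorrect(lines, correct):
    """Map each incorrect word to the list of line numbers it appears on."""
    dd = {}
    # start=1 makes the first line number 1 instead of 0
    for lineno, line in enumerate(lines, start=1):
        for word in line.split():
            if not correct(word):
                dd.setdefault(word, []).append(lineno)
    return dd

# Stand-in for a real file opened with open("words.txt")
sample = io.StringIO("we came togeher\nlook here\nwe came togeher\n")

# Hypothetical check: anything outside this small set counts as incorrect
result = find_incorrect(sample, lambda w: w in {"we", "came", "look", "here"})
print(result)
```

Iterating over the lines and splitting each one keeps the line number available at the moment a bad word is found, which is the piece of information a flat list of words no longer has.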
@bvdet
Cheers for your reply. I've done this:
    d = {}
    def correct(word):
        for i, word in enumerate(words):
            if not correct(word.strip()):
                d.setdefault(word, []).append(i)
    print(d)
But how would I print each word with its line numbers next to it?
e.g.
loo: 5 8 # numbers being the line numbers
So would I create something like
keys = text1 # text1 being my incorrect words
Then what, though?
bvdet (Expert Mod)
Please use code tags when posting code.
To print the contents of a dictionary, iterate on the dictionary and format the printing of the key and value. In this case, the value is a list of line numbers, so...
    >>> dd = {"word1": [6,9], "word2": [5], "word3": [0,3,8]}
    >>> for key in dd:
    ...     print "%s: %s" % (key, ", ".join([str(n) for n in dd[key]]))
    ...
    word1: 6, 9
    word3: 0, 3, 8
    word2: 5
    >>>
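For reference, a Python 3 sketch that produces the "loo: 5 8" layout asked for earlier in the thread. The helper name `format_entry` and the sample dictionary are assumptions made for illustration.

```python
dd = {"loo": [5, 8], "togeher": [3, 4]}

def format_entry(word, line_numbers):
    """Return e.g. 'loo: 5 8' for a word and its list of line numbers."""
    return "%s: %s" % (word, " ".join(str(n) for n in line_numbers))

# sorted() gives a predictable order regardless of how the dict was built
for word in sorted(dd):
    print(format_entry(word, dd[word]))
```

Each number has to be converted with str() first because join() only accepts strings.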
I have only been programming for a little over two months, so I'm not fully following you.
So is the code that I previously posted correct?
And do I have to add what you said in your post?
bvdet (Expert Mod)
Your definition of function correct() is not right. You did not understand my post, which explained that you need a function or some code that decides for you whether a word is correct or not. If a word is correct, do not add it to the dictionary. If a word is incorrect, add it. In pseudocode:
    def correct(word):
        If word is a correct word, return True
        Else, word is incorrect, return False
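One way the pseudocode could be filled in, sketched in Python 3 on the assumption that "correct" means "present in a word list". The names `load_known_words` and `make_checker` are hypothetical, and the dictionary text is inlined here instead of being read from a file.

```python
import string

def load_known_words(dictionary_text):
    """Build a set of lower-cased words from whitespace-separated text."""
    return set(w.lower() for w in dictionary_text.split())

def make_checker(known):
    """Return a correct(word) function bound to the given word set."""
    def correct(word):
        # Ignore surrounding punctuation and case before the lookup
        cleaned = word.strip(string.punctuation).lower()
        # A token that was only punctuation is not treated as a misspelling
        return cleaned == "" or cleaned in known
    return correct

known = load_known_words("hello mum how are you")
correct = make_checker(known)
print(correct("Hello,"), correct("helo"))
```

A set is used because membership tests on it stay fast even for a large dictionary file.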
My post regarding printing: I used a list comprehension to create a string of the line numbers. It is equivalent to:
    >>> dd[key]
    [5, 12, 16]
    >>> tem = []
    >>> for n in dd[key]:
    ...     tem.append(str(n))
    ...
    >>> ", ".join(tem)
    '5, 12, 16'
    >>>
You don't have to do what he said, but your code is not correct. You call correct inside your own definition of correct. Of course this is actually allowed (it is recursion), but I don't think it's what you intend, and it wouldn't work in your situation.
What @bvdet was asking was: how are you going to determine whether a word is correct or not?
@Glenton
I have already created a list of incorrect words by comparing the text file with a dictionary. Now all I want to do is find the line number of each incorrect word in the text file.
I have tried another piece of code:
    from collections import defaultdict
    d = defaultdict(list)
    for lineno, word in enumerate(words):
        if word in text1:
            d[word].append(lineno)
    print(d)
But it prints each incorrect word with its position in the list, not the line it is located on.
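The difference between a word's position and its line number can be seen in a small Python 3 sketch. The sample `lines` and the names `by_position` and `by_line` are made up for illustration; `text1` plays the role of the set of incorrect words.

```python
from collections import defaultdict

lines = ["helo mum", "how arew you", "helo again"]
text1 = {"helo", "arew"}  # the set of incorrect words

# Flattened list: the file's line structure is gone at this point
words = [w for line in lines for w in line.split()]

# Enumerating the flat list counts words, not lines
by_position = defaultdict(list)
for pos, word in enumerate(words):
    if word in text1:
        by_position[word].append(pos)

# Enumerating the lines and splitting each one keeps the line number
by_line = defaultdict(list)
for lineno, line in enumerate(lines, start=1):
    for word in line.split():
        if word in text1:
            by_line[word].append(lineno)

print(dict(by_position))
print(dict(by_line))
```

enumerate() numbers whatever it is given, so the loop has to run over lines if line numbers are wanted.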
Okay, re-reading it, it seems we've been misunderstanding you about the line numbers. Sorry about that.
Am I right in saying that your original text file has a bunch of lines, each with a bunch of words, and you're trying to figure out which line the incorrectly spelled words are on, but all you have is the words list?
I don't see how this is possible. There seems to be no information linking the words in the words list to the line numbers in the original file. So probably the best thing is to do this when you're extracting the information from the file in the first place! I.e. fiddle around with the code that you used to create the list of incorrect words.
Regarding getting the line number, something like this will work fine:
    myfile = open("file.txt")
    for i, line in enumerate(myfile):
        print i+1, line  # i starts from 0, so if you don't want that, you need to add 1
    myfile.close()
@Glenton
This is what I have:
# text is a list of my txt file
# words is a list of my incorrect words
I want to find the line numbers of the incorrect words in the text file.
@lightning18
Oh, so you're just looking for the index method.
    In [5]: text="helo mum how arew you".split(" ")

    In [6]: text
    Out[6]: ['helo', 'mum', 'how', 'arew', 'you']

    In [7]: words=["arew","helo"]

    In [8]: for w in words:
       ...:     print w, text.index(w)
       ...:
       ...:
    arew 3
    helo 0
-
A quick browse through the python docs or a text book or whatever is a good idea just to get a feel for what's possible.
Unless I'm still not understanding what you're wanting!
Oh, so maybe the word appears multiple times. Similar idea.
Eg this function:
    def findLineNos(text, word):
        "returns a list of all the line numbers where word appears"
        ans = []
        reps = text.count(word)
        n = 0
        for i in range(reps):
            ans.append(text[n:].index(word) + n)
            n = ans[-1] + 1  # resume the search just past the last match
        return ans

    text = "helo mum how arew you helo mum how arew you".split(" ")
    words = ["arew", "helo", "false"]

    for w in words:
        print w, findLineNos(text, w)
returns this:
    arew [3, 8]
    helo [0, 5]
    false []
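Note that the numbers in this output are positions within a list of words. A line-based variant might look like the Python 3 sketch below, assuming the text is held as a list of lines. The name `find_line_numbers` and the sample lines are hypothetical.

```python
def find_line_numbers(lines, word):
    """Return the 1-based numbers of the lines that contain word as a whole token."""
    return [n for n, line in enumerate(lines, start=1) if word in line.split()]

lines = ["helo mum", "how arew you", "helo mum how arew you"]
for w in ["arew", "helo", "false"]:
    print(w, find_line_numbers(lines, w))
```

Testing against line.split() rather than the raw line avoids matching "helo" inside a longer word.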
Cheers Glenton, that is pretty much what I want: a set of incorrect words and the line numbers where each is located in the text file. However, I get an error. The error is:
SyntaxError: invalid syntax
This is my code:
    import sys
    import string

    text = []
    infile = open(sys.argv[1], 'r').read()
    for punct in string.punctuation:
        infile = infile.replace(punct, "")
    text = infile.split()

    dict = open(sys.argv[2], 'r').read()
    dictset = []
    dictset = dict.split()

    words = []
    words = list(set(text) - set(dictset))
    words = [text.lower() for text in words]
    words.sort()

    def findline(text, word):
        ans = []
        reps = text.count(word)
        n = 0
        for i in range(reps):
            ans.append(text[n:].index(word)+n)
            n = text[n:].index(word)+1
        return ans
    for w in words:
        print(w,findline(text, w)
You'll need to be more specific than that about the error. I can't run your file because I don't have your inputs, so I'm guessing just by reading your code.
However, looking at it, it seems that text is a list of words, with no line information. Changing your line 8 to
    text = infile.split("\n")
will mean that text is a list of the lines from the text file, rather than a list of words.
This should make it possible.
Still get the same error:
    import sys
    import string

    text = []
    infile = open(sys.argv[1], 'r').read()
    for punct in string.punctuation:
        infile = infile.replace(punct, "")
    text = infile.split("\n")

    dict = open(sys.argv[2], 'r').read()
    dictset = []
    dictset = dict.split()

    words = []
    words = list(set(text) - set(dictset))
    words = [text.lower() for text in words]
    words.sort()


    def findline(text, word):
        ans = []
        reps = text.count(word)
        n = 0
        for i in range(reps):
            ans.append(text[n:].index(word)+n)
            n = text[n:].index(word)+1
        return ans
    for w in words:
        print(w,findline(text, w)
I wasn't trying to solve your syntax error. You need to post more details or do your own debugging. There's normally a clue about where the syntax error is from whatever compiler you're using.
Although, looking through it, your line 29 is wrong: it should be print w,etc instead of print(w,etc
Incidentally, lines 4, 7 and 14 are not needed.
@Glenton
I'm using Python,
and this is the first program I have ever made in my life,
so I'm not good with syntax.
The code I showed you is everything I have.
The error I'm getting is on line 29 every time.
@lightning18
As I said in my previous message, replace line 29 with this:
    print w,findline(text, w)
Your current line 29 doesn't have matching brackets.
@Glenton
It still gives me a syntax error for
    print w,findline(text, w)
in this code:
    import sys
    import string


    infile = open(sys.argv[1], 'r').read()
    for punct in string.punctuation:
        text = infile.split()

    dict = open(sys.argv[2], 'r').read()
    dictset = []
    dictset = dict.split()

    words = list(set(text) - set(dictset))
    words = [text.lower() for text in words]
    words.sort()


    def findline(text, word):
        ans = []
        reps = text.count(word)
        n = 0
        for i in range(reps):
            ans.append(text[n:].index(word)+n)
            n = text[n:].index(word)+1
        return ans
    for w in words:
        print w,findline(text, w)
I'm afraid I have no idea why. You're going to have to debug it. This is a normal part of coding. Try commenting out bits of the code and rerunning, until you narrow it down to where it is.
Good luck!
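One possible explanation, offered as an assumption because the thread never states the Python version: under Python 3, print is a function, so `print w,findline(text, w)` is itself a syntax error, and the original line 29 only lacked its closing parenthesis, i.e. `print(w, findline(text, w))`. The sketch below outlines the whole task in Python 3 under that assumption. The function name `misspelled_lines` and the sample strings are hypothetical; in the real script the two texts would be read from the files named in sys.argv[1] and sys.argv[2]. Punctuation stripping is kept here, since without it "mum," would never match "mum".

```python
import string

def misspelled_lines(text, dictionary_text):
    """Map each word missing from the dictionary to the lines it appears on."""
    known = set(w.lower() for w in dictionary_text.split())
    # Translation table that deletes every punctuation character
    strip_punct = str.maketrans("", "", string.punctuation)
    result = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        for word in line.translate(strip_punct).lower().split():
            if word not in known:
                numbers = result.setdefault(word, [])
                # Record each line once, even if the word repeats on it
                if lineno not in numbers:
                    numbers.append(lineno)
    return result

sample_text = "Helo mum,\nhow arew you?\nHelo again, mum."
sample_dictionary = "hello mum how are you again"

found = misspelled_lines(sample_text, sample_dictionary)
for word in sorted(found):
    print(word, *found[word])
```

Building the dictionary while the lines are still being read avoids having to search the text again for every incorrect word.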