I was browsing to see if I could find something similar to my problem, but I couldn't find anything.
I have a script that counts every word in a file and reports how many times each word occurs. I now have a directory containing about 60 text files that I need to run this script on. Since I'm not really a star at programming, I made a second script that puts all those files into one file, and then that one file runs through the counting script.
What I actually want is to skip that extra step: adjust the counting script so that it loops through the files in the given directory and counts all the words in all the files, so that my output is the same as if they were in one file.
Below is what I have so far. Too bad I'm just not that good at combining scripts.
import re
lineList = open(r'/blablabla/bla/bla/file.txt').readlines()
pat = "\w+"
wordList = []
for line in lineList:
    wordList += [w.lower() for w in re.findall(pat, line)]
wordCnt = [wordList.count(w) for w in wordList]
dd = dict(zip(wordList, wordCnt))
for item in dd:
    if dd[item] > 40 and dd[item] < 200:
        print "Word '%s' occurs %d times." % (item, dd[item])
Then, instead of the lineList = open(...) line, I think I should start with something like this:
dirname = r'c:/blablabla/bla/bla/'
I'm really getting confused/stressed by this. Could any of you perhaps give me a subtle hint to help me out?
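For what it's worth, here is a minimal sketch of one direction the combined script could take. It is not a definitive answer, and it rests on some assumptions: Python 2.7 or later (for collections.Counter), only files ending in .txt are wanted, and the files sit directly in the directory rather than in subfolders. The names count_words_in_dir and report are made up for illustration. It walks the directory with os.listdir, feeds every line of every file into a single Counter, and then applies the same "more than 40, fewer than 200" filter as the script above.

```python
import os
import re
from collections import Counter

WORD_PATTERN = re.compile(r"\w+")


def count_words_in_dir(dirname, extension=".txt"):
    """Count lowercased words across every matching file directly in dirname.

    The extension filter is an assumption; the original post only says
    the directory holds about 60 text files.
    """
    counts = Counter()
    for name in sorted(os.listdir(dirname)):
        path = os.path.join(dirname, name)
        # Skip subdirectories and files with a different extension.
        if not (os.path.isfile(path) and name.endswith(extension)):
            continue
        with open(path) as handle:
            for line in handle:
                counts.update(w.lower() for w in WORD_PATTERN.findall(line))
    return counts


def report(counts, low=40, high=200):
    """Return one message per word whose count lies strictly between low and high."""
    return ["Word '%s' occurs %d times." % (word, counts[word])
            for word in sorted(counts)
            if low < counts[word] < high]


# Possible use, with the placeholder path from the post:
#     for message in report(count_words_in_dir(r'c:/blablabla/bla/bla/')):
#         print(message)
```

Because all files update the same Counter, the totals should match what the merged single file gives. A side benefit is that Counter tallies each word in one pass, whereas calling wordList.count(w) for every word rescans the whole list each time and gets slow on large inputs.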