Getting word frequencies from files which are in folder.

krisbee1983

Hello to all,

I'm beginer in learning Python I wish somebody help me with solving
this problem. I would like to read all text files wchich are in some
folder. For this text files I need to make some word frequencies using
defined words like "buy", "red", "good". If some file don't have that
word will get "0" for this frequency. It shoud be stored in array. If
I have alredy got frequencies for every file in folder, my array wrote
to text file.

I will be very gratefully for receiving any help.

Apr 4 '07 #1

Subscribe Reply

2113

Irmen de Jong

kr*********@gma il.com wrote:

Hello to all,

I'm beginer in learning Python I wish somebody help me with solving
this problem. I would like to read all text files wchich are in some
folder. For this text files I need to make some word frequencies using
defined words like "buy", "red", "good". If some file don't have that
word will get "0" for this frequency. It shoud be stored in array. If
I have alredy got frequencies for every file in folder, my array wrote
to text file.

This sounds suspiciously like a homework assignment.
I don't think you'll get much help for this one, unless
you show some code you wrote yourself already with a specific
question about problems you're having....

--Irmen

Apr 4 '07 #2

krisbee1983

This sounds suspiciously like a homework assignment.

I don't think you'll get much help for this one, unless
you show some code you wrote yourself already with a specific
question about problems you're having....

Well you have some right. I will make it more specific.
I have got something like that:

import os, os.path

def wyswietlanie_dr zewa(dir_path):
#function is reading folders and sub folders until it gets to a file.
for name in os.listdir(dir_ path):
full_path = os.path.join(di r_path, name)
print full_path
if os.path.isdir(f ull_path):
wyswietlanie_dr zewa(full_path)

My question is how to get word frequencies from this files?
I will be glad to get any help.

Krisbee

Apr 4 '07 #3

Terry Reedy

<kr*********@gm ail.comwrote in message
news:11******** **************@ e65g2000hsc.goo glegroups.com.. .
|
| My question is how to get word frequencies from this files?
| I will be glad to get any help.

Go to
http://groups.google.com/group/comp.lang.python/topics
and search on "count word frequency" and you will find several previous
posts on this topic.

tjr

Apr 4 '07 #4

7stud

On Apr 4, 2:07 pm, krisbee1...@gma il.com wrote:

My question is how to get word frequencies from this files?
I will be glad to get any help.

--files have a read(), readline(), and readlines() method
--strings have a split() method, which splits the string on
whitespace(e.g. spaces)
--lists have a count() method

Apr 5 '07 #5

Alex Martelli

<kr*********@gm ail.comwrote:

This sounds suspiciously like a homework assignment.
I don't think you'll get much help for this one, unless
you show some code you wrote yourself already with a specific
question about problems you're having....

Well you have some right. I will make it more specific.
I have got something like that:

import os, os.path

def wyswietlanie_dr zewa(dir_path):
#function is reading folders and sub folders until it gets to a file.
for name in os.listdir(dir_ path):
full_path = os.path.join(di r_path, name)
print full_path
if os.path.isdir(f ull_path):
wyswietlanie_dr zewa(full_path)

My question is how to get word frequencies from this files?
I will be glad to get any help.

You may want to consider os.walk as an alternative way to get all files;
it's easy to wrap it into a generator yielding all files in the subtree.

This, I would think, is the proper factoring in Python: have a generator
yielding each file, and a function taking a file and returning the word
frequencies for that one file. This neatly separates the two halves of
the task -- and you can easily factor things down further...

Give a text file, you can iterate on it: the items are the lines. Given
a line, you can extract all words in it and iterate on those: look at
the re module, and the \w feature of regular-expression pattern strings.
So, a generator that turns a file into a stream of words is also an easy
sub-task to accomplish.

Given a stream of words, and a set of "interestin g words", it's easy to
count the occurrences of interesting words. There, I'll supply that
part, to entice you to write the others, and thereby perhaps learn some
Python...:

def count_interesti ng_words(all_wo rds, interesting_wor ds):
d = dict.fromkeys(i nteresting_word s, 0)
for word in all_words:
if word in d: d[word] += 1
return d
Alex

Apr 5 '07 #6

Similar topics

6089

Word Table to Access

by: Ruby Tuesday | last post by:

Hi, I was wondering if expert can give me some lite to convert my word table into access database. Note: within each cell of my word table(s), some has multi-line data in it. In addition, there is one row containing picture(s) as well. So far, what I did is doing it manually for each word docs I have. Select Table Convert Table to Text(I use ^ character for delimiter)

Microsoft Access / VBA

1145

Getting program to wait

by: ken | last post by:

I'm using VB.Net to process information out of a word document. If the document fails a test, I would like to close it and move it to an error folder. However, when I try to do that (see below) it says that another process is using the document. Is there some way to force the move to wait for the doc to close? Thanks. Dim wrd As Word.Application = New Word.Application Dim doc As Word.Document For Each sFile In Files strName =...

Visual Basic .NET

2862

Open word files using combo

by: jpr | last post by:

Hello, I have a form with a cbo which get's its data from a table. This combo returns names of MS Word files in the following path: C:\shares\files\*.dot I would like to open these files (actually it should open a copy of the template and not the dot file itself) using the OnChange event of my cbo. Is there a way or some code to help me? Thanks

Microsoft Access / VBA

2467

Hyperlinking to external word and excel files from within access

by: reb0101 | last post by:

hey all, I would very much appreciate any help or ideas on how to do this as I am stumped. I need to develop an access database to track documents but also link to them. I’ll explain what it needs to do; Every day there is a numbered (and titled) Word format document that is sent. Most, but not all of the time an accompanying excel file is also sent. The excel file is used for updates to the word document of the same name. Lets say...

Microsoft Access / VBA

1859

program that reads a text file and prints out the word frequencies using structure

by: tnt84 | last post by:

I want to write a program that reads a text file and prints out the word frequencies using structure but I don't know exactly what the word frequency is and how can I write that program using structure

C / C++

3444

word automation vb.net

by: Steve | last post by:

I've been building an application that will merge fields in a text file with a word template, save the resulting word file out to the user's hard drive, and then email the file as an attachment. The problem I'm having is that I can't delete the word file I saved at the end of the process due to the file being locked by the email process. It appears to take longer than the code takes to complete due to virus checking software that...

Visual Basic .NET

1151

Help ASP.NET 2.0 and WORD

by: SteveM | last post by:

Hi, I am needing some help/advice on how to display a word document in my ASP.NET web pages that can update itself from a word document located on the server. The idea here is that when the user makes changes to the document and uploads it to the website, that a refresh of the page will load the changes. Ideally what I would like would be to render the document as HTML because of the hyperlinks in the document that work nicely as HTML but...

ASP.NET

5816

getting data from CSV file into a Flexgrid

by: ArmageddonAsh | last post by:

I'm trying to make an application that will allow the user to enter data into a flexgrid (that's done) and then save the data from that flexgrid into a CSV file but even though the file is made none of the data from the flexgrid goes into the CSV file and so I have a couple of questions. 1, How do I make the application load the data from a specific CSV file? 2, How do I make it so that the user is able to add information to the data and...

Visual Basic 4 / 5 / 6

2799

getting absolute directory path?

by: lawpoop | last post by:

Hello all - I have a two part question. First of all, I have a website under /home/user/www/. The index.php and all the other website pages are under /home/user/www/. For functions that are used in multiple files, I have php files under / home/user/www/functions/. These files simply have So, in index.php and other files, I have

PHP

8392

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...

Windows Server

8912

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...

C / C++

8819

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...

Online Marketing

7428

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...

Career Advice

6222

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...

Microsoft Access / VBA

5692

Couldn’t get equations in html when convert word .docx file to html file in C#.

by: conductexam | last post by:

I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...

C# / C Sharp

4222

Trying to create a lan-to-lan vpn between two differents networks

by: TSSRALBI | last post by:

Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...

Networking - Hardware / Configuration

4403

Windows Forms - .Net 8.0

by: adsilva | last post by:

A Windows Forms form does not have the event Unload, like VB6. What one acts like?

Visual Basic .NET

2809

transfer the data from one system to another through ip address

by: 6302768590 | last post by:

Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system

C# / C Sharp