My code tries to get word pairs from multiple files but gives an error, please help

How can I get every token (word) and previous token (previous word) from multiple files, and the frequency of each two-word pair?

My code tries to get every single word and every word pair (each token together with its previous token) from multiple files and count the frequency of both. It works for single words, but the word pairs give this error:

line 50, in most_frequant_word
word1+= ' ' + word_list[ix+1]
IndexError: list index out of range


import os
import Tkinter as tk
import tkFileDialog

def most_frequant_word():
    # let the user pick a directory, then walk every file below it
    browser = tkFileDialog.askdirectory()
    word_freq = {}     # single word -> count
    word_freq1 = {}    # two-word string -> count
    count11 = 0        # running total of tokens over all files
    for root, dirs, files in os.walk(browser):
        text1.insert(tk.INSERT, 'Found %d dirs and %d files' % (len(dirs), len(files)))
        text1.insert(tk.INSERT, "\n")
        for idx, file in enumerate(files):
            ff = open(os.path.join(root, file), "r")
            text = ff.read()
            ff.close()
            word_list = text.split()
            my_list = text.split()
            count11 = len(word_list) + count11
            text1.insert(tk.INSERT, "total number of tokens %s" % count11)
            text1.insert(tk.INSERT, "\n")
            for ix, word in enumerate(word_list):
                word = word.lower()
                word = word.rstrip('.,/"\\ -_;[](){} ')
                # join the word with the word that follows it
                word1 = word
                word1 += ' ' + word_list[ix+1]   # <-- IndexError is raised here
                # build the dictionaries
                count = word_freq.get(word, 0)
                word_freq[word] = count + 1
                count1 = word_freq1.get(word1, 0)
                word_freq1[word1] = count1 + 1
            # create lists of (freq, word) tuples
            freq_list = [(freq, word) for word, freq in word_freq.items()]
            freq_list1 = [(freq1, word1) for word1, freq1 in word_freq1.items()]
            # sort by the first element in each tuple (the frequency)
            freq_list.sort(reverse=True)
            freq_list1.sort(reverse=True)
            for n, tup in enumerate(freq_list1):
                text1.insert(tk.INSERT, "%s times: %s" % tup)
                text1.insert(tk.INSERT, "\n")

root = tk.Tk(className=" most_frequant_word")
# text entry field, width=width chars, height=lines text
text1 = tk.Text(root, width=50, height=50, bg='green')
text1.pack()
# function listed in command will be executed on button click
button1 = tk.Button(root, text='Browse', command=most_frequant_word)
button1.pack(pady=5)
text1.focus()
root.mainloop()

What the code is supposed to do:
For example, if the text file content is
"Every man has a price. Every woman has a price."

First token (word) is "Every", previous token (previous word) is none (no previous)
Second token (word) is "man", previous token (previous word) is "Every"
Third token (word) is "has", previous token (previous word) is "man"
Fourth token (word) is "a", previous token (previous word) is "has"
Fifth token (word) is "price", previous token (previous word) is "a"

Sixth token (word) is "Every", previous token (previous word) is none (no previous)
Seventh token (word) is "woman", previous token (previous word) is "Every"
Eighth token (word) is "has", previous token (previous word) is "woman"
Ninth token (word) is "a", previous token (previous word) is "has"
Tenth token (word) is "price", previous token (previous word) is "a"

Frequency of "has a" is 2 (occurs in both the first and second sentence)
Frequency of "a price" is 2 (occurs in both the first and second sentence)
Frequency of "Every man" is 1 (occurs only once)
Frequency of "man has" is 1 (occurs only once)
Frequency of "Every woman" is 1 (occurs only once)
Frequency of "woman has" is 1 (occurs only once)

Please, I need help.

May 18 '08 #1


Laharl


First, please post only one thread per question; this should probably have gone in your other thread. Your error occurs because, when you reach the last element of the list, ix+1 is one past the final valid index, so word_list[ix+1] raises the IndexError.
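To see the failure in isolation, here is a minimal sketch that runs the same `word_list[ix+1]` lookup on a made-up three-word list (the input is only an illustration, not taken from your files) and records the index at which it fails:

```python
# Hypothetical three-word input, just to reproduce the problem.
word_list = ["every", "man", "has"]

pairs = []
failed_at = None
for ix, word in enumerate(word_list):
    try:
        # Same lookup as in the posted code.
        pairs.append(word + ' ' + word_list[ix + 1])
    except IndexError:
        # Reached when ix is the last valid index, so ix + 1 == len(word_list).
        failed_at = ix

print(pairs)      # ['every man', 'man has']
print(failed_at)  # 2
```

The lookup works for every word except the last one, which has no following word to pair with.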

May 18 '08 #2

alivip

OK, but how can I solve it?

May 18 '08 #3

jlm699


 
if ix == len(my_list - 1):

    break

May 19 '08 #4

Laharl


if ix == len(my_list - 1):

    break

Surely you mean:


 
if ix == len(my_list)-1:

    break
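One thing to watch: in the posted loop the two-word string is built before the single word is counted, so a `break` placed just ahead of the `ix+1` lookup would also skip counting the last word on its own. A possible alternative, sketched here with a hypothetical helper name `count_words_and_pairs` and plain dictionaries like the original, is to count the single word first and guard only the pair:

```python
def count_words_and_pairs(word_list):
    """Return (single-word counts, two-word counts) for one list of tokens."""
    word_freq = {}
    pair_freq = {}
    for ix, word in enumerate(word_list):
        # Count the single word first, so the last word is not lost.
        word_freq[word] = word_freq.get(word, 0) + 1
        # Only build a pair while a next word exists.
        if ix + 1 < len(word_list):
            pair = word + ' ' + word_list[ix + 1]
            pair_freq[pair] = pair_freq.get(pair, 0) + 1
    return word_freq, pair_freq

# Made-up input for illustration.
words, pairs = count_words_and_pairs("a b a b a".split())
print(words)
print(pairs)
```

This sketch leaves out the Tkinter and file-walking parts and assumes the tokens have already been lowercased and stripped of punctuation.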

May 19 '08 #5

jlm699


Oh yeah, absolutely. Thanks for the catch, I think I was typing that as fast as I could whilst working...

May 19 '08 #6
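For reference, the frequencies listed in the original question never pair the last word of one sentence with the first word of the next. A rough sketch of that behaviour follows, under some assumptions that are not in the thread: sentences end at '.', '!' or '?', a word is a run of letters, digits or apostrophes, and `collections.Counter` is available (Python 2.7 or later). The name `pair_frequencies` is made up for this example.

```python
import re
from collections import Counter

def pair_frequencies(text):
    """Count "previous word" pairs without crossing sentence ends."""
    counts = Counter()
    # Assumption: a sentence ends at '.', '!' or '?'.
    for sentence in re.split(r'[.!?]+', text):
        words = re.findall(r"[A-Za-z0-9']+", sentence)
        # zip stops at the shorter argument, so no ix + 1 index is needed.
        for previous, word in zip(words, words[1:]):
            counts[previous + ' ' + word] += 1
    return counts

freq = pair_frequencies("Every man has a price. Every woman has a price.")
# Most frequent pairs first, ties in alphabetical order.
for pair, n in sorted(freq.items(), key=lambda item: (-item[1], item[0])):
    print("%d times: %s" % (n, pair))
```

Pairing each word with its predecessor through `zip(words, words[1:])` sidesteps the index arithmetic that caused the IndexError. Case is kept as written here; lowercasing each word first would merge "Every" and "every".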
