473,386 Members | 1,693 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,386 software developers and data experts.

computing uni-gram and bigram probability using python

I have 2 files. I have to calculate the monogram (uni-gram) and at the next step calculate bi-gram probability of the first file in terms of the words repetition of the second file. (the files are text files). for this, first I have to write a function that calculates the number of total words and unique words of the file, because the monogram is calculated by the division of unique word to the total word for each word. and at last write it to a new file. The code I wrote(it's just for computing uni-gram) doesn't work. how can I change it to work correctly? and how can I calculate bi-grams probability?

Expand|Select|Wrap|Line Numbers
  1. def CalculateMonoGram (file1, file2):
  2.     with open (file1, encoding="utf_8") as f1:
  3.         counts={}
  4.         s1=f1.read()
  5.         x1=s1.split()
  6.         for word in x1:
  7.             counts[word]=counts.get(word,0)+1
  8.  
  9.         total=sum(counts.values())            
  10.  
  11.     with open (file2, encoding="utf_8") as f2:
  12.         s2=f2.read()
  13.         x2=s2.split()
  14.  
  15.  
  16.     monogram=[]
  17.     for item in x2:
  18.         monogram[item]=counts(item)/total
  19.  
  20.  
  21.     with open ("LexiconMonogram.txt", "w", encoding="utf_8") as f3:
  22.         f3.write(monogram)
May 18 '15 #1
0 2198

Sign in to post your reply or Sign up for a free account.

Similar topics

15
by: Brandon J. Van Every | last post by:
Is anyone using Python for .NET? I mean Brian's version at Zope, which simply accesses .NET in a one-way fashion from Python. http://www.zope.org/Members/Brian/PythonNet Not the experimental...
4
by: The_Incubator | last post by:
As the subject suggests, I am interested in using Python as a scripting language for a game that is primarily implemented in C++, and I am also interested in using generators in those scripts... ...
8
by: Sridhar R | last post by:
Hi, I am a little experienced python programmer (2 months). I am somewhat experienced in C/C++. I am planning (now in design stage) to write an IDE in python. The IDE will not be a simple...
8
by: Joakim Persson | last post by:
Hello all. I am involved in a project where we have a desire to improve our software testing tools, and I'm in charge of looking for solutions regarding the logging of our software (originating...
29
by: 63q2o4i02 | last post by:
Hi, I'm interested in using python to start writing a CAD program for electrical design. I just got done reading Steven Rubin's book, I've used "real" EDA tools, and I have an MSEE, so I know what...
11
by: efrat | last post by:
Hello, I'm planning to use Python in order to teach a DSA (data structures and algorithms) course in an academic institute. If you could help out with the following questions, I'd sure...
0
by: P. Adhia | last post by:
Hello, I was wondering if anyone is successfully using using Python(2.5)+DB2+pydb2. I get an error in all situations. It seems that this problem might be limited to python 2.5. A quick Google...
9
by: dominiquevalentine | last post by:
Hello, I'm a teen trying to do my part in improving the world, and me and my pal came up with some concepts to improve the transportation system. I have googled up and down for examples of using...
53
by: Vicent Giner | last post by:
Hello. I am new to Python. It seems a very interesting language to me. Its simplicity is very attractive. However, it is usually said that Python is not a compiled but interpreted programming...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.