[Python] Read a .txt file and analyze it

 P: 1
Hello all,

I'm working on Huffman coding of any .txt file, so first I need to analyse the text file: read it, then count its letters. I need output like this table:

****************************
letter   frequency (how many times the same letter is repeated)   Huffman code (this will come later)
****************************

I started with:

    f = open('test.txt', 'r')    # open test.txt
    for lines in f:
        print lines              # to check that everything works

Can anyone help me?

Dec 4 '10 #1
4 Replies

 P: 30

    def countLetters(line, letter):
        # count how many times letter occurs in line
        ret = 0
        for character in line:
            if character == letter:
                ret += 1
        return ret

    for line in open("file.txt"):
        line = line.strip()
        print line
        print "Has", countLetters(line, "a"), "of letter a."
        print "Has", countLetters(line, "e"), "of letter e."
        print "Has", countLetters(line, "i"), "of letter i."
        print "Has", countLetters(line, "o"), "of letter o."
        print "Has", countLetters(line, "u"), "of letter u."
        print "Has", countLetters(line, "y"), "of letter y."
        print

Dec 5 '10 #2

 P: 8
Modifying Sean's code:

    #!/bin/python3

    def countLetters(line, letter):
        ret = 0
        for character in line:
            if character == letter:
                ret += 1
        return ret

    alphabet = 'abcdefghijklmnopqrstuvwxyz'  # you can automate this using ASCII codes too

    for line in open("file.txt", 'r'):
        line = line.strip()  # remove leading and trailing whitespace
        print(line)
        for letter in alphabet:
            print('Has {0} of letter {1}'.format(countLetters(line, letter), letter))

output:

    gbfchjwshfcjkwhndfnxh;iquw;qemiziqmeuzngyegbfyewgy
    bqgzeqzydglndhlqhd;jkjnenjcejnrcjercbvvbdbggngnjmm
    nmsnvsdmsnfsfcsf>mNmvbmsdbfluaheregnctfbaxzmasqojm
    wqi;htrugttgp
    Has 3 of letter a
    Has 9 of letter b
    Has 7 of letter c
    Has 7 of letter d
    Has 10 of letter e
    Has 9 of letter f
    Has 12 of letter g
    Has 8 of letter h
    Has 4 of letter i
    Has 9 of letter j
    Has 2 of letter k
    Has 3 of letter l
    Has 11 of letter m
    Has 13 of letter n
    Has 1 of letter o
    Has 1 of letter p
    Has 8 of letter q
    Has 4 of letter r
    Has 8 of letter s
    Has 4 of letter t
    Has 4 of letter u
    Has 4 of letter v
    Has 5 of letter w
    Has 2 of letter x
    Has 4 of letter y
    Has 5 of letter z

Dec 27 '10 #3

 P: 2

    #-------------------------------------------------#
    # Set Variables                                   #
    #-------------------------------------------------#

    infile = open("file.txt")  # named infile to avoid shadowing the built-in input()
    whitelist = ('a','b','c','d','e','f','g')  # whitelist of letters
    letters = {}

    #-------------------------------------------------#
    #  Functions                                      #
    #-------------------------------------------------#

    def count_letter(c):
        if c in letters:
            letters[c] += 1  # letter already seen: add one
        else:
            letters[c] = 1   # first occurrence: add the letter to the dictionary

    def print_letters(letters):
        for k, v in letters.items():
            if k in whitelist:
                print "Has %s of letter %s" % (v, k)  # print the count for each whitelisted letter

    #-------------------------------------------------#
    #  Run code                                       #
    #-------------------------------------------------#

    for line in infile:             # for each line in the input file
        for letter in line:         # for each character in the line
            count_letter(letter)    # tally a count of each character

    print_letters(letters)

Here I use a more Pythonic syntax, which means fewer lines of code. If you count everything and then whitelist the characters you're concerned with, your code can be easily modified in the future. Hope this helps!

Dec 31 '10 #4
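The "count everything, then whitelist" idea from post #4 can also be sketched with the standard library's `collections.Counter`. This is only an illustration in Python 3, not code from the thread: the function name `letter_frequencies` and the sample string are made up here, and in practice the text would come from something like `open("file.txt").read()`.

```python
from collections import Counter
import string

def letter_frequencies(text, whitelist=string.ascii_lowercase):
    """Count every character once, then keep only the whitelisted ones."""
    counts = Counter(text)  # tallies all characters, including spaces and punctuation
    # A Counter returns 0 for missing keys, so unseen letters are simply skipped.
    return {ch: counts[ch] for ch in whitelist if counts[ch] > 0}

# Hypothetical sample text standing in for the contents of the file.
sample = "abracadabra"
freqs = letter_frequencies(sample)
for letter, count in sorted(freqs.items()):
    print("Has {0} of letter {1}".format(count, letter))
```

Because the whitelist is just a parameter, switching to uppercase letters or digits only means passing a different string.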

 P: 8
Very neat, Michael.

Dec 31 '10 #5
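The original question also asks for a "Huffman code" column, which none of the replies cover. Below is a rough Python 3 sketch of one common way to derive such codes from a frequency table using a min-heap (`heapq`). The function name `huffman_codes`, the sample string, and the tab-separated table layout are assumptions for illustration. The exact bit patterns depend on how ties are broken, so treat the output as one valid assignment rather than the only one.

```python
import heapq
from collections import Counter

def huffman_codes(freqs):
    """Build a prefix code from a {symbol: count} mapping."""
    if not freqs:
        return {}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    # The unique integer tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {sym: "0" for sym in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Hypothetical sample text standing in for the contents of the file.
freqs = Counter("abracadabra")
codes = huffman_codes(freqs)
print("letter\tfrequency\tHuffman code")
for sym in sorted(freqs):
    print("{0}\t{1}\t{2}".format(sym, freqs[sym], codes[sym]))
```

More frequent letters end up with shorter codes, which is where the frequency table from the earlier posts comes in.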