For loop headache

TMS
100+
P: 119
I'm working on the definitive igpay atinlay assignment. I've defined a module that has one function, igpay(word), whose sole purpose is to process a word into pig latin. It seems to work well.

Now, the 2nd part of the assignment: Define a module that takes an argument on the command line (thanks to previous questions, this is complete) and processes the entire file into pig latin.

First I went through some basic tests to process the file: Find a space (for the delimiter), find punctuation and test for a capital. When I put this into a loop my logic doesn't seem to get it past the first word.

    import sys
    import igpay

    filename = sys.argv[1]
    data = open(filename).read()
    print data

    def atinlay(data):
        for i in range(len(data)):         # begin the loop
            space = data.find(" ")         # find the first space
            period = data.find(".")        # determine punctuation to handle later
            comma = data.find(",")
            temp = data[0:space]           # get the first slice
            a = temp[:1]                   # copy the first letter to see if it is capital
            capital = a.isupper()
            if capital:
                temp = temp.lower()        # make it lower case if it is capital
            newData = igpay.igpay(temp)    # new temp variable
            if capital:                    # capital flag set? Handle it
                a = newData[:1]
                a = a.upper()
            newData = a + newData[1:]      # put new cap back on word
            space += 1                     # increment to new space?
            i = space                      # increment i?
            temp = newData[space:]         # thought I needed another variable here.... :(
        return newData

    c = atinlay(data)
    print c

I think part of my problem is the temp assignment at the end. But I could use a gentle nudge to get this loop going because all it will do is process the first word right now.

Thank you
Jan 17 '07 #1
14 Replies


bartonc
Expert 5K+
P: 6,596
I'll look at your code in a bit. In the meantime, something to consider is that file objects are themselves iterable: looping over a text file yields one line at a time, each ending with its end-of-line character. So you can write:
    dataFile = open("filename")
    for line in dataFile:
        for word in line.split():    # split() defaults to any whitespace
            print word
    dataFile.close()
Jan 17 '07 #2

TMS
100+
P: 119
But when it's converted to a list, Python sees each word as one unit, not as individual characters like a string. I tried something similar but wasn't able to process the word, so I gave up and went this direction. I appreciate your help.
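For what it is worth, the items that split() returns are ordinary strings, so each word can still be indexed and sliced character by character. A minimal sketch, using an assumed sample sentence and written so that it also runs on Python 3:

```python
# Each element produced by split() is itself a str, so it can be
# indexed and sliced like any other string.
line = "NewFile and, more new file."   # assumed sample input
words = line.split()

first = words[0]
first_letter = first[0]     # 'N'
rest = first[1:]            # 'ewFile'

# Rotate the first letter of every word to the end, one word at a time.
rotated = [w[1:] + w[0] for w in words]
print(rotated)
```

So a loop over the list can hand each word to igpay() exactly as if it had been sliced out of the original string.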
Jan 17 '07 #3

bartonc
Expert 5K+
P: 6,596
    import sys
    import igpay

    filename = sys.argv[1]
    data = open(filename).read()
    print data

    def atinlay(data):
        pos = 0    # need to keep track of where you are in order to "move" through data
    ### you can then add pos to the args of find()
        for i in range(len(data)):         # begin the loop
            space = data.find(" ")         # find the first space
            period = data.find(".")        # determine punctuation to handle later
            comma = data.find(",")

    ### data[0] should be data[pos] or you'll always start at the beginning
            #### temp assigned at the end of the loop gets reassigned here ####
            temp = data[0:space]           # get the first slice
            a = temp[:1]                   # copy the first letter to see if it is capital
            capital = a.isupper()

    ### you can use str.capitalize() to upper the first letter
            if capital:
                temp = temp.lower()        # make it lower case if it is capital
            newData = igpay.igpay(temp)    # new temp variable
            if capital:                    # capital flag set? Handle it
                a = newData[:1]
                a = a.upper()
            newData = a + newData[1:]      # put new cap back on word

    ### This would be your next pos
            pos = space + 1
    #        space += 1                    # increment to new space?

    ### Even if this works, it's bad practice to change the for loop's variable
    ### I'm not actually sure what happens
    #        i = space                     # increment i?
    ### and besides, I don't see it used anywhere

            temp = newData[space:]         # this will be replaced at the top of the next loop
        return newData

    c = atinlay(data)
    print c

Jan 17 '07 #4

TMS
100+
P: 119
I am still only getting one word processed (at least printed).

The text file I'm working with is nonsensical, intended for testing. The result I get is this:

C:\Python25>atinlay.py someFile.txt
NewFile and, more new file.

Ewfilenay

C:\Python25>

It is capitalizing appropriately the first letter of the only word it processes. This is the same problem I was having before. Any ideas? :)
Jan 17 '07 #5

bartonc
Expert 5K+
P: 6,596
This'll get you started. Create an empty list, then append() to it and return the list.
    import sys
    import igpay

    filename = sys.argv[1]
    data = open(filename).read()
    print data

    def atinlay(data):
        pos = 0    # need to keep track of where you are in order to "move" through data
        resultList = []
    ### you can then add pos to the args of find()
        for i in range(len(data)):         # begin the loop
            space = data.find(" ")         # find the first space
            period = data.find(".")        # determine punctuation to handle later
            comma = data.find(",")

    ### data[0] should be data[pos] or you'll always start at the beginning
            #### temp assigned at the end of the loop gets reassigned here ####
            temp = data[0:space]           # get the first slice
            a = temp[:1]                   # copy the first letter to see if it is capital
            capital = a.isupper()

    ### you can use str.capitalize() to upper the first letter
            if capital:
                temp = temp.lower()        # make it lower case if it is capital
            newData = igpay.igpay(temp)    # new temp variable
            if capital:                    # capital flag set? Handle it
                a = newData[:1]
                a = a.upper()
            newData = a + newData[1:]      # put new cap back on word

    ### This would be your next pos
            pos = space + 1
    #        space += 1                    # increment to new space?

    ### Even if this works, it's bad practice to change the for loop's variable
    ### I'm not actually sure what happens
    #        i = space                     # increment i?
    ### and besides, I don't see it used anywhere

            resultList.append(newData + " ")
    #        temp = newData[space:]        # this will be replaced at the top of the next loop
        return resultList

    c = atinlay(data)
    print c

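The hints in the comments above (keep a running pos, pass it to find(), and slice from pos rather than from 0) can be sketched roughly as follows. This is only an illustration: igpay() here is a hypothetical stand-in because the poster's real function is not shown, and a while loop is swapped in for the original for loop over range(len(data)).

```python
def igpay(word):
    # Hypothetical stand-in for igpay.igpay(): move the first letter to
    # the end and add "ay". The real function is not shown in the thread.
    if not word:
        return word
    return word[1:] + word[0] + "ay"


def atinlay(data):
    pos = 0                            # index where the next word starts
    result = []
    while pos < len(data):
        space = data.find(" ", pos)    # search from pos, not from the start
        if space == -1:                # no more spaces: take the rest
            space = len(data)
        word = data[pos:space]
        if word:                       # skip empty slices from double spaces
            result.append(igpay(word))
        pos = space + 1                # step past the space just found
    return " ".join(result)


print(atinlay("new file more"))
```

Because find() is told where to start, each pass picks up the next word instead of finding the first space again.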
Jan 17 '07 #6

dshimer
Expert 100+
P: 136
I haven't studied every snip of code, but based on what I understand, can I interject something as seen from another direction. As I understand it igpay() is supposed to take any word you send it and convert to the new string, and atinlay() should read through a whole file of text converting each word and capitalizing if the word falls after a period (or is already capitalized).

Could it possibly be easier to

1) write igpay so that if you send it a properly capitalized word it translates it to a properly capitalized word in the new language, or if you send it a word that has punctuation at the end it returns the translated word with the same punctuation.

Then..

2) Just take the whole data stream, split it at whitespace (which keeps the punctuation with the word preceding it), and process it linearly from beginning to end. It could even test each word so that if a period is found in it, the next word is capitalized before being sent to igpay().

It seems that if it were approached from this direction, igpay() would need a couple more lines, but atinlay() would just be a simple..

read data
split it
for each word in that list
convert it and append to the output capitalizing if the previous word contained a period.
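
The outline above might look something like this. It is a sketch under assumptions: igpay() is a hypothetical stand-in (the poster's real function is not shown) that keeps a trailing period or comma in place, and a word is capitalized if it follows a period or was capitalized in the input.

```python
def igpay(word):
    # Hypothetical stand-in for the poster's igpay(): rotate the first
    # letter to the end, add "ay", and keep a trailing "." or "," in place.
    core = word.rstrip(".,")
    if not core:
        return word
    return core[1:] + core[0] + "ay" + word[len(core):]


def atinlay(text):
    out = []
    after_period = False                # did the previous word contain a period?
    for word in text.split():           # punctuation stays attached to its word
        converted = igpay(word.lower())
        if after_period or word[:1].isupper():
            converted = converted.capitalize()
        out.append(converted)
        after_period = "." in word
    return " ".join(out)


print(atinlay("NewFile and, more new file."))
```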
Jan 17 '07 #7

TMS
100+
P: 119
OK, so, now I've changed igpay to do capitalization. The same problem still remains. I can't seem to process the list. It only does one word, the first word. If I use split() it will be made into a list (the text file) and I won't be able to process it because it will no longer be a string. Is there a way to change it back into a string after making it a list? Lists are tuples, right?

Here is my code:
    import sys
    import igpay

    filename = sys.argv[1]
    data = open(filename).read()
    print data

    def atinlay(data):
        pos = 0                            # begin position
        for i in range(len(data)):         # begin loop
            space = data.find(" ")         # find space for delimiter
            #period = data.find(".")       # set a flag for punctuation period
            #comma = data.find(",")        # set a flag for punctuation comma
            temp = data[0:space]           # slice the first word
            newData = igpay.igpay(temp)    # place to put processed words
            pos = space + 1
            temp = newData[space:]
        return newData

    c = atinlay(data)
    print c

Jan 17 '07 #8

bvdet
Expert Mod 2.5K+
P: 2,851
Lists and tuples are similar but different. Lists are mutable and tuples are not. To make a string from a list:
    >>> lst = ['I', 'am', 'a', 'detailer']
    >>> " ".join(lst)
    'I am a detailer'
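
A small sketch of both points, the list/tuple difference and the round trip between a string and a list, using the same sample sentence (the variable names are only illustrative):

```python
words = "I am a detailer".split()    # str -> list of str
words[3] = "developer"               # lists are mutable: item assignment works
sentence = " ".join(words)           # list of str -> str again

frozen = tuple(words)                # same items, but immutable
tuple_rejected_change = False
try:
    frozen[3] = "detailer"           # tuples do not allow item assignment
except TypeError:
    tuple_rejected_change = True

print(sentence)
```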
It looks like you are only processing the first word in each loop. I do not see where you are accumulating an output string. You could do something like this, but your igpay function would need to handle the capitalization and punctuation:
    def process_file(fn):
        f = open(fn)
        outStr = ""
        for line in f:
            lineList = line.split(" ")    # split on space character
            lineListOut = []
            for word in lineList:
                lineListOut.append(igpay.igpay(word))
            outStr += " ".join(lineListOut)
        print outStr
HTH
Jan 17 '07 #9

dshimer
Expert 100+
P: 136
Very clean. Now the for loop is simply doing what it does best: working through the sequence of words. Since split() should send in the punctuation along with the single word that precedes it, "handling" it could be as simple as:
Test if it's there.
If so, remove it.
Process the string.
Replace the punctuation and return.
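
Those four steps could be sketched like this. The igpay() below is a hypothetical stand-in, since the real one is not shown in the thread, and the wrapper name igpay_keep_punctuation is made up for illustration; it only deals with punctuation at the end of a word.

```python
import string


def igpay(word):
    # Hypothetical stand-in for the poster's igpay(): move the first
    # letter to the end and add "ay". Expects a non-empty word.
    return word[1:] + word[0] + "ay"


def igpay_keep_punctuation(word):
    core = word.rstrip(string.punctuation)   # test for and remove trailing punctuation
    tail = word[len(core):]                  # whatever was stripped ("" if nothing)
    if not core:                             # the "word" was punctuation only
        return word
    return igpay(core) + tail                # process, then put the punctuation back


print(igpay_keep_punctuation("file."))
```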
Jan 17 '07 #10

TMS
100+
P: 119
Wow... very nice. It processes the list, but appends a bunch of stuff. The list after processing looks like this:

NewaywFiwaylewayway wayawayd,way wyamovwayrewayway waynewaywway wayfiwayleway.

LOL, it's an entirely new language. I should name it...

I need to go through and see what is happening, but it is processing the list. I think I can handle it from here. Thank you!
Jan 17 '07 #11

dshimer
Expert 100+
P: 136
One more thing, it blows me away the little things you can miss as you go along. In case you didn't see it, look at the last few posts in the
how to convert gpr file to csv format: using python
thread. The fileinput tip is worth the price of admission all by itself and I had never looked at it before ghostdog74 mentioned it.
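
For readers who have not seen it, a minimal sketch of the standard-library fileinput module follows. The temporary sample file and its contents are assumptions made so the example is self-contained; in a command-line script, calling fileinput.input() with no arguments reads the files named in sys.argv[1:] instead.

```python
import fileinput
import os
import tempfile

# Write a small sample file so the sketch is self-contained (assumed content).
handle, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(handle, "w") as sample:
    sample.write("NewFile and, more new file.\nsecond line here\n")

words = []
# fileinput.input() iterates over the lines of the given file(s).
for line in fileinput.input(path):
    words.extend(line.split())
fileinput.close()

os.remove(path)    # clean up the sample file
print(words)
```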

Jan 17 '07 #12

bvdet
Expert Mod 2.5K+
P: 2,851
The fileinput module was new to me also. Good tip. One more thing: it is good practice to close each file you open when you are through with it:
    f.close()
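
As a side note, a with block closes the file automatically, even if an error occurs inside it. The statement is available by default from Python 2.6 onward; on Python 2.5 it needs `from __future__ import with_statement`. A minimal sketch, with a temporary file and contents that are only assumptions for the example:

```python
import os
import tempfile

# Create a small sample file to read back (assumed content).
handle, path = tempfile.mkstemp(suffix=".txt")
os.close(handle)
with open(path, "w") as out:
    out.write("more new file\n")

# The file is closed as soon as the with block ends.
with open(path) as f:
    lines = [line.rstrip("\n") for line in f]

file_was_closed = f.closed    # True: no explicit f.close() was needed
os.remove(path)
print(lines)
```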
Jan 17 '07 #13

TMS
100+
P: 119
It's all done. Thank you so much for your help. It works very well, thanks to all your help!
Jan 18 '07 #14

bartonc
Expert 5K+
P: 6,596
Thanks for the update. I'm glad the experts here were of help to you. Keep posting.
Jan 18 '07 #15
