TMS 119
New Member
I'm working on the definitive igpay atinlay assignment. I've defined a module with one function, def igpay(word), whose sole purpose is to turn a word into pig latin. It seems to work well.
Now, the 2nd part of the assignment: define a module that takes an argument on the command line (thanks to previous questions, this is complete) and processes the entire file into pig latin.
First I went through some basic tests to process the file: find a space (for the delimiter), find punctuation, and test for a capital. When I put this into a loop, my logic doesn't get past the first word.

import sys
import igpay

filename = sys.argv[1]
data = open(filename).read()
print data

def atinlay(data):
    for i in range(len(data)):  # begin the loop
        space = data.find(" ")  # find the first space
        period = data.find(".")  # determine punctuation to handle later
        comma = data.find(",")
        temp = data[0:space]  # get the first slice
        a = temp[:1]  # copy the first letter to see if it is capital
        capital = a.isupper()
        if capital:
            temp = temp.lower()  # make it lower case if it is capital
        newData = igpay.igpay(temp)  # new temp variable
        if capital:  # capital flag set? Handle it
            a = newData[:1]
            a = a.upper()
            newData = a + newData[1:]  # put new cap back on word
        space += 1  # increment to new space?
        i = space  # increment i?
        temp = newData[space:]  # thought I needed another variable here.... :(
    return newData

c = atinlay(data)
print c

I think part of my problem is the temp assignment at the end. But I could use a gentle nudge to get this loop going because all it will do is process the first word right now.
Thank you
bartonc 6,596
Recognized Expert Expert
I'll look at your code in a bit. In the meantime, something to consider is that file objects themselves are iterable. A text file behaves like a sequence of end-of-line-terminated strings, so you can write:

dataFile = open("filename")
for line in dataFile:
    for word in line.split():  # split() defaults to any whitespace
        print word
dataFile.close()

TMS 119
New Member
But when it's converted to a list, Python sees each word as one unit, not as individual characters like a string. I tried something similar but wasn't able to process the word, so I gave up and went this direction. I appreciate your help.
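On the worry above: each item that split() produces is still an ordinary string, so slicing and string methods keep working on it. A minimal sketch, assuming Python 3 (print is a function there, unlike the Python 2.5 code in this thread); the sample sentence is the test text quoted later in the thread:

```python
line = "NewFile and, more new file."
words = line.split()  # no argument: split on any run of whitespace

for word in words:
    first = word[:1]   # slicing works on each word, exactly as on any string
    rest = word[1:]
    print(first, rest, first.isupper(), len(word))

# Per-character access inside a list item works as well.
first_letters = [word[0] for word in words]
print(first_letters)
```

So the list only groups the words; nothing is lost by splitting first and processing each word afterwards.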
bartonc 6,596
Recognized Expert Expert

import sys
import igpay

filename = sys.argv[1]
data = open(filename).read()
print data

def atinlay(data):
    pos = 0  # need to keep track of where you are in order to "move" through data
    ### you can then add pos to args of find()
    for i in range(len(data)):  # begin the loop
        space = data.find(" ")  # find the first space
        period = data.find(".")  # determine punctuation to handle later
        comma = data.find(",")

        ### data[0] should be data[pos] or you'll always start at the beginning
        #### temp assigned at the end of the loop gets reassigned here ####
        temp = data[0:space]  # get the first slice
        a = temp[:1]  # copy the first letter to see if it is capital
        capital = a.isupper()

        ### you can use str.capitalize() to upper the first letter
        if capital:
            temp = temp.lower()  # make it lower case if it is capital
        newData = igpay.igpay(temp)  # new temp variable
        if capital:  # capital flag set? Handle it
            a = newData[:1]
            a = a.upper()
            newData = a + newData[1:]  # put new cap back on word

        ### This would be your next pos
        pos = space + 1
        # space += 1 #increment to new space?

        ### Even if this works, it's bad practice to change for's variable
        ### I'm not actually sure what happens
        # i = space #increment i?
        ### and besides, I don't see it used anywhere

        temp = newData[space:]  # this will be replaced at the top of the next loop
    return newData

c = atinlay(data)
print c

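The comments about pos can be sketched as follows. This is a minimal illustration, assuming Python 3; the function name split_on_spaces is made up for the example and is not part of anyone's code in the thread. It passes pos as the second argument of str.find() so each search starts where the previous word ended, and it uses a while loop so the position, not a for counter, drives the iteration:

```python
def split_on_spaces(data):
    """Collect the space-separated pieces of data using str.find() and a moving position."""
    words = []
    pos = 0
    while pos < len(data):
        space = data.find(" ", pos)  # search from pos, not from the start of data
        if space == -1:              # find() returns -1 when no space is left
            space = len(data)        # so the rest of the string is the last word
        if space > pos:              # skip empty pieces caused by repeated spaces
            words.append(data[pos:space])
        pos = space + 1              # continue just past the space
    return words


print(split_on_spaces("NewFile and, more new file."))
```

The -1 check matters: without it, the final slice would be data[pos:-1], which silently drops the last character of the text.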
TMS 119
New Member
I am still only getting one word processed (at least printed).
The text file I'm working with is nonsensical, intended for testing. The result I get is this:
C:\Python25>atinlay.py someFile.txt
NewFile and, more new file.
Ewfilenay
C:\Python25>
It is capitalizing appropriately the first letter of the only word it processes. This is the same problem I was having before. Any ideas? :)
bartonc 6,596
Recognized Expert Expert
This'll get you started. Create an empty list, then append() to it and return the list.

import sys
import igpay

filename = sys.argv[1]
data = open(filename).read()
print data

def atinlay(data):
    pos = 0  # need to keep track of where you are in order to "move" through data
    resultList = []
    ### you can then add pos to args of find()
    for i in range(len(data)):  # begin the loop
        space = data.find(" ")  # find the first space
        period = data.find(".")  # determine punctuation to handle later
        comma = data.find(",")

        ### data[0] should be data[pos] or you'll always start at the beginning
        #### temp assigned at the end of the loop gets reassigned here ####
        temp = data[0:space]  # get the first slice
        a = temp[:1]  # copy the first letter to see if it is capital
        capital = a.isupper()

        ### you can use str.capitalize() to upper the first letter
        if capital:
            temp = temp.lower()  # make it lower case if it is capital
        newData = igpay.igpay(temp)  # new temp variable
        if capital:  # capital flag set? Handle it
            a = newData[:1]
            a = a.upper()
            newData = a + newData[1:]  # put new cap back on word

        ### This would be your next pos
        pos = space + 1
        # space += 1 #increment to new space?

        ### Even if this works, it's bad practice to change for's variable
        ### I'm not actually sure what happens
        # i = space #increment i?
        ### and besides, I don't see it used anywhere

        resultList.append(newData + " ")
        # temp = newData[space:] # this will be replaced at the top of the next loop
    return resultList

c = atinlay(data)
print c

dshimer 136
Recognized Expert New Member
I haven't studied every snippet of code, but based on what I understand, can I interject something from another direction? As I understand it, igpay() is supposed to take any word you send it and convert it to the new string, and atinlay() should read through a whole file of text, converting each word and capitalizing it if the word falls after a period (or is already capitalized).
Could it possibly be easier to:
1) Write igpay so that if you send it a properly capitalized word, it translates it to a properly capitalized word in the new language, and if you send it a word that has punctuation at the end, it returns the translated word with the same punctuation.
Then..
2) Just take the whole data stream, split at whitespace (which will keep the punctuation with the word preceding it), and process it in a linear fashion from beginning to end. It could even test each word so that if a period is found in it, the next word is capitalized before being sent to igpay().
It seems that if it were approached from this direction, igpay() would need a couple more lines, but atinlay() would just be a simple:
read data
split it
for each word in that list
convert it and append to the output capitalizing if the previous word contained a period.
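The outline above can be sketched as follows, assuming Python 3. The igpay() here is a hypothetical stand-in, since the poster's real igpay module is never shown in the thread; it only handles a single trailing period or comma, which is enough to show the flow:

```python
VOWELS = "aeiou"


def igpay(word):
    """Hypothetical stand-in for the poster's igpay.igpay(), which the thread never shows.

    Lowercases the word, converts it, and keeps one trailing '.' or ',' in place.
    """
    tail = word[-1] if word[-1:] in (".", ",") else ""
    core = word[:len(word) - len(tail)].lower()
    # Index of the first vowel; 0 if the word starts with a vowel or has none.
    first_vowel = next((i for i, ch in enumerate(core) if ch in VOWELS), 0)
    if first_vowel == 0:
        return core + "way" + tail
    return core[first_vowel:] + core[:first_vowel] + "ay" + tail


def atinlay(text):
    """Split the text, convert each word, and capitalize after a period."""
    out = []
    start_of_sentence = True  # the very first word begins a sentence
    for word in text.split():
        new_word = igpay(word)
        if start_of_sentence or word[:1].isupper():
            new_word = new_word.capitalize()
        out.append(new_word)
        start_of_sentence = "." in word  # the next word follows a period
    return " ".join(out)


print(atinlay("NewFile and, more new file. then more."))
# prints: Ewfilenay andway, oremay ewnay ilefay. Enthay oremay.
```

One trade-off of this sketch: split() followed by " ".join() collapses newlines and repeated spaces into single spaces, so the original line layout is not preserved.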
TMS 119
New Member
OK, so, now I've changed igpay to do capitalization. The same problem still remains. I can't seem to process the list. It only does one word, the first word. If I use split() it will be made into a list (the text file) and I won't be able to process it because it will no longer be a string. Is there a way to change it back into a string after making it a list? Lists are tuples, right?
Here is my code:

import sys
import igpay

filename = sys.argv[1]
data = open(filename).read()
print data

def atinlay(data):
    pos = 0  # begin position
    for i in range(len(data)):  # begin loop
        space = data.find(" ")  # find space for delimiter
        #period = data.find(".") # set a flag for punctuation period
        #comma = data.find(",") # set a flag for punctuation comma
        temp = data[0:space]  # slice the first word
        newData = igpay.igpay(temp)  # place to put processed words
        pos = space + 1
        temp = newData[space:]
    return newData

c = atinlay(data)
print c

bvdet 2,851
Recognized Expert Moderator Specialist
Lists and tuples are similar but different. Lists are mutable and tuples are not. To make a string from a list:

>>> lst = ['I', 'am', 'a', 'detailer']
>>> " ".join(lst)
'I am a detailer'
>>>
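To make the list/tuple difference and the string round trip concrete, here is a small sketch, assuming Python 3. The variable names are only for illustration:

```python
words = "I am a detailer".split()  # str -> list of str
words[3] = "programmer"            # lists are mutable: an item can be replaced
sentence = " ".join(words)         # list of str -> str

frozen = tuple(words)              # a tuple holds the same items but is immutable
try:
    frozen[0] = "You"
except TypeError as err:
    print("tuples cannot be changed:", err)

print(sentence)
```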
It looks like you are only processing the first word in each loop. I do not see where you are accumulating an output string. You could do something like this, but your igpay function would need to handle the capitalization and punctuation:

def process_file(fn):
    f = open(fn)
    outStr = ""
    for line in f:
        lineList = line.split(" ")  # split on space character
        lineListOut = []
        for word in lineList:
            lineListOut.append(igpay.igpay(word))
        outStr += " ".join(lineListOut)
    print outStr

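A variant of the same idea, as a minimal sketch assuming Python 3. Two things differ from the function above, and both are choices made for this example rather than anything stated in the thread: it calls split() with no argument, because split(" ") leaves the newline attached to the last word of each line, and it opens the file in a with block so the file is closed automatically. The igpay() stub is a hypothetical stand-in that only handles plain lowercase words:

```python
import os
import tempfile


def igpay(word):
    """Hypothetical stand-in for igpay.igpay(); plain lowercase words only."""
    for i, ch in enumerate(word):
        if ch in "aeiou":
            return word + "way" if i == 0 else word[i:] + word[:i] + "ay"
    return word + "ay"  # no vowel found


def process_file(fn, convert=igpay):
    """Convert every word in the file, keeping one output line per input line."""
    out_lines = []
    with open(fn) as f:  # the with block closes the file on exit
        for line in f:
            words = line.split()  # no argument: also drops the trailing newline
            out_lines.append(" ".join(convert(w) for w in words))
    return "\n".join(out_lines)


# Usage with a throwaway file so the example runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("new file\nmore text\n")
result = process_file(tmp.name)
os.remove(tmp.name)
print(result)
```

Returning the string instead of printing it inside the function lets the caller decide whether to print it or write it to another file.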
HTH
dshimer 136
Recognized Expert New Member
You could do something like this, but your igpay function would need to handle the capitalization and punctuation:
Very clean. Now the for loop is simply doing what it does best, working through the sequence of words, and since split() should send in the punctuation along with the single word that precedes it, "handling" it could be as simple as...
Test if it's there.
If so remove it.
Process the string.
Replace punctuation and return.
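Those four steps can be sketched as a small wrapper, assuming Python 3. Both function names are made up for illustration, the igpay() stub is a hypothetical stand-in for the poster's function, and capitalization is deliberately left out to keep the focus on punctuation:

```python
import string


def igpay(word):
    """Hypothetical stand-in for igpay.igpay(); plain lowercase words only."""
    for i, ch in enumerate(word):
        if ch in "aeiou":
            return word + "way" if i == 0 else word[i:] + word[:i] + "ay"
    return word + "ay"  # no vowel found


def igpay_with_punctuation(word, convert=igpay):
    """Strip trailing punctuation, convert the bare word, then put it back."""
    core = word.rstrip(string.punctuation)  # test for and remove trailing punctuation
    tail = word[len(core):]                 # whatever was removed, possibly ""
    if not core:                            # the "word" was nothing but punctuation
        return word
    return convert(core) + tail             # process the string, restore punctuation


for sample in ["file.", "and,", "more", "..."]:
    print(sample, "->", igpay_with_punctuation(sample))
```

Leading punctuation such as an opening quote is not handled here; lstrip() could be used the same way if the input needs it.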
TMS 119
New Member
Wow... very nice. It processes the list, but appends a bunch of stuff. The list after processing looks like this:
NewaywFiwaylewayway wayawayd,way wyamovwayrewayway waynewaywway wayfiwayleway.
LOL, it's an entirely new language. I should name it...
I need to go through and see what is happening, but it is processing the list. I think I can handle it from here. Thank you!
dshimer 136
Recognized Expert New Member
One more thing: it blows me away the little things you can miss as you go along. In case you didn't see it, look at the last few posts in the "how to convert gpr file to csv format: using python" thread. The fileinput tip is worth the price of admission all by itself, and I had never looked at it before ghostdog74 mentioned it.
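The tip itself lives in the other thread and is not reproduced here. As a rough sketch of what the standard-library fileinput module offers, assuming Python 3 and assuming this is the kind of use that was meant: fileinput.input() yields lines from one or more files in sequence, and when called with no arguments it reads the files named in sys.argv[1:]. The helper name words_from_files is made up for the example:

```python
import fileinput
import os
import tempfile


def words_from_files(paths):
    """Return every whitespace-separated word from the given files, in order."""
    words = []
    # Used as a context manager, fileinput closes the files when the block ends.
    with fileinput.input(files=paths) as lines:
        for line in lines:
            words.extend(line.split())
    return words


# Usage with a throwaway file so the example runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("NewFile and, more\nnew file.\n")
found = words_from_files([tmp.name])
os.remove(tmp.name)
print(found)
```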
bvdet 2,851
Recognized Expert Moderator Specialist
The fileinput was new to me also. Good tip. One more thing: it is good practice to close each file you open when you are through with it.
TMS 119
New Member
It's all done. Thank you so much for your help. It works very well, thanks to all your help!
bartonc 6,596
Recognized Expert Expert
Thanks for the update. I'm glad the experts here were of help to you. Keep posting.