Bytes IT Community

problem parsing lines in a file

I'm having difficulty getting the following code to work. All I want
to do is remove the '0:00:00' from the end of each line. Here is part
of the original file:

3,3,"Dyspepsia NOS",9/12/2003 0:00:00
4,3,"OA of lower leg",9/12/2003 0:00:00
5,4,"Cholera NOS",9/12/2003 0:00:00
6,4,"Open wound of ear NEC*",9/12/2003 0:00:00
7,4,"Migraine with aura",9/12/2003 0:00:00
8,6,"HTN [Hypertension]",10/15/2003 0:00:00
10,3,"Imerslund syndrome",10/27/2003 0:00:00
12,4,"Juvenile neurosyphilis",11/4/2003 0:00:00
13,4,"Benign paroxysmal positional nystagmus",11/4/2003 0:00:00
14,3,"Salmonella infection, unspecified",11/7/2003 0:00:00
20,3,"Bubonic plague",11/11/2003 0:00:00

output = open('my/path/ProblemListFixed.txt', 'w')
for line in open('my/path/ProblemList.txt', 'r'):
    newline = line.rstrip('0:00:00')
    output.write(newline)
output.close()
This result is a copy of "ProblemList" without any changes made. What
am I doing wrong? Thanks for any help.

Mike
Dec 11 '07 #1
5 Replies


barronmo wrote:
> I'm having difficulty getting the following code to work. All I want
> to do is remove the '0:00:00' from the end of each line. Here is part
> of the original file:
>
> 3,3,"Dyspepsia NOS",9/12/2003 0:00:00
> 4,3,"OA of lower leg",9/12/2003 0:00:00
> 5,4,"Cholera NOS",9/12/2003 0:00:00
> 6,4,"Open wound of ear NEC*",9/12/2003 0:00:00
> 7,4,"Migraine with aura",9/12/2003 0:00:00
> 8,6,"HTN [Hypertension]",10/15/2003 0:00:00
> 10,3,"Imerslund syndrome",10/27/2003 0:00:00
> 12,4,"Juvenile neurosyphilis",11/4/2003 0:00:00
> 13,4,"Benign paroxysmal positional nystagmus",11/4/2003 0:00:00
> 14,3,"Salmonella infection, unspecified",11/7/2003 0:00:00
> 20,3,"Bubonic plague",11/11/2003 0:00:00
>
> output = open('my/path/ProblemListFixed.txt', 'w')
> for line in open('my/path/ProblemList.txt', 'r'):
>     newline = line.rstrip('0:00:00')
>     output.write(newline)
> output.close()
>
> This result is a copy of "ProblemList" without any changes made. What
> am I doing wrong? Thanks for any help.

It kind of works for me, but it actually strips a bit more than you want. Take a
close look at what rstrip _really_ does. Small hint:

print "foobar".rstrip("rab")

Diez
Dec 11 '07 #2
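
To spell out the hint above: the argument to rstrip is treated as a set of characters to strip, not as a suffix to match. The following is a minimal sketch; the variable names are illustrative only, and the sample line is taken from the original post with a trailing newline assumed, since that is what iterating over a file yields.

```python
# rstrip(chars) removes every trailing character that appears in chars,
# in any order and any number of times; it does not look for a suffix.
hint = "foobar".rstrip("rab")   # 'r', 'a' and 'b' are all stripped
same = "foobar".rstrip("bar")   # the order of the characters is irrelevant

# A line read from a file ends with "\n". That character is not in the
# set "0:00:00", so rstrip stops immediately and returns the line as is.
line = '3,3,"Dyspepsia NOS",9/12/2003 0:00:00\n'
unchanged = line.rstrip("0:00:00")

print(hint, same, unchanged == line)   # foo foo True
```

This would explain why the output file was an exact copy of the input.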

On Dec 11, 7:25 pm, barronmo <barro...@gmail.com> wrote:
> I'm having difficulty getting the following code to work. All I want
> to do is remove the '0:00:00' from the end of each line. Here is part
> of the original file:
>
> 3,3,"Dyspepsia NOS",9/12/2003 0:00:00
> 4,3,"OA of lower leg",9/12/2003 0:00:00
> 5,4,"Cholera NOS",9/12/2003 0:00:00
> 6,4,"Open wound of ear NEC*",9/12/2003 0:00:00
> 7,4,"Migraine with aura",9/12/2003 0:00:00
> 8,6,"HTN [Hypertension]",10/15/2003 0:00:00
> 10,3,"Imerslund syndrome",10/27/2003 0:00:00
> 12,4,"Juvenile neurosyphilis",11/4/2003 0:00:00
> 13,4,"Benign paroxysmal positional nystagmus",11/4/2003 0:00:00
> 14,3,"Salmonella infection, unspecified",11/7/2003 0:00:00
> 20,3,"Bubonic plague",11/11/2003 0:00:00
>
> output = open('my/path/ProblemListFixed.txt', 'w')
> for line in open('my/path/ProblemList.txt', 'r'):
>     newline = line.rstrip('0:00:00')
>     output.write(newline)
> output.close()
>
> This result is a copy of "ProblemList" without any changes made. What
> am I doing wrong? Thanks for any help.
>
> Mike

rstrip() won't do what you think it should do.
You could either use .replace('0:00:00','') directly on the input
string or, if it might occur in one of the other elements as well,
just split the line on delimiters.
In the first case you can do:

for line in input_file:
    output_file.write( line.replace('0:00:00','') )

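In the latter case:

for line in input_file:
    tmp = line.split( ',' )
    # The date is the last field; index from the end because a quoted
    # description such as "Salmonella infection, unspecified" contains a comma.
    tmp[-1] = tmp[-1].replace('0:00:00', '')
    output_file.write( ','.join( tmp ) )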

Hope that helps,
Chris
Dec 11 '07 #3
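
A variation on the split approach above, offered as a sketch rather than as the poster's method: cutting the line at the last comma only leaves commas inside the quoted description untouched. The function name drop_time and the default time_text (with a leading space, an assumption based on the sample rows) are hypothetical.

```python
def drop_time(line, time_text=" 0:00:00"):
    """Remove time_text from the last comma-separated field of line."""
    # rpartition cuts at the LAST comma, so a comma inside the quoted
    # description never shifts the position of the date field.
    head, sep, last = line.rpartition(",")
    return head + sep + last.replace(time_text, "")

row = '14,3,"Salmonella infection, unspecified",11/7/2003 0:00:00\n'
print(repr(drop_time(row)))
```

Including the leading space in time_text also removes the blank that would otherwise be left behind after the date, and it keeps a time such as 10:00:00 from being partially matched.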

> This result is a copy of "ProblemList" without any changes made. What
> am I doing wrong? Thanks for any help.

rstrip doesn't work the way you think it does:

>>> help(str.rstrip)
Help on method_descriptor:

rstrip(...)
    S.rstrip([chars]) -> string or unicode

    Return a copy of the string S with trailing whitespace removed.
    If chars is given and not None, remove characters in chars instead.
    If chars is unicode, S will be converted to unicode before stripping

>>> 'abcdef'.rstrip('e')
'abcdef'
>>> 'abcdef'.rstrip('ef')
'abcd'
>>> 'abcdef'.rstrip('efc')
'abcd'
>>> 'abcdef'.rstrip('efcd')
'ab'
>>> '20,3,"Bubonic plague",11/11/2003 0:00:00\n'.rstrip('0:00:00')
'20,3,"Bubonic plague",11/11/2003 0:00:00\n'
>>> '20,3,"Bubonic plague",11/11/2003 0:00:00\n'.rstrip('0:00:00\n')
'20,3,"Bubonic plague",11/11/2003 '
>>> '20,3,"Bubonic plague",11/11/2003 0:00:00\n'.rstrip('0:\n')
'20,3,"Bubonic plague",11/11/2003 '

You probably just want to use slicing though:

>>> '20,3,"Bubonic plague",11/11/2003 0:00:00\n'[:-9]
'20,3,"Bubonic plague",11/11/2003'

But don't forget to re-attach a newline before writing out. That goes
for the first method also.

Matt
Dec 11 '07 #4
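
The slicing idea above can be made a little safer by checking that the line really ends with the expected text before cutting, and by putting the newline back afterwards. This is only a sketch; strip_suffix and SUFFIX are hypothetical names, and the single space before the time is assumed from the sample data.

```python
SUFFIX = " 0:00:00"   # assumed: exactly one space before the time

def strip_suffix(line, suffix=SUFFIX):
    """Remove suffix from the end of line, keeping any trailing newline."""
    body = line.rstrip("\n")       # set the newline aside
    newline = line[len(body):]     # "" for a final line without one
    if body.endswith(suffix):
        body = body[:-len(suffix)] # slice off exactly the suffix
    return body + newline

print(repr(strip_suffix('20,3,"Bubonic plague",11/11/2003 0:00:00\n')))
```

Lines that do not end with the suffix pass through unchanged, which a fixed [:-9] slice would not guarantee.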

barronmo wrote:
> I'm having difficulty getting the following code to work. All I want
> to do is remove the '0:00:00' from the end of each line. Here is part
> of the original file:
>
> 3,3,"Dyspepsia NOS",9/12/2003 0:00:00
> ...
> 20,3,"Bubonic plague",11/11/2003 0:00:00
>
> output = open('my/path/ProblemListFixed.txt', 'w')
> for line in open('my/path/ProblemList.txt', 'r'):
>     newline = line.rstrip('0:00:00')
>     output.write(newline)
> output.close()
>
> This result is a copy of "ProblemList" without any changes made. What
> am I doing wrong? Thanks for any help.

You should feel lucky that it didn't work :-) If you had used
    newline = line.rstrip('0:00:00\n')
you would have found neither the problem nor the solution.
--
Kees

Dec 12 '07 #5
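
To illustrate the point above, here is a small sketch of how rstrip('0:00:00\n') would have appeared to work on the posted data while silently damaging other rows. The second row, with a 10:00:00 time, is a made-up example and does not come from the original file.

```python
# On the posted data the stripping stops at the space before the time,
# so the result looks right (apart from the lost newline).
sample = '8,6,"HTN [Hypertension]",10/15/2003 0:00:00\n'
looks_fine = sample.rstrip("0:00:00\n")

# Hypothetical row with a different time: every trailing '0', ':' and
# '\n' is removed, which eats most of the time and leaves a stray "1".
other = '21,3,"Example",10/15/2003 10:00:00\n'
damaged = other.rstrip("0:00:00\n")

print(repr(looks_fine))
print(repr(damaged))
```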

Thanks everyone. I learned several things on this one. I ended up
using the .replace() method and got the results I wanted.

Thanks again,

Michael Barron
Dec 13 '07 #6
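
For reference, a complete version of the .replace() approach might look like the sketch below. It is not the poster's final script, which was not shown; remove_midnight, its parameters, and the temporary demo files are assumptions made for illustration.

```python
import os
import tempfile

def remove_midnight(src_path, dst_path, time_text=" 0:00:00"):
    """Copy src_path to dst_path, deleting every occurrence of time_text."""
    # The with statement closes both files even if an error occurs.
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:
            # replace() matches the whole substring, unlike rstrip().
            dst.write(line.replace(time_text, ""))

# Demo on throwaway files holding two rows from the original post.
with tempfile.TemporaryDirectory() as folder:
    src_path = os.path.join(folder, "ProblemList.txt")
    dst_path = os.path.join(folder, "ProblemListFixed.txt")
    with open(src_path, "w") as f:
        f.write('3,3,"Dyspepsia NOS",9/12/2003 0:00:00\n'
                '20,3,"Bubonic plague",11/11/2003 0:00:00\n')
    remove_midnight(src_path, dst_path)
    with open(dst_path) as f:
        result = f.read()

print(result)
```

Because replace() leaves the newline alone, nothing needs to be re-attached before writing.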
