problem parsing lines in a file

barronmo

I'm having difficulty getting the following code to work. All I want
to do is remove the '0:00:00' from the end of each line. Here is part
of the original file:

3,3,"Dyspepsia NOS",9/12/2003 0:00:00
4,3,"OA of lower leg",9/12/2003 0:00:00
5,4,"Cholera NOS",9/12/2003 0:00:00
6,4,"Open wound of ear NEC*",9/12/2003 0:00:00
7,4,"Migraine with aura",9/12/2003 0:00:00
8,6,"HTN [Hypertension]",10/15/2003 0:00:00
10,3,"Imerslund syndrome",10/27/2003 0:00:00
12,4,"Juvenile neurosyphilis",11/4/2003 0:00:00
13,4,"Benign paroxysmal positional nystagmus",11/4/2003 0:00:00
14,3,"Salmonella infection, unspecified",11/7/2003 0:00:00
20,3,"Bubonic plague",11/11/2003 0:00:00

output = open('my/path/ProblemListFixed.txt', 'w')
for line in open('my/path/ProblemList.txt', 'r'):
    newline = line.rstrip('0:00:00')
    output.write(newline)
output.close()

This result is a copy of "ProblemList" without any changes made. What
am I doing wrong? Thanks for any help.

Mike

Dec 11 '07 #1

Diez B. Roggisch

It kind of works for me, but it actually strips a bit more than you want.
Take a close look at what rstrip _really_ does. Small hint:

print("foobar".rstrip("rab"))

Diez

Dec 11 '07 #2

Chris

rstrip() won't do what you think it should do. You could either use
.replace('0:00:00', '') directly on the input string or, if that text
might occur in one of the other fields as well, split the line on the
delimiter and change only the last field.

In the first case you can do:

for line in input_file:
    output_file.write(line.replace('0:00:00', ''))

In the latter, rather:

for line in input_file:
    tmp = line.split(',')
    # Use the last field rather than tmp[3]: a description such as
    # "Salmonella infection, unspecified" contains a comma of its own.
    tmp[-1] = tmp[-1].replace('0:00:00', '')
    output_file.write(','.join(tmp))
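
Because some descriptions in the sample contain a comma inside the quotes, a plain split(',') can miscount the fields. A minimal sketch of the same idea using the standard csv module (Python 3) is below. The helper name strip_midnight and the in-memory sample are assumptions made for illustration; they are not part of the original posts.

```python
import csv
import io

SUFFIX = ' 0:00:00'

def strip_midnight(rows):
    """Yield each row with a trailing ' 0:00:00' removed from its last field."""
    for row in rows:
        if row and row[-1].endswith(SUFFIX):
            row = row[:-1] + [row[-1][:-len(SUFFIX)]]
        yield row

# Two lines taken from the sample data in the question.
sample = (
    '14,3,"Salmonella infection, unspecified",11/7/2003 0:00:00\n'
    '20,3,"Bubonic plague",11/11/2003 0:00:00\n'
)

reader = csv.reader(io.StringIO(sample))   # understands the quoted comma
out = io.StringIO()
writer = csv.writer(out, lineterminator='\n')
writer.writerows(strip_midnight(reader))
result = out.getvalue()
print(result)
```

Note that csv.writer with its default settings only quotes fields that need it, so "Bubonic plague" is written without quotes while the field containing a comma keeps them. If the original quoting has to be preserved exactly, the plain string approaches in this thread may be the better fit.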

Hope that helps,
Chris

Dec 11 '07 #3

Matimus

This result is a copy of "ProblemList" without any changes made. What
am I doing wrong? Thanks for any help.

rstrip doesn't work the way you think it does:

>>> help(str.rstrip)
Help on method_descriptor:

rstrip(...)
    S.rstrip([chars]) -> string or unicode

    Return a copy of the string S with trailing whitespace removed.
    If chars is given and not None, remove characters in chars instead.
    If chars is unicode, S will be converted to unicode before stripping

>>> 'abcdef'.rstrip('e')
'abcdef'
>>> 'abcdef'.rstrip('ef')
'abcd'
>>> 'abcdef'.rstrip('efc')
'abcd'
>>> 'abcdef'.rstrip('efcd')
'ab'
>>> '20,3,"Bubonic plague",11/11/2003 0:00:00\n'.rstrip('0:00:00')
'20,3,"Bubonic plague",11/11/2003 0:00:00\n'
>>> '20,3,"Bubonic plague",11/11/2003 0:00:00\n'.rstrip('0:00:00\n')
'20,3,"Bubonic plague",11/11/2003 '
>>> '20,3,"Bubonic plague",11/11/2003 0:00:00\n'.rstrip('0:\n')
'20,3,"Bubonic plague",11/11/2003 '

You probably just want to use slicing though:

>>> '20,3,"Bubonic plague",11/11/2003 0:00:00\n'[:-9]
'20,3,"Bubonic plague",11/11/2003'

But don't forget to re-attach a newline before writing out. That goes
for the first method also.
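
The slicing approach with the newline re-attached could look roughly like the sketch below. It assumes the text to drop is exactly ' 0:00:00' (with the leading space) and leaves alone any line that does not end that way; the name drop_suffix is made up for this example.

```python
SUFFIX = ' 0:00:00'

def drop_suffix(line, suffix=SUFFIX):
    """Remove suffix from the end of line, keeping the original line ending."""
    # rstrip is used here as intended: as a set of characters to strip.
    body = line.rstrip('\r\n')
    ending = line[len(body):]
    if body.endswith(suffix):
        body = body[:-len(suffix)]
    return body + ending

fixed = drop_suffix('20,3,"Bubonic plague",11/11/2003 0:00:00\n')
print(repr(fixed))
```

Checking with endswith before slicing means a last line without a trailing newline, or a line with some other time on it, is not cut at the wrong place the way a fixed [:-9] would cut it.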

Matt

Dec 11 '07 #4

Kees Bakker

barronmo wrote:

I'm having difficulty getting the following code to work. All I want
to do is remove the '0:00:00' from the end of each line. Here is part
of the original file:

3,3,"Dyspepsia NOS",9/12/2003 0:00:00
...
20,3,"Bubonic plague",11/11/2003 0:00:00

output = open('my/path/ProblemListFixed.txt', 'w')
for line in open('my/path/ProblemList.txt', 'r'):
    newline = line.rstrip('0:00:00')
    output.write(newline)
output.close()

This result is a copy of "ProblemList" without any changes made. What
am I doing wrong? Thanks for any help.

You should feel lucky that it didn't work :-) If you had used
    newline = line.rstrip('0:00:00\n')
you would have found neither the problem nor the solution.
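
To illustrate why that version would only have appeared to work, here is a small sketch. The second line is hypothetical data invented for the example (a time other than midnight); it does not come from the original file.

```python
chars = '0:00:00\n'   # rstrip treats this as the set {'0', ':', '\n'}

line_a = '20,3,"Bubonic plague",11/11/2003 0:00:00\n'
# Hypothetical line with a non-midnight time, made up for illustration.
line_b = '21,3,"Hypothetical entry",11/11/2003 10:00:00\n'

stripped_a = line_a.rstrip(chars)
stripped_b = line_b.rstrip(chars)

print(repr(stripped_a))
print(repr(stripped_b))
```

On the sample data the result looks right apart from the lost newline and a leftover trailing space, but on the hypothetical line the stripping continues into the time and stops only at the '1'.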
--
Kees

Dec 12 '07 #5

barronmo

Thanks everyone. I learned several things on this one. I ended up
using the .replace() method and got the results I wanted.

Thanks again,

Michael Barron

Dec 13 '07 #6
