Delete spaces

If I have a text file that is delimited by spaces, how do I import it
and convert it to comma-delimited? Here is a row of data from the text file:

1 1 10:55:14 2 65 8.5
1.4+1.1 2.5 Class-2 0

I tried a few examples from the group and they didn't work, since the
file also has a header row and a row of separators ( -------). Each row
is about 130 characters long, so there are extra spaces after the last
value as well. I have tried joining and other things, but I couldn't
figure out how to get the values to come together.
Thanks.

Kou

Sep 28 '07 #1
ko****@hotmail.com wrote:
> If I have a text file that is delimited by spaces,

Spaces or tabs?

This should answer your question - but certainly not solve your problem
(cf below):

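f = open('/path/to/file.txt')
f.readline()  # skip the header row
for line in f:
    # skip the row of separators
    if line.startswith('---'):
        continue
    # split() with no argument splits on runs of whitespace,
    # so the trailing padding produces no empty fields
    parts = line.split()
    print(','.join(parts))

f.close()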
Now the problem is that, obviously, the position of a group of data in a
line is meaningful, so just filtering out spaces isn't the solution.
Did you check whether it's really a tab-delimited file? If it is,
line.split('\t') might help. Or just try the csv module, FWIW.
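
If the file does turn out to be tab-delimited, the csv module route might look something like the sketch below (Python 3). It assumes a header row followed by a row of dashes, as described in the question; the function name tabs_to_csv and the sample text are made up for illustration.

```python
import csv
import io


def tabs_to_csv(src, dst):
    """Copy tab-delimited rows from src to dst as comma-delimited rows.

    Keeps the header row and drops the row of dashes (assumed layout).
    """
    reader = csv.reader(src, delimiter='\t')
    writer = csv.writer(dst, lineterminator='\n')
    for row in reader:
        # Skip blank lines and the row of separators.
        if not row or row[0].lstrip().startswith('---'):
            continue
        # Trailing padding would otherwise end up inside the last field.
        writer.writerow([field.strip() for field in row])


# Hypothetical sample; a real script would pass open file objects instead.
sample = "A\tB\tC\n---\t---\t---\n1\t10:55:14\tClass-2   \n"
out = io.StringIO()
tabs_to_csv(io.StringIO(sample), out)
print(out.getvalue())
```

Unlike a plain join, csv.writer also quotes any field that happens to contain a comma.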

My 2 cents...
Sep 28 '07 #2
After you recognize and handle the header and separator lines, the
remaining lines can be handled this way:
>>> l = '1 1 10:55:14 2 65 8.5'
>>> f = l.split()
>>> print(f)
['1', '1', '10:55:14', '2', '65', '8.5']
>>> ','.join(f)
'1,1,10:55:14,2,65,8.5'

or

>>> ', '.join(f)
'1, 1, 10:55:14, 2, 65, 8.5'

Gary Herron
Sep 28 '07 #3
It would help enormously if you could show us UNAMBIGUOUSLY what is in,
say, the first 3 lines after the headings and separators. Do this:

print(repr(open("thefile", "rb").read()[:400]))

The other thing you need is to know enough about the file format to
show us what is the CSV output that you require from the sample input
-- we don't have crystal balls, and are likely to make half-donkeyed
guesses, like these:

If the spaces are really tabs, use line.split('\t')

Otherwise: the file has fixed column widths, and any use of line.split
will mangle it.

The clumsy way to handle this is to count column positions, and write
something ugly like:
field1 = line[0:8]
field2 = line[8:20]
etc
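
A slightly less ugly variant of the same idea keeps the column positions in one list and slices each line in a loop. The positions and the sample line below are invented for illustration; the real ones have to be counted from the actual file.

```python
# Hypothetical (start, end) positions; count the real ones from your file.
COLUMNS = [(0, 8), (8, 20), (20, 30)]


def slice_fields(line, columns=COLUMNS):
    """Cut one fixed-width line into stripped field values."""
    return [line[start:end].strip() for start, end in columns]


# Build a padded sample line so the widths are unambiguous.
line = "1".ljust(8) + "10:55:14".ljust(12) + "Class-2".ljust(10)
print(','.join(slice_fields(line)))
```

Because each field is cut by position rather than by whitespace, an empty field or a value containing a space stays in its own column.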

"a row of seperators ( -------)" sounds suspiciously like the "column
aligned" format that can be produced by running a SQL query on a SQL
Server database using MS's "Query Analyzer". It looks like this:

RecordType  ID1                  ID2         Description
----------- -------------------- ----------- ----------------------
1           12345678             123456      Widget
4           87654321             654321      Gizmoid
etc

Does your file look something like that? If so, then all you have to
do is take advantage of the fact that the second line has one-space gaps
between each bunch of dashes, and you can write a little module that
will read any file like that, just as though it were a CSV file.
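
In Python, such a module might start out like the sketch below. It assumes the layout described here: a header line, then a line of dash groups separated by one-space gaps, then the data rows. The function names are made up, and the sample is built programmatically so that the column widths are unambiguous.

```python
import re


def column_spans(guide_line):
    """Return (start, end) for each run of dashes in the guide line."""
    return [m.span() for m in re.finditer(r'-+', guide_line)]


def read_aligned(lines):
    """Yield each row of a column-aligned text table as a list of strings.

    Assumes line 1 is the header and line 2 is the row of dashes.
    """
    lines = iter(lines)
    header = next(lines)
    spans = column_spans(next(lines))
    yield [header[a:b].strip() for a, b in spans]
    for line in lines:
        if line.strip():  # ignore blank lines
            yield [line[a:b].strip() for a, b in spans]


# Sample built from the dash widths in the example (11, 20, 11, 22).
widths = [11, 20, 11, 22]
rows = [['RecordType', 'ID1', 'ID2', 'Description'],
        ['1', '12345678', '123456', 'Widget'],
        ['4', '87654321', '654321', 'Gizmoid']]
guide = ' '.join('-' * w for w in widths)
text = [' '.join(v.ljust(w) for v, w in zip(rows[0], widths)), guide]
text += [' '.join(v.ljust(w) for v, w in zip(r, widths)) for r in rows[1:]]

for fields in read_aligned(text):
    print(','.join(fields))
```

For real output it would be safer to hand each list to csv.writer instead of joining with a comma, so that fields containing commas get quoted.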

Over to you ....

Cheers,
John

Sep 29 '07 #4

If the fields are separated by whitespace, this Awk program can
handle the situation:

awk 'BEGIN{OFS=","} {$1=$1} 1' oldfile >newfile

If the fields are fixed-width and the 2nd line is a guide to those
widths, then this Ruby program should work (not optimized for speed):

lines = IO.readlines( 'data2' )
# Dump header.
lines.shift
# Save column guide.
guide = lines.shift.scan( /-+ */ )
for line in lines do
  puts guide.map{|s| line.slice!(0,s.size).strip}.join(",")
end

Sep 30 '07 #5
