How to read space separated file in python?

Hi all,

I want to read a mapping file that is used to map characters from a TTF font
to Unicode.

For example, the map file contains data in the following way:

0 ०
1 १
2 २
3 ३
4 ४
5 ५
6 ६
7 ७
8 ८
9 ९

Like this. Please use any Unicode editor to view the text if it is not shown
properly.

Now I want to read both characters separately, like:

str[0]=0 and str2[0]=०

How can I do this?

Please give me a solution.

Regards,
Ginovation
Nov 21 '08 #1
On Fri, 21 Nov 2008 14:16:13 +0530, ganesh gajre wrote:
> [...]
> How can I do this?
>
> Please give me a solution.

Well, because you said please...

I assume the encoding of the second column is utf-8. You need something
like this:
# Untested.
column0 = []
column1 = []
for line in open('somefile', 'r'):
    a, b = line.split()
    column0.append(a)
    # Python 2: decode the raw bytes into a unicode string
    column1.append(b.decode('utf-8'))
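
For readers on Python 3, a rough equivalent of the loop above might look like the sketch below. It assumes the file is UTF-8, as Steven does; the function name `read_columns` and the temporary sample file are made up for illustration and are not from the thread.

```python
import os
import tempfile


def read_columns(path):
    """Return two lists: the first and second whitespace-separated fields."""
    column0, column1 = [], []
    # In Python 3, open() decodes the text when given an encoding,
    # so no explicit .decode() call is needed.
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # skip blank lines
            a, b = line.split()
            column0.append(a)
            column1.append(b)
    return column0, column1


# Throwaway file holding the first rows of the thread's sample data.
sample = "0 ०\n1 १\n2 २\n"
with tempfile.NamedTemporaryFile("w", encoding="utf-8", suffix=".txt",
                                 delete=False) as tmp:
    tmp.write(sample)
column0, column1 = read_columns(tmp.name)
os.remove(tmp.name)
print(column0)
print(column1)
```

Like the original, this still raises ValueError on any line that does not contain exactly two fields.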

--
Steven
Nov 21 '08 #2
ganesh gajre wrote:
> [...]
> How can I do this?
>
> Please give me a solution.

Read the file:

>>> import codecs
>>> pairs = [line.split() for line in codecs.open("ganesh.txt",
...     encoding="utf-8")]
>>> pairs[0]
[u'0', u'\u0966']

Create the conversion dictionary:

>>> trans = dict((ord(s), t) for s, t in pairs)

Do the translation:

>>> print u"01109876".translate(trans)
०११०९८७६

You may have to use int(s) instead of ord(s) in your actual conversion code:

>>> trans = dict((int(s), t) for s, t in pairs)
>>> print u"\x00\x01\x09".translate(trans)
०१९
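
The session above is Python 2. A minimal Python 3 sketch of the same translate idea could look like this; `build_translation` is a hypothetical helper name, and the pairs are generated in memory here instead of being read from the map file.

```python
def build_translation(pairs):
    """Build a table for str.translate from (source, target) string pairs."""
    # str.translate looks characters up by code point, hence ord().
    return {ord(source): target for source, target in pairs}


# Hypothetical in-memory pairs standing in for the parsed map file:
# ASCII digits 0-9 mapped to the Devanagari digits U+0966..U+096F.
pairs = [(str(i), chr(0x0966 + i)) for i in range(10)]
trans = build_translation(pairs)
converted = "01109876".translate(trans)
print(converted)
```

Characters that have no entry in the table pass through translate unchanged.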

Peter
Nov 21 '08 #3
On Nov 21, 2008, at 2:08 AM, Steven D'Aprano wrote:
> a, b = line.split()

Note that in a case like this, you may want to consider using
partition instead of split:

a, sep, b = line.partition(' ')

This way, if there happens to be more than one space (for example,
because the Unicode character you're mapping to happens to be a
space), it'll still work. It also better encodes the intention, which
is to split only on the first space in the line, rather than on every
space.

(It so happens I ran into exactly this issue yesterday, though my
delimiter was a colon.)

Cheers,
- Joe

Nov 21 '08 #4
Joe Strout wrote:
> Note that in a case like this, you may want to consider using partition
> instead of split:
>
> a, sep, b = line.partition(' ')
>
> This way, if there happens to be more than one space (for example,
> because the Unicode character you're mapping to happens to be a space),
> it'll still work.
> [...]

Joe:

In the special case of the None first argument (the default for the
str.split() method) runs of whitespace *are* treated as single
delimiters. So line.split() is not the same as line.split(' ').
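
A quick way to see the difference Steve describes (the sample line here is invented for illustration):

```python
line = "7   x"  # three spaces between the two fields

# No argument (sep=None): a run of whitespace counts as one delimiter.
default_split = line.split()

# Explicit single space: every space is a delimiter, so the extra
# spaces show up as empty strings in the result.
space_split = line.split(" ")

print(default_split)
print(space_split)
```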

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Nov 21 '08 #5
On Nov 21, 2008, at 9:00 AM, Steve Holden wrote:
> [...]
> In the special case of the None first argument (the default for the
> str.split() method) runs of whitespace *are* treated as single
> delimiters. So line.split() is not the same as line.split(' ').

Right -- so using split() gives you the wrong answer for two different
reasons. Try these:

>>> line = "1 x"
>>> a, b = line.split()   # b == "x", which is correct
>>> line = "2  "
>>> a, b = line.split()   # correct answer would be b == " "
ValueError: need more than 1 value to unpack
>>> line = "3 x and here is some extra stuff"
>>> a, b = line.split()   # correct answer would be b == "x and here is some extra stuff"
ValueError: too many values to unpack

Partition handles these cases correctly (at least, within the OP's
specification that the value of "b" should be whatever comes after the
first space).
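
As a rough sketch of the partition approach on those same three lines (the helper name `parse_line` is an assumption, and the second line is taken to hold two spaces: the delimiter plus a space as the mapped value):

```python
def parse_line(line):
    """Split on the first space only; the value may itself contain spaces."""
    # Strip only the newline so a value consisting of a space survives.
    key, sep, value = line.rstrip("\n").partition(" ")
    return key, value


lines = ["1 x\n", "2  \n", "3 x and here is some extra stuff\n"]
results = [parse_line(text) for text in lines]
for key, value in results:
    print(repr(key), repr(value))
```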

Cheers,
- Joe

Nov 21 '08 #6
On Fri, 21 Nov 2008 14:13:23 -0200, Joe Strout <jo*@strout.net> wrote:
> [...]
> Partition handles these cases correctly (at least, within the OP's
> specification that the value of "b" should be whatever comes after the
> first space).

split takes an additional argument too:

py> line = "3 x and here is some extra stuff"
py> a, b = line.split(None, 1)
py> a
'3'
py> b
'x and here is some extra stuff'

But it still fails if the line contains no spaces. partition is more
robust in those cases
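
To illustrate the no-spaces case Gabriel mentions, here is a small sketch with an invented input line:

```python
line = "nospaces"

# partition never raises: when the separator is missing it returns
# the whole string followed by two empty strings.
key, sep, value = line.partition(" ")

# split with maxsplit=1 returns a one-item list here, so unpacking
# it into two names raises ValueError.
try:
    a, b = line.split(None, 1)
    split_ok = True
except ValueError:
    split_ok = False

print(repr(key), repr(sep), repr(value), split_ok)
```

Checking whether `sep` is empty is one way to detect a malformed line without relying on an exception.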

--
Gabriel Genellina

Nov 21 '08 #7
Joe Strout wrote:
> [...]
> Partition handles these cases correctly (at least, within the OP's
> specification that the value of "b" should be whatever comes after the
> first space).

I believe if you read the OP's post again you will see that he specified
two non-space items per line.

You really *love* being right, don't you? ;-) You say partition "...
better encodes the intention, which is to split only on the first space
in the line, rather than on every space". Your mind-reading abilities
are clearly superior to mine.

Anyway, sorry to have told you something you already knew. It's true
that partition has its place, and is too often overlooked. Particularly
by me.

regards
Steve

Nov 21 '08 #8
