
text file parsing

P: 3
Hi,

I have a text file of the form

atom_trace('emotion_response_level(a1, 1.56072)', emotion_response_level(a1, 1.56072), [range(49, 50, true), range(0, 49, unknown)]).

and

atom_trace('goto(a1, a3)', goto(a1, a3), [range(1, 51, true), range(0, 1, unknown)]).

I have to parse this file. Basically, I want to extract the emotion response level, the range, and the true value. Please tell me how I can parse it.

Awaiting your reply.
Thanks.

Ghazanfar

Mar 14 '07 #1
10 Replies


bvdet
Expert Mod 2.5K+
P: 2,851
We can help you with this, but we need some more information. Can you show us exactly how the data should appear after extraction?
e.g.:
'emotion_response_level_a1 = 1.56072; range = (49, 50); True'
Should the data be in a dictionary, list or tuple? What about the other ranges?
Mar 14 '07 #2

dshimer
Expert 100+
P: 136
Guys like GhostDog74 and bvdet really give me a clue as to how elementary my understanding of modules really is, so I'd like to take this opportunity to learn something based on how I first approached it.
In this case, because the input seems so structured and consistent, my first inclination would be to grab all the data into a list by replacing every non-data character with a space and splitting the result, then act on the appropriate index values, converting them to numbers or whatever as I go. For example, something like:
>>> txt="atom_trace('emotion_response_level(a1, 1.56072)', emotion_response_level(a1, 1.56072), [range(49, 50, true), range(0, 49, unknown)])."
>>> print(txt.replace('[',' ').replace(']',' ').replace('(',' ').replace(')',' ').replace(',',' ').replace("'",' ').split())
['atom_trace', 'emotion_response_level', 'a1', '1.56072', 'emotion_response_level', 'a1', '1.56072', 'range', '49', '50', 'true', 'range', '0', '49', 'unknown', '.']
I know there are ways to create multiple replace functions using re, but if that were my only goal (replace stuff with spaces and split, or just split on the non data characters) is there an easier way to do it?
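On the question of an easier way: one option, offered as a sketch rather than the only answer, is to let re.findall pick out the data tokens directly instead of chaining replace calls. The helper name tokenize is made up for illustration, and the snippet is written for Python 3 (the thread itself uses Python 2).

```python
import re

txt = ("atom_trace('emotion_response_level(a1, 1.56072)', "
       "emotion_response_level(a1, 1.56072), "
       "[range(49, 50, true), range(0, 49, unknown)]).")


def tokenize(line):
    """Return every run of letters, digits, underscores and dots in line.

    Brackets, parentheses, commas, quotes and whitespace act as separators,
    so no explicit replace/split chain is needed.
    """
    return re.findall(r"[\w.]+", line)


tokens = tokenize(txt)
print(tokens)

# The numeric fields can then be converted by index, as in the post above.
level = float(tokens[3])
low, high = int(tokens[8]), int(tokens[9])
```

Like the replace-and-split version, this keeps the final '.' of the line as a token of its own, so index-based access works the same way.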
Mar 14 '07 #3

P: 3
I have a text file like this.

LeadsTo trace
atom_trace('emotion_response_level(a1, 1.56072)', emotion_response_level(a1, 1.76072), [range(49, 50, true), range(0, 49, unknown)]).

I can describe the above line of text this way:
a1 (the agent) has emotion_response_level = 1.56072, which corresponds to range(49, 50, true). Similarly,

a1 (the agent) has emotion_response_level = 1.76072, which corresponds to range(0, 49, unknown). Later on, my program has to use the fact that agent a1 has emotion_response_level = 1.56072, because that is the value that is true in range(49, 50, true).
I.e., I have to select the portion related to the true range values,
and then I also have to use the range values in the program.

atom_trace('goto(a1, a3)', goto(a1, a3), [range(1, 51, true), range(0, 1, unknown)]).

atom_trace('agents_have_conversation(a1, a3, ajax)', agents_have_conversation(a1, a3, ajax), [range(3, 11, true), range(0, 3, unknown)]).

ghazanfar
Mar 14 '07 #4

bartonc
Expert 5K+
P: 6,596
Please read the "POSTING GUIDELINES" for this site. Double posting and email addresses in posts are two things that are not allowed.
Mar 14 '07 #5

Expert 100+
P: 511
Can you give an exact sample of the input file, and show what you want your output to look like? I can't really understand what you are describing.
Mar 15 '07 #6

P: 3
The general format is like

atom_trace(seed, seed, [range(860.0, 1000, false), range(840.0, 860.0, true), range(580.0, 840.0, false), range(560.0, 580.0, true), range(300.0, 560.0, false), range(280, 300.0, true), range(20, 280, false), range(0, 20, true)]).

But I need the result in the form I sent you.

Here is the sample.

atom_trace('emotion_response_level(a1, 1.56072)', emotion_response_level(a1, 1.56072), [range(49, 50, true), range(0, 49, unknown)]).

atom_trace('emotion_response_level(a2, 1.81894)', emotion_response_level(a2, 1.81894), [range(49, 50, true), range(0, 49, unknown)]).

atom_trace('emotion_response_level(a3, 1.51193)', emotion_response_level(a3, 1.51193), [range(49, 50, true), range(0, 49, unknown)]).

atom_trace('emotion_response_level(a1, 1.85)', emotion_response_level(a1, 1.85), [range(1, 50, unknown), range(0, 1, true)]).

atom_trace('emotion_response_level(a2, 1.2)', emotion_response_level(a2, 1.2), [range(1, 50, unknown), range(0, 1, true)]).
atom_trace('emotion_response_level(a3, 0.7)', emotion_response_level(a3, 0.7), [range(1, 50, unknown), range(0, 1, true)]).
atom_trace('emotion_response_level(a1, 1.84775)', emotion_response_level(a1, 1.84775), [range(2, 50, unknown), range(1, 2, true), range(0, 1, unknown)]).
atom_trace('emotion_response_level(a2, 1.39275)', emotion_response_level(a2, 1.39275), [range(2, 50, unknown), range(1, 2, true), range(0, 1, unknown)]).
atom_trace('emotion_response_level(a3, 1.04275)', emotion_response_level(a3, 1.04275), [range(2, 50, unknown), range(1, 2, true), range(0, 1, unknown)]).
atom_trace('has_emotional_value(a1, aspect, 1.8)', has_emotional_value(a1, aspect, 1.8), [range(3, 50, unknown), range(0, 3, true)]).
atom_trace('has_emotional_value(a3, aspect, 1.8)', has_emotional_value(a3, aspect, 1.8), [range(3, 50, unknown), range(0, 3, true)]).


Ghazanfar
Mar 15 '07 #7

bvdet
Expert Mod 2.5K+
P: 2,851
I would suggest creating a dictionary of dictionaries in this format:
{record1: {seed: emotion....., agent: ax, value: x.xxxx, rangeTrue: [low, high], rangeFalse: [low,high]}
 record2: {....}
 record3: {....}
}
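To make the proposed layout concrete, here is a small hand-filled sketch for the first sample line in this thread. The key names follow the outline above; treating the unknown range as the false one is an assumption, and record1 is just a placeholder key.

```python
# One record of the dictionary-of-dictionaries layout, filled in by hand from
# atom_trace('emotion_response_level(a1, 1.56072)', ..., [range(49, 50, true), range(0, 49, unknown)]).
records = {
    "record1": {
        "seed": "emotion_response_level",
        "agent": "a1",
        "value": 1.56072,
        "rangeTrue": [49, 50],
        "rangeFalse": [0, 49],  # assumption: 'unknown' is stored as the false range
    },
}

# Looking up the true range for a record:
low, high = records["record1"]["rangeTrue"]
print(records["record1"]["agent"], records["record1"]["value"], low, high)
```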
Mar 15 '07 #8

bvdet
Expert Mod 2.5K+
P: 2,851
Since there may be multiple ranges, maybe this format would be better:
rec0 = {'ranges': [(49, 50, True), (0, 49, False)], 'seed': 'emotion_response_level', 'value': 1.5607200000000001, 'agent': 'a1'}
Sample dictionary listing:
rec4: ranges = [(1, 50, False), (0, 1, True)]
rec4: seed = emotion_response_level
rec4: value = 1.2
rec4: agent = a2
rec5: ranges = [(1, 50, False), (0, 1, True)]
rec5: seed = emotion_response_level
rec5: value = 0.7
rec5: agent = a3
rec6: ranges = [(2, 50, False), (1, 2, True), (0, 1, False)]
rec6: seed = emotion_response_level
rec6: value = 1.84775
rec6: agent = a1
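If only the ranges flagged true matter, as the original question suggests, they can be filtered out of a record in this format. A minimal sketch: true_ranges is a hypothetical helper name, and rec6 is copied from the listing above.

```python
rec6 = {
    "ranges": [(2, 50, False), (1, 2, True), (0, 1, False)],
    "seed": "emotion_response_level",
    "value": 1.84775,
    "agent": "a1",
}


def true_ranges(record):
    """Return the (low, high) pairs of the ranges whose flag is True."""
    return [(low, high) for low, high, flag in record["ranges"] if flag]


for low, high in true_ranges(rec6):
    print("%s has %s = %s in range (%s, %s)"
          % (rec6["agent"], rec6["seed"], rec6["value"], low, high))
```

Keeping every range in one list and filtering on demand avoids needing separate keys when a record has more than one true or false range.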
Mar 15 '07 #9

bvdet
Expert Mod 2.5K+
P: 2,851
Ghazanfar,

Would the data compiled in this format work for you? Are you going to write the code yourself, or do you need help? We are here if you need help.
Mar 16 '07 #10

bvdet
Expert Mod 2.5K+
P: 2,851
I guess ghazanfar solved his problem since he has not been back. For those of you interested, here is the code I came up with:
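fname = r'your_file'

def atomtraceParse(fn):
    # Keep only the quoted atom_trace records; blank lines and headers such as
    # "LeadsTo trace" are skipped. Everything up to the quoted atom is dropped.
    with open(fn) as f:
        fileList = [x[x.find(")', ")+4:].replace(', aspect', '').strip()
                    for x in f if ")', " in x]
    dd = {}
    cnt = 0
    for item in fileList:
        # e.g. ['emotion_response_level(a1, 1.56072', '[range(49, 50, true', 'range(0, 49, unknown)]).']
        itemList = item.split('), ')
        a = itemList[0].split('(')
        a1 = a[1].split(', ')
        # '.' is in the strip set so the trailing ")])." is removed before eval.
        # eval is tolerable only because the input file is trusted.
        rangeList = [eval(x.strip('[]().').split('(')[1].replace('unknown', 'False').replace('true', 'True')) for x in itemList[1:]]
        # float() raises ValueError for atoms without a numeric value, e.g. goto(a1, a3).
        dd['rec'+str(cnt)] = dict(seed=a[0], agent=a1[0], value=float(a1[1]), ranges=rangeList)
        cnt += 1
    return dd

dataDict = atomtraceParse(fname)

# Keys sort as strings, so 'rec10' comes before 'rec2'.
for key in sorted(dataDict):
    for item in dataDict[key]:
        print('%s: %s = %s' % (key, item, dataDict[key][item]))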
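For anyone who would rather avoid eval and the chained string splitting, the same kind of record can be built with two regular expressions. This is an alternative sketch, not the code posted above: parse_line, LINE_RE and RANGE_RE are names invented here, it targets Python 3, range bounds are returned as floats, and unknown is mapped to False as in the earlier posts. Lines it does not recognise (blank lines, "LeadsTo trace", unquoted atoms such as seed) return None.

```python
import re

# atom_trace('<quoted atom>', <name>(<args>), [<ranges>]).
LINE_RE = re.compile(
    r"^atom_trace\('[^']*',\s*(\w+)\(([^)]*)\),\s*\[(.*)\]\)\.\s*$")
# range(<low>, <high>, <flag>)
RANGE_RE = re.compile(r"range\(\s*([\d.]+)\s*,\s*([\d.]+)\s*,\s*(\w+)\s*\)")


def parse_line(line):
    """Parse one atom_trace line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    seed, args, ranges = m.groups()
    parts = [p.strip() for p in args.split(",")]
    try:
        value = float(parts[-1])
    except ValueError:
        value = None  # atoms such as goto(a1, a3) carry no numeric value
    return {
        "seed": seed,
        "agent": parts[0],
        "value": value,
        # Only the literal flag 'true' becomes True; 'unknown' and 'false' become False.
        "ranges": [(float(lo), float(hi), flag == "true")
                   for lo, hi, flag in RANGE_RE.findall(ranges)],
    }


sample = ("atom_trace('emotion_response_level(a1, 1.56072)', "
          "emotion_response_level(a1, 1.56072), "
          "[range(49, 50, true), range(0, 49, unknown)]).")
print(parse_line(sample))
```

To process a whole file, call parse_line on each line and keep the results that are not None.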
Mar 18 '07 #11
