File Parsing


Hi,

Below is the file format, which has keywords in the file. I would like to store the data in different variables (Parameters, Points, Lines, Circle).


Sample.txt

$$$$Header$$$$$$$$$$$$
$$$$Parameter$$$$$$$$$

/Parameter_Value/ 1.0

/Point/
10.0 10.0 10.0 $ Comment: Point Data
20.0 20.0 20.0

$$$$$$Line$$$$$$$$

/Line/ $Line Data

10.0 15.0 0.0
20.0 10.0 0.0

$$$$$$Circle$$$$$$$$
/Circle/

10.0 $Radius

0.0 0.0 0.0  $Center

Can anybody help me with the best way (optimized in terms of lines of code) of writing the code?

Thanks
PSB

Aug 10 '07 #1


psbasha

440

256MB

The above is only sample data. I have to read different unique geometry elements from that file format, each having a different and unique keyword.

-PSB

Aug 10 '07 #2

bvdet

2,851

Expert Mod 2GB

It's probably not the best way, but it seems to work. All dictionary values are lists:


import re

key_patt = re.compile(r'/([A-Za-z_]+)/')
# Escape the dot so that only a literal decimal point matches.
data_patt = re.compile(r'\d+\.\d+')

fn = 'data.txt'
key = None
dd = {}

# Skip blank lines and full-line comments (lines starting with '$').
f = open(fn)
lineList = [line.strip() for line in f
            if line != '\n' and not line.startswith('$')]
f.close()

for line in lineList:
    # Drop a trailing '$' comment, if any.
    if '$' in line:
        line = line[:line.index('$')]
    m = key_patt.search(line)
    if m:
        key = m.group(1)
        m1 = data_patt.search(line)
        if m1:
            # Keyword with a value on the same line.
            dd[key] = [float(m1.group(0))]
        else:
            dd[key] = []
    else:
        # Data line belonging to the most recent keyword.
        if data_patt.search(line):
            dd[key].append([float(n) for n in data_patt.findall(line)])

for key in dd:
    print('%s = %s' % (key, dd[key]))

Did you ever resolve the point translation issue (this thread)? You never responded after I posted what I thought was a solution for you. A little feedback would be appreciated. Here's the output:

>>> Line = [[10.0, 15.0, 0.0], [20.0, 10.0, 0.0]]
Parameter_Value = [1.0]
Circle = [[10.0], [0.0, 0.0, 0.0]]
Point = [[10.0, 10.0, 10.0], [20.0, 20.0, 20.0]]
>>>
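
One detail about the pattern for data values: an unescaped `.` in a regular expression matches any character, so a pattern of the form `\d+.\d+` also accepts two integers separated by a space, while `\d+\.\d+` accepts only a literal decimal point. A small sketch of the difference (the variable names are only for illustration):

```python
import re

loose = re.compile(r'\d+.\d+')    # '.' matches any character
strict = re.compile(r'\d+\.\d+')  # '\.' matches only a literal dot

decimals = '10.0 15.0 0.0'
integers = '1 27 1 37'

# Both patterns agree on well-formed decimal data.
loose_on_decimals = loose.findall(decimals)
strict_on_decimals = strict.findall(decimals)

# On integer data the loose pattern glues neighbours together,
# and float('1 27') would then raise ValueError.
loose_on_integers = loose.findall(integers)
strict_on_integers = strict.findall(integers)

print(loose_on_decimals, strict_on_decimals)
print(loose_on_integers, strict_on_integers)
```

With the sample file in this thread both patterns give the same result, because every value has a decimal point; the difference only shows up once integer fields appear.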

Aug 11 '07 #3

psbasha

440

256MB

Thanks, BV, for the solution.
For the point translation problem, I took part of your code snippet and solved it with your approach. If you have a better approach than the previous one, please post it so that I can use it.

-PSB

Aug 11 '07 #4

psbasha

440

256MB

Hi,

I have the file format below. How can I read it in a concise way?
The file looks like this:


Sample Data

4 Types
_up,1
_low,2
_left,5
_right,6

2Flags
_low,no
_up,yes

1 Data
x, 10

4 Values
1,0,0
1,1,0
1,1,1
1,1,0

2 Planes Type-1
1,0,0
0,0,0
0,1,0
0,0,0
0,1,0
0,0,1

In the case of planes, 2 planes are defined, and the data for the 2 planes must be kept separate.

Thanks
PSB
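
This question did not get a direct answer in the thread. A minimal Python 3 sketch of one way to read it, assuming each section starts with a header line of the form "count name" (with or without a space, as in `2Flags`), that records are comma separated, and that when a section has more rows than its count the rows divide evenly among the items (3 rows per plane in the sample). The function names `convert` and `parse_counted_sections` are hypothetical.

```python
import re

# Header line: a count followed by a name, e.g. "4 Types" or "2Flags".
HEADER = re.compile(r'^(\d+)\s*([A-Za-z][\w -]*)$')

def convert(token):
    """Return token as int or float when possible, else as a stripped string."""
    token = token.strip()
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            pass
    return token

def parse_counted_sections(text):
    sections = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = HEADER.match(line)
        if m:
            name = m.group(2).strip()
            sections[name] = {'count': int(m.group(1)), 'rows': []}
        elif name is not None:
            # Lines before the first header (the title) are ignored.
            sections[name]['rows'].append([convert(t) for t in line.split(',')])
    result = {}
    for name, sec in sections.items():
        rows, count = sec['rows'], sec['count']
        if count and len(rows) > count and len(rows) % count == 0:
            # Assumption: rows are shared evenly, e.g. 6 rows / 2 planes.
            size = len(rows) // count
            result[name] = [rows[i:i + size] for i in range(0, len(rows), size)]
        else:
            result[name] = rows
    return result

sample = """Sample Data
4 Types
_up,1
_low,2
_left,5
_right,6
2Flags
_low,no
_up,yes
1 Data
x, 10
4 Values
1,0,0
1,1,0
1,1,1
1,1,0
2 Planes Type-1
1,0,0
0,0,0
0,1,0
0,0,0
0,1,0
0,0,1
"""
parsed = parse_counted_sections(sample)
for name, value in parsed.items():
    print('%s = %s' % (name, value))
```

The even-split rule is a guess about the format; if planes can have different numbers of rows, the file would need an explicit separator between them.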

Aug 24 '07 #5

psbasha

440

256MB

Hi BV,

I have tried the above code on some more field formats, shown below, but the code does not support them. Can you please suggest how to group digits and alphanumeric values for the scenarios below?


Sample.txt

$$$$Header$$$$$$$$$$$$
$$$$Parameter$$$$$$$$$

/Parameter_Value/ 1.0

/Point/
10.0 10.0 10.0 $ Comment: Point Data
20.0 20.0 20.0

$$$$$$Line$$$$$$$$

/Line/ $Line Data

10.0 15.0 0.0
20.0 10.0 0.0

$$$$$$Circle$$$$$$$$
/Circle/

10.0 $Radius

0.0 0.0 0.0  $Center

/DashedLineType/            21 $Dashed Line

/XMin_XMax_YMin_YMax/        1 27 1 37 $ Min and Max value

/LineFlag/        yes $ Flag to update

/XY-Plane/ 'Planes'
1,0,0
0,1,0
0,0,0

/XY-Plane/ 'Planes'
2,0,0
0,2,0
0,0,0

/Format/
$Values
    3     3     1    50    25    28   'Yes'  1

Thanks
PSB

Sep 14 '07 #6

bvdet

2,851

Expert Mod 2GB

When I write data to a file, I always set up a structured format that is easy to parse. You should try it. This code seems to work:


import re

# thanks ilikepython!
def indexList(s, item, start=0):
    # Indices of every occurrence of item in s, from index start onward.
    return [i + start for (i, obj) in enumerate(s[start:]) if obj == item]

def convertType(s):
    # Try int, then float, then eval; fall back to the original string.
    for func in (int, float, eval):
        try:
            return func(s)
        except Exception:
            pass
    return s

key_patt = re.compile(r'/([A-Za-z_-]+)/')
data_patt = re.compile(r'\d+\.\d+|\d+|\w+')

fn = 'parameter.txt'
key = None
dd = {}

# Skip blank lines and full-line comments (lines starting with '$').
lineList = [line.strip() for line in open(fn).readlines()
            if line != '\n' and not line.startswith('$')]

for line in lineList:
    # Drop a trailing '$' comment, if any.
    if '$' in line:
        line = line[:line.index('$')]
    m = key_patt.search(line)
    if m:
        key = m.group(1)
        # Text after the second '/', i.e. after the keyword.
        line1 = line[indexList(line, '/')[1]+1:]
        if data_patt.search(line1):
            if key in dd:
                dd[key] = dd[key] + [convertType(item) for item in data_patt.findall(line1)]
            else:
                dd[key] = [convertType(item) for item in data_patt.findall(line1)]
        else:
            dd[key] = []
    else:
        # Data line belonging to the most recent keyword.
        if data_patt.search(line):
            dd[key].append([convertType(n) for n in data_patt.findall(line)])

for key in dd:
    print('%s = %s' % (key, dd[key]))

>>> DashedLineType = [21]
Parameter_Value = [1.0]
Point = [[10.0, 10.0, 10.0], [20.0, 20.0, 20.0]]
XY-Plane = ['Planes', [1, 0, 0], [0, 1, 0], [0, 0, 0], 'Planes', [2, 0, 0], [0, 2, 0], [0, 0, 0]]
Format = [[3, 3, 1, 50, 25, 28, 'Yes', 1]]
XMin_XMax_YMin_YMax = [1, 27, 1, 37]
LineFlag = ['yes']
Line = [[10.0, 15.0, 0.0], [20.0, 10.0, 0.0]]
Circle = [[10.0], [0.0, 0.0, 0.0]]
>>>
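
A note on `convertType`: falling back to `eval` means any token that happens to be a valid Python expression is evaluated, so a token such as `max` would come back as the built-in function rather than as text. A hedged Python 3 sketch of a narrower converter, using `ast.literal_eval` from the standard library, which only accepts literals (the name `convert_type` is hypothetical):

```python
import ast

def convert_type(token):
    """Convert a token to int, float, or a Python literal; else keep the string."""
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            pass
    try:
        # Handles quoted strings such as "'Yes'" without running code.
        return ast.literal_eval(token)
    except (ValueError, SyntaxError):
        return token

tokens = ['21', '1.0', "'Yes'", 'yes', 'max']
converted = [convert_type(t) for t in tokens]
print(converted)
```

Unquoted words stay as plain strings, which is what the output above already shows for `LineFlag`.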

Sep 14 '07 #7

psbasha

440

256MB


SampleTest

$$$$Header$$$$$$$$$$$$
$$$$Parameter$$$$$$$$$

/Parameter_range/ 1 1

/Flag1/ 1
/Flag2/ 1
/DummyFlag1/ 1

/STOP/ Line and Circle

$$$$

/LineThick/ 0.1 $$$Line Thickness

$$$$

/Top1/ 10 $$Value1
/Top2/ 11 $$Value2

 $$$
/Bot1/  20 $$Comment
/Bot2/ 30 $$Comment
/Bot4/ 40 $$Comment

$$
/TOl1/ -0.05
/TOl2/ 0.01

$$$$$$Line IDs$$$$$$$$

/NOT/  10 11 12 1
/NOT/  10 11 12 2
/Ok/   11 12 1  3

/MAT/ $$
1 $Begin
100.    40.    30.    2.0    0 ****22 ksdas
2
200.    40.    60.    2.0    0 ****22 ksdas
3
600.    40.    30.    5.0    0 ****22 ksdas
4
500.    40.    70.    2.0    0 ****22 ksdas
0 $End
2 ***Values $Begin
1000.  .1
2000.  .2
3000.  .3
4000.  .6
   0.  .0 $End

3 ***Values $Begin
3000.  .1
5000.  .2
6000.  .3
7000.  .6
   0.  .0 $End
0 $End

2 ***Values $Begin
1000.  .1
2000.  .2
3000.  .3
4000.  .6
   0.  .0 $End

3 ***Values $Begin
13000.  .1
45000.  .2
56000.  .3
87000.  .6
    0.  .0 $End
0 $End
2 $Begin
    2.0 .00
    2.0 .210
    3.0 .235
    0.  .0 $End
3 $Begin
    2.0 .00
    2.0 .210
    3.0 .235
    0.  .0 $End
0 $End
/4*ALL/ $ ***
 11       1       1       1     69716.   1000
 11       1       1       5     76296.   1000
 31       1       1       6     74926.   1000
 31       1       1       7     74653.   1000

I am using the above sample code for reading and storing the data, but I am getting the error shown below. How can I customize the above code to read the above sample file?

PythonWin 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on win32.
File "C:\Sample-Mat.py", line 42, in ?
dd[key].append([convertType(n) for n in data_patt.findall(line)])

Thanks
PSB

Jan 5 '08 #8

bvdet

2,851

Expert Mod 2GB

You will have to explain how you need the data tabulated. I have no idea what most of the data is.

Jan 5 '08 #9

psbasha

440

256MB

 Description

Hi BV,
 
We have the keywords in '/ /'. The respective data is available beside the keyword or, in some cases, below it.
 
The data should be stored as shown below, using dicts and lists built with regular expressions.
 
parameter_range = [1,1]
 
Flag1 = 1

.....
 
STOP = 'Line and Circle'
 
Top1 = 10

Top2 = 11

...
 
Bot1 = 20

Bot2 = 30

....
 
Tol1 = -0.05

Tol2 = 0.01
 
NOT = [[ 10,11,12,1],[10,11,12,2]]

OK = [[ 11,12,1,3]]

MAT = { 1:[100.,40.,30.,20.,0],2:[200.,40.,60.,2.0,0],3:[600,40.,30.,5.0,0],4:[500.,40.,70.,2.0,0]}
 
# 2-integer number  is the start for the block and '0. .0' is the end

MATc = {2:[[1000., 0.1],[2000. ,.2],[3000,0.3],[4000.,0.6]],3:[ [3000.0,0.1],[5000.,0.2],[6000.,.4],[7000.,.6]}

# 0. .0 is the end of the sub block

# o is the end of the block
 
#Similarly for the other block

# 2-integer number  is the start for the block and '0. .0' is the end

MATT = {2:[[1000., 0.1],[2000. ,.5],[3000,0.3],[4000.,0.6]],3:[ [3000.0,0.9],[5000.,0.2],[6000.,.4],[7000.,.6]}

# 0. .0 is the end of the sub block

# o is the end of the block
 
#Similarly for the other block

# 2-integer number  is the start for the block and '0. .0' is the end
 
Factor = {2:[[0.00,2.0],[.210,2.0],[0.235,3.0]],3:[[0.00,2.0],[.2110,2.0],[0.2135,3.0]]}

# 0. .0 is the end of the sub block

# o is the end of the block
 
ALL = [[ 11,1,1,1,69716.,1000],[ 11,1,1,5,76296.,1000],[ 31,1,1,6,74926.,1000],[ 31, 1,1,7,74653.,1000]]

Jan 5 '08 #10

bvdet

2,851

Expert Mod 2GB

Where do MATT, MATc, and Factor come from?

Jan 6 '08 #11

bvdet

2,851

Expert Mod 2GB

BTW, your script fails on your data because all comment lines must begin with '$'. It fails on the first line of data.

Jan 6 '08 #12

psbasha

440

256MB

Sorry, I was explaining how the data could be stored in dictionary variables. It is only an example of how to store the data.

MATT, MATc and Factor are variables.

Thanks
PSB

Jan 6 '08 #13

psbasha

440

256MB

Sorry, all the comments start with a '$' sign.

Jan 6 '08 #14

bvdet

2,851

Expert Mod 2GB

I made a few minor changes to the code in the earlier solution. Following is the entire source code and output from your data file (with the first line commented out):


import re

def indexList(s, item, i=0):
    # Indices of every occurrence of item in s, from index i onward.
    i_list = []
    while True:
        try:
            i = s.index(item, i)
            i_list.append(i)
            i += 1
        except ValueError:
            break
    return i_list

def convertType(s):
    # Try int, then float, then eval; fall back to the original string.
    for func in (int, float, eval):
        try:
            return func(s)
        except Exception:
            pass
    return s

key_patt = re.compile(r'/([A-Za-z_\-0-9]+)/')
data_patt = re.compile(r'\d+\.\d+|\d+|\w+')

# function to strip comments
def strip_comments(s):
    if '$' in s:
        return s[:s.index('$')]
    elif '*' in s:
        return s[:s.index('*')]
    return s

def parse_data(fn):
    key = None
    dd = {}
    # Skip blank lines and full-line comments (lines starting with '$').
    lineList = [strip_comments(line.strip()) for line in open(fn).readlines()
                if line != '\n' and not line.startswith('$')]
    for line in lineList:
        m = key_patt.search(line)
        if m:
            key = m.group(1)
            # Text after the second '/', i.e. after the keyword.
            line1 = line[indexList(line, '/')[1]+1:]
            if data_patt.search(line1):
                if key in dd:
                    dd[key] = dd[key] + [convertType(item) for item in
                                         data_patt.findall(line1)]
                else:
                    dd[key] = [convertType(item) for item in
                               data_patt.findall(line1)]
            else:
                dd[key] = []
        else:
            # Data line belonging to the most recent keyword.
            if data_patt.search(line):
                dd[key].append([convertType(n) for n in
                                data_patt.findall(line)])
    return dd

if __name__ == '__main__':
    #fn = r'H:\TEMP\temsys\parameter.txt'
    fn = r'H:\TEMP\temsys\sample_data1.txt'
    dataDict = parse_data(fn)
    for key in dataDict:
        print('%s = %s' % (key, dataDict[key]))
 
>>> Ok = [11, 12, 1, 3]

MAT = [[1], [100, 40, 30, 2.0, 0], [2], [200, 40, 60, 2.0, 0], [3], [600, 40, 30, 5.0, 0], [4], [500, 40, 70, 2.0, 0], [0], [2, 'Values'], [1000, 1], [2000, 2], [3000, 3], [4000, 6], [0, 0], [3, 'Values'], [3000, 1], [5000, 2], [6000, 3], [7000, 6], [0, 0], [0], [2, 'Values'], [1000, 1], [2000, 2], [3000, 3], [4000, 6], [0, 0], [3, 'Values'], [13000, 1], [45000, 2], [56000, 3], [87000, 6], [0, 0], [0], [2], [2.0, 0], [2.0, 210], [3.0, 235], [0, 0], [3], [2.0, 0], [2.0, 210], [3.0, 235], [0, 0], [0], [4, 'ALL'], [11, 1, 1, 1, 69716, 1000], [11, 1, 1, 5, 76296, 1000], [31, 1, 1, 6, 74926, 1000], [31, 1, 1, 7, 74653, 1000]]

LineThick = [0.10000000000000001]

TOl2 = [0.01]

TOl1 = [0.050000000000000003]

STOP = ['Line', 'and', 'Circle']

Top2 = [11]

Top1 = [10]

Bot4 = [40]

Bot1 = [20]

NOT = [10, 11, 12, 1, 10, 11, 12, 2]

Flag2 = [1]

Flag1 = [1]

Parameter_range = [1, 1]

Bot2 = [30]

DummyFlag1 = [1]

>>>

I understand that this is not your final solution. Maybe you can come up with a way to parse the 'MAT' data.
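
Two things in the output above trace back to `data_patt`: `TOl1` comes out as 0.05 rather than -0.05, and values written as `1000.` or `.1` come out as 1000 and 1. The pattern has no alternative for a leading sign, and its float alternative needs digits on both sides of the dot. A sketch of a wider numeric pattern in Python 3, assuming only numbers need to be extracted from the line (the names `NUM_PATT` and `numbers` are hypothetical):

```python
import re

# Optional sign, then one of: "12." / "12.5", ".5", or "12".
NUM_PATT = re.compile(r'[-+]?(?:\d+\.\d*|\.\d+|\d+)')

def numbers(line):
    """Return the numbers on a line: floats where a dot is present, else ints."""
    values = []
    for token in NUM_PATT.findall(line):
        values.append(float(token) if '.' in token else int(token))
    return values

tol = numbers('-0.05')
pair = numbers('1000.  .1')
factor = numbers('    2.0 .210')
row = numbers(' 11       1       1       1     69716.   1000')
print(tol, pair, factor, row)
```

Words such as `yes` or `Planes` would still need the `\w+` alternative, placed after the numeric ones so that it does not swallow digits first.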

Jan 6 '08 #15

psbasha

440

256MB

SampleInputData

$$$$Header$$$$$$$$$$$$
$$$$Parameter$$$$$$$$$

/Parameter_range/ 1 1

/Flag1/ 1
/Flag2/ 1
/DummyFlag1/ 1

/STOP/ Line and Circle

$$$$

/LineThick/ 0.1 $$$Line Thickness

$$$$

/Top1/ 10 $$Value1
/Top2/ 11 $$Value2

 $$$
/Bot1/  20 $$Comment
/Bot2/ 30 $$Comment
/Bot4/ 40 $$Comment

$$
/TOl1/ -0.05
/TOl2/ 0.01

$$$$$$Line IDs$$$$$$$$

/NOT/  10 11 12 1
/NOT/  10 11 12 2
/Ok/   11 12 1  3

/MAT/ $$
1 $Begin
100.    40. 30.  2.0   0 ****22 ksdas
2
200.    40. 60.  2.0   0 ****22 ksdas
3
600.    40. 30.  5.0   0 ****22 ksdas
4
500.    40. 70.  2.0   0 ****22 ksdas
0 $End
2 ***Values $Begin
1000.  .1
2000.  .2
3000.  .3
4000.  .6
   0.  .0 $End

3 ***Values $Begin
3000.  .1
5000.  .2
6000.  .3
7000.  .6
   0.  .0 $End
0 $End

2 ***Values $Begin
1000.  .1
2000.  .2
3000.  .3
4000.  .6
   0.  .0 $End

3 ***Values $Begin
13000.  .1
45000.  .2
56000.  .3
87000.  .6
    0.  .0 $End
0 $End
2 $Begin
    2.0 .00
    2.0 .210
    3.0 .235
    0.  .0 $End
3 $Begin
    2.0 .00
    2.0 .210
    3.0 .235
    0.  .0 $End
0 $End
/4*ALL/ $ ***
 11       1       1       1     69716.   1000
 11       1       1       5     76296.   1000
 31       1       1       6     74926.   1000
 31       1       1       7     74653.   1000

Jan 6 '08 #16

psbasha

440

256MB

 Description

>>> 

------------------------------------------------------------------

Ok = [11, 12, 1, 3]  # This should be [[11, 12, 1, 3] ].In this we have one or more 
 
MAT = [[1], [100, 40, 30, 2.0, 0], [2], [200, 40, 60, 2.0, 0], [3], [600, 40, 30, 5.0, 0], [4], [500, 40, 70, 2.0, 0], [0], [2, 'Values'], [1000, 1], [2000, 2], [3000, 3], [4000, 6], [0, 0], [3, 'Values'], [3000, 1], [5000, 2], [6000, 3], [7000, 6], [0, 0], [0], [2, 'Values'], [1000, 1], [2000, 2], [3000, 3], [4000, 6], [0, 0], [3, 'Values'], [13000, 1], [45000, 2], [56000, 3], [87000, 6], [0, 0], [0], [2], [2.0, 0], [2.0, 210], [3.0, 235], [0, 0], [3], [2.0, 0], [2.0, 210], [3.0, 235], [0, 0], [0], [4, 'ALL'], [11, 1, 1, 1, 69716, 1000], [11, 1, 1, 5, 76296, 1000], [31, 1, 1, 6, 74926, 1000], [31, 1, 1, 7, 74653, 1000]]
 
#The mat should have only these values

MAT = { 1:[100.,40.,30.,20.,0],2:[200.,40.,60.,2.0,0],3:[600,40.,30.,5.0,0],4:[500.,40.,70.,2.0,0]}

-------------------------------------------------------------------------------

#Other  block data should be

{2:[[1000., 0.1],[2000. ,.2],[3000,0.3],[4000.,0.6]],3:[ [3000.0,0.1],[5000.,0.2],[6000.,.4],[7000.,.6]}
 
 {2:[[1000., 0.1],[2000. ,.5],[3000,0.3],[4000.,0.6]],3:[ [3000.0,0.9],[5000.,0.2],[6000.,.4],[7000.,.6]}
 
{2:[[0.00,2.0],[.210,2.0],[0.235,3.0]],3:[[0.00,2.0],[.2110,2.0],[0.2135,3.0]]}
 
ALL = [[ 11,1,1,1,69716.,1000],[ 11,1,1,5,76296.,1000],[ 31,1,1,6,74926.,1000],[ 31, 1,1,7,74653.,1000]]
 
-------------------------------------------------------------------------------

LineThick = [0.10000000000000001]

TOl2 = [0.01]

TOl1 = [0.050000000000000003]

----------------------------------------------------------

STOP = ['Line', 'and', 'Circle']

#It should be  STOP = 'Line and Circle'

----------------------------------------------------------

Top2 = [11]

Top1 = [10]

Bot4 = [40]

Bot1 = [20]

----------------------------------------------------------

NOT = [10, 11, 12, 1, 10, 11, 12, 2]
 
#it should be stored as 

NOT = [[ 10,11,12,1],[10,11,12,2]]
 
----------------------------------------------------------

Flag2 = [1]

Flag1 = [1]

Parameter_range = [1, 1]

Bot2 = [30]

DummyFlag1 = [1]

The above describes how some of the variables should be stored. Where a keyword has a single value, it need not be stored as a list.

Please help me fix the code to get the data as described above.

Jan 6 '08 #17

bvdet

2,851

Expert Mod 2GB

What you are saying is that the parsing must be customized for certain keywords. The code as is will parse all the data in a consistent manner. You must make an effort to solve this problem yourself. Post back your solution and we can try to help you from there.
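
As a starting point, per-keyword customization can be kept apart from the generic parser by post-processing its result through a small table of handlers. This is only a sketch in Python 3: it assumes the parser has been changed to store one row (a list of tokens) per data line for every keyword, which the code above does not do for values on the keyword line, and the handler names are hypothetical.

```python
def scalar_or_rows(rows):
    """Default: unwrap a single value or a single row, else keep all rows."""
    if len(rows) == 1:
        row = rows[0]
        return row[0] if len(row) == 1 else row
    return rows

def keep_rows(rows):
    """Always keep a list of rows, even when there is only one."""
    return rows

def join_words(rows):
    """Rebuild free text such as 'Line and Circle' from its tokens."""
    return ' '.join(str(token) for row in rows for token in row)

# Keywords that need special treatment; everything else uses the default.
HANDLERS = {'NOT': keep_rows, 'Ok': keep_rows, 'STOP': join_words}

def customize(raw):
    return dict((key, HANDLERS.get(key, scalar_or_rows)(rows))
                for key, rows in raw.items())

raw = {
    'Flag1': [[1]],
    'Parameter_range': [[1, 1]],
    'STOP': [['Line', 'and', 'Circle']],
    'NOT': [[10, 11, 12, 1], [10, 11, 12, 2]],
    'Ok': [[11, 12, 1, 3]],
}
custom = customize(raw)
for key in custom:
    print('%s = %s' % (key, custom[key]))
```

Adding a new special case then means adding one entry to `HANDLERS` instead of another flag inside the parsing loop.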

Jan 6 '08 #18

psbasha

440

256MB

 SampleCode

key_patt = re.compile(r'/([A-Za-z_\-0-9]+)/')
data_patt = re.compile(r'\d+\.\d+|\d+|-\.\d+|\w+')

def parse_data(fn):
    key = None
    bMFlag = False
    iCount = 0
    dataList = []
    dd = {}

    matDataDict = {}
    matCDict = {}
    matTDict = {}
    comFactDict = {}
    bMCFlag = False
    bMTFlag = False
    bMatDataBlockFlag = False
    bDataFlag = False
    otherFList = []
    bmatStartFlag = True
    bmatEndFlag = False
    dataListList = []

    lineList = [strip_comments(line.strip()) for line in open(fn).readlines()\
                if line != '\n' and not line.startswith('$')]

    for line in lineList:
        m = key_patt.search(line)
        if m:
            key = m.group(1)
            line1 = line[indexList(line, '/')[1]+1:]
            if key == 'NOT':
                dataList = [convertType(item) for item in \
                                data_patt.findall(line1)]
                dataListList.append(dataList)
            else:
                if data_patt.search(line1):
                    if dd.has_key(key):
                        dd[key] = dd[key]+[convertType(item) for item in \
                                           data_patt.findall(line1)]
                    else:
                        dd[key] = [convertType(item) for item in \
                                   data_patt.findall(line1)]
                else:
                    dd[key] = []
                    bMFlag = True
                    bMatDataBlockFlag = True
        else:
            if 'ALL' in line:
                bDataFlag = True
                bMatDataBlockFlag = False
            elif bDataFlag:
                if bDataFlag and line != '\n':
                    line1 = line.split()
                    otherFList.append(line1)
                elif bDataFlag and '\n':
                    bDataFlag = False
            elif bMatDataBlockFlag:
                if line.startswith('0') and '0.  .0' != line and not line.startswith('0.  .0'):
                    bMFlag = False
                    if bMCFlag:
                        bMCFlag = False
                        bMTFlag = True
                    else:
                        if bMTFlag:
                            bMTFlag = False
                            iCount = 0
                        elif not bMCFlag:
                            bMCFlag = True
                            iCount = 0
                        else:
                            pass
                else:
                    if bMFlag:
                        m1 = data_patt.search(line)
                        if m1:
                            if bmatStartFlag:
                                dataList = []
                                list1 = [convertType(n) for n in \
                                            data_patt.findall(line)]
                                matID = list1[0]
                                bmatStartFlag = False
                            else:
                                dataList = [convertType(n) for n in \
                                            data_patt.findall(line)]
                                matDataDict[matID] = dataList
                                dataList = []
                                bmatStartFlag = True
                    elif bMCFlag:
                        if iCount == 0:
                            dataList = []
                            line1 = line.split()
                            matID = int(line1[0])
                            iCount = iCount + 1
                        #elif '0.  .0' != line and not line.startswith('0.  .0'):
                        elif not line.startswith('0.  .0'):
                            line1 = line.split()
                            dataList.append([float(line1[0]), float(line1[1])])
                        elif '0.  .0' == line or line.startswith('0.  .0'):
                            matCDict[matID] = dataList
                            iCount = 0
                    elif bMTFlag:
                        if iCount == 0:
                            dataList = []
                            line1 = line.split()
                            matID = int(line1[0])
                            iCount = iCount + 1
                        elif not line.startswith('0.  .0'):
                            line1 = line.split()
                            dataList.append([float(line1[0]), float(line1[1])])
                        elif '0.  .0' == line or line.startswith('0.  .0'):
                            matTDict[matID] = dataList
                            iCount = 0
                    elif not bMCFlag and not bMTFlag:
                        if iCount == 0:
                            dataList = []
                            line1 = line.split()
                            matID = int(line1[0])
                            iCount = iCount + 1
                        elif not line.startswith('0.  .0'):
                            line1 = line.split()
                            dataList.append([float(line1[1]), float(line1[0])])
                        elif '0.  .0' == line or line.startswith('0.  .0'):
                            comFactDict[matID] = dataList
                            iCount = 0

    dd['NOT'] = dataListList
    print 'matDataDict',matDataDict
    print 'matCDict',matCDict
    print 'matTDict',matTDict
    print 'comFactDict',comFactDict
    print ',otherFList',otherFList
    return dd

Please find my solution above. Let me know whether this can be done in a more concise and better way.

Thanks
PSB
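
For comparison, the block structure under /MAT/ can probably be read with far fewer flags. A rough Python 3 sketch, assuming (as the description earlier in the thread suggests) that a line holding a single integer starts a sub-block with that ID, a line holding only `0` closes a group of sub-blocks, and an all-zero row such as `0.  .0` closes a sub-block. The names `strip_comment` and `parse_groups` are hypothetical, and the lines passed in are the ones that follow the /MAT/ keyword.

```python
def strip_comment(line):
    """Cut the line at the first '$' or '*', whichever comes first."""
    cut = len(line)
    for mark in ('$', '*'):
        pos = line.find(mark)
        if pos != -1:
            cut = min(cut, pos)
    return line[:cut].strip()

def parse_groups(lines):
    """Return a list of dicts, one per group, mapping block ID to its rows."""
    groups, current, block_id = [], {}, None
    for raw in lines:
        tokens = strip_comment(raw).split()
        if not tokens:
            continue
        if len(tokens) == 1:
            if tokens[0] == '0':
                # A lone 0 closes the current group.
                groups.append(current)
                current, block_id = {}, None
            else:
                block_id = int(tokens[0])
                current[block_id] = []
        else:
            row = [float(t) for t in tokens]
            # Assumption: an all-zero row is a terminator, never real data.
            if block_id is not None and any(row):
                current[block_id].append(row)
    return groups

mat_lines = [
    '1 $Begin',
    '100.    40.    30.    2.0    0 ****22 ksdas',
    '2',
    '200.    40.    60.    2.0    0 ****22 ksdas',
    '0 $End',
    '2 ***Values $Begin',
    '1000.  .1',
    '2000.  .2',
    '   0.  .0 $End',
    '3 ***Values $Begin',
    '3000.  .1',
    '   0.  .0 $End',
    '0 $End',
]
groups = parse_groups(mat_lines)
for group in groups:
    print(group)
```

Each group comes back as its own dictionary, so the first could be stored as the MAT data and the later ones under whatever names suit the application. Note that `float()` already accepts forms such as `100.` and `.1`, so no regular expression is needed for these rows.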

Jan 11 '08 #19

psbasha

440

256MB


 Output

matDataDict {1: [100, 40, 30, 2.0, 0], 2: [200, 40, 60, 2.0, 0], 3: [600, 40, 30, 5.0, 0], 4: [500, 40, 70, 2.0, 0]}

matCDict {2: [[1000.0, 0.10000000000000001], [2000.0, 0.20000000000000001], [3000.0, 0.29999999999999999], [4000.0, 0.59999999999999998]], 3: [[3000.0, 0.10000000000000001], [5000.0, 0.20000000000000001], [6000.0, 0.40000000000000002], [7000.0, 0.59999999999999998]]}

matTDict {2: [[1000.0, 0.10000000000000001], [2000.0, 0.5], [3000.0, 0.29999999999999999], [4000.0, 0.59999999999999998]], 3: [[13000.0, 0.90000000000000002], [45000.0, 0.20000000000000001], [56000.0, 0.29999999999999999], [87000.0, 0.59999999999999998]]}

comFactDict {2: [[0.0, 2.0], [0.20999999999999999, 2.0], [0.23499999999999999, 3.0]], 3: [[0.0, 2.0], [0.21099999999999999, 2.0], [0.23150000000000001, 3.0]]}

,otherFList [['11', '1', '1', '1', '69716.', '1000'], ['11', '1', '1', '5', '76296.', '1000'], ['31', '1', '1', '6', '74926.', '1000'], ['31', '1', '1', '7', '74653.', '1000']]

Ok = [11, 12, 1, 3]

MAT = []

LineThick = [0.10000000000000001]

TOl2 = [0.01]

TOl1 = [0.050000000000000003]

STOP = ['Line', 'and', 'Circle']

Top2 = [11]

Top1 = [10]

Bot4 = [40]

Bot1 = [20]

NOT = [[10, 11, 12, 1], [10, 11, 12, 2]]

Flag2 = [1]

Flag1 = [1]

Parameter_range = [1, 1]

Bot2 = [30]

DummyFlag1 = [1]

Jan 11 '08 #20

psbasha

440

256MB

BV,

Your suggestion is required.

Thanks
PSB

Jan 11 '08 #21

bvdet

2,851

Expert Mod 2GB

I don't have time to do it right now, as work deadlines are approaching. I will try to look at it later.

Jan 12 '08 #22
