Hi,
Is it necessary in Python to close a file after reading or writing data? While referring to some Python material, I saw it mentioned somewhere that there is no need to close the file. Correct me if I am wrong.
If possible, could anybody help me with sample code for reading and writing a simple text file? I have seen that there are many ways to read and write data in Python, but I want to use the most effective way of reading from and writing to a file.
Thanks in advance
PSB
bvdet 2,851
Expert Mod 2GB
This thread shows how to read and write data: http://www.thescripts.com/forum/thre...7166-1-10.html
There are several other threads on file I/O that I have participated in.
Python will close an open file in its garbage collection routine when the file object's reference is reassigned or its reference count drops to zero. It is good practice to close every file that is opened, especially when the file object was assigned to a name. This brings up a subject that has puzzled me. Open a file like this:

    lineLst = open('file_name').readlines()

Does Python close the file? The file object is never assigned to a name, so the file is closed when the end of file is reached (I think).
Yes, Python does close the file when the file object gets garbage collected.
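One way to sidestep the question is the `with` statement, which closes the file when the block exits, even if an error occurs inside it. Here is a minimal sketch in Python 3 syntax (the other snippets in this thread are Python 2); the file name `example.txt` and its contents are placeholders chosen only for illustration:

```python
import os
import tempfile

# Hypothetical file location, used only for this illustration.
path = os.path.join(tempfile.gettempdir(), "example.txt")

# Writing: the file is closed automatically when the block ends.
with open(path, "w") as f:
    f.write("first line\n")
    f.write("second line\n")

# Reading: iterate over the file object line by line.
with open(path) as f:
    lines = [line.rstrip("\n") for line in f]

print(lines)
print(f.closed)  # the with block has already closed the file
```

With this pattern there is no explicit close() call to forget, and it does not rely on when garbage collection happens.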
I have file data in this format:
Employee #   Employee Name   Salary   Location
---------------------------------------------------------------------------------------------
121111       Sam             10,000   NJ
121311       Paul            20,000   NY
111111       Jim             10,000   TX
The data is in an XLS file and we are copying it manually into a text file. After copying, the data is not laid out as we see it in the XLS file.
So how can I read this file data (without using slicing) and store it in the respective fields?
Could anybody provide a sample piece of code?
Thanks
PSB
bvdet 2,851
Expert Mod 2GB
You can save the Excel worksheet as a text file. The text file will be tab delimited, which can be easily parsed.

    """
    Read a tab delimited file
    """

    fn = 'your_file'

    f = open(fn, 'r')
    labelLst = f.readline().strip().split('\t')
    lineLst = []

    for line in f:
        if not line.startswith('#'):
            lineLst.append(line.strip().split('\t'))

    f.close()

    print labelLst
    print lineLst
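As an alternative to splitting on '\t' by hand, the standard library `csv` module can read tab-delimited text. The sketch below is in Python 3 syntax and assumes the export looks like the employee table from the question; `io.StringIO` stands in for a real file, and the variable names are illustrative:

```python
import csv
import io

# Stand-in for a tab-delimited export; io.StringIO replaces a real file.
sample = (
    "Employee #\tEmployee Name\tSalary\tLocation\n"
    "121111\tSam\t10,000\tNJ\n"
    "121311\tPaul\t20,000\tNY\n"
    "111111\tJim\t10,000\tTX\n"
)

reader = csv.reader(io.StringIO(sample), delimiter="\t")
labels = next(reader)            # first row holds the column labels
rows = [row for row in reader]   # remaining rows are the records

# Pair each value with its label so fields can be looked up by name.
records = [dict(zip(labels, row)) for row in rows]
print(records[0]["Employee Name"], records[0]["Salary"])
```

Looking fields up by label avoids both slicing and hard-coded column positions.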
Could anybody help me read this data? How do I separate the data on each line and read it?
SET 10 = 1101 1106 1107 1108 1109 1110 1111,
1112 1113 1114 1115 1116 1117 1118,
1119 1120 1121 1122 1123 1124 1125
If anybody could provide sample code for the above, it would be helpful.
Thanks in advance
PSB
bvdet 2,851
Expert Mod 2GB
    s = 'SET 10 = 1101 1106 1107 1108 1109 1110 1111,\n 1112 1113 1114 1115 1116 1117 1118,\n 1119 1120 1121 1122 1123 1124 1125'
    sList = s.split('=')
    label = sList[0].strip()
    data = sList[1].strip().split(',\n')
    datastr = ''.join(data)

    print '%s = %s' % (label, datastr)

    '''
    >>> SET 10 = 1101 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125
    '''
Thanks for the reply,
This is the input file :
$
$ SET 10
$
$ hjdsahclaladsalkjls
$PTITLE = SET 10 = SET_110
SET 10 = 1101 1106 1107 1108 1109 1110 1111,
1112 1113 1114 1115 1116 1117 1118,
1119 1120 1121 1122 1123 1124 1125
$ END OF SET 110
$
I have to get the SET number (10) and all the integers from 1101 to 1125. How can I read all of these integers?
Is that possible with the solution you provided? If so, what modifications have to be made to that piece of code?
bvdet 2,851
Expert Mod 2GB
This will give you a list of integers from the data string in my earlier post:

    map(int, datastr.split())
I understand the piece of code you posted. But how do I capture the data between the keywords "SET" and "END"? My program should be generic enough to read the data between these keywords:
SET 10 = 1101 1106 1107 1108 1109 1110 1111,
1112 1113 1114 1115 1116 1117 1118,
1119 1120 1121 1122 1123 1124 1125
$ END OF SET 110
Another way:

    >>> import re
    >>> data = open("file").read()
    >>> re.compile("SET 10 = (\d+.*)\$ END OF SET 110",re.M|re.DOTALL).findall(data)[0].replace("\n","").replace(","," ")
    '1101 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125'
    >>>
bvdet 2,851
Expert Mod 2GB
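Iterate on the file inside a function (the name read_set below is arbitrary). Untested:

    def read_set(f):
        # f is an open file object; the function name is arbitrary
        in_set = False
        s = ''
        for line in f:
            if line.startswith('SET'):
                in_set = True
                s = line
            elif line.startswith('$'):
                in_set = False
            elif in_set:
                s += line
        return s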
bvdet 2,851
Expert Mod 2GB
I like that ghostdog! :)
What will this piece of code do? I am not able to understand it.
I will just briefly explain, as my English is not good. I hope you will understand.
The 're' module is the regular expression module. More information can be found in the Python docs.
>>> data = open("file").read()
reads in the whole file as a string to be fed to the re module's findall() method.
>>> re.compile("SET 10 = (\d+.*)\$ END OF SET 110",re.M|re.DOTALL).findall(data)[0].replace("\n","").replace(","," ")
'1101 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125'
>>>
You wanted to find the numbers between "SET 10 =" and "$ END OF SET 110".
\d+ means find one or more digits. The .* after it matches everything that follows, so \d+.* picks up the rest of the numbers. By putting brackets around (\d+.*), findall() returns only that group.
re.compile() sets up the pattern that I want to find. re.M means multiline mode (it only changes how ^ and $ match, so it is not strictly needed here). re.DOTALL makes the "." match a newline. In other words, the "." in (\d+.*) will match newlines, because your numbers are split over several lines.
findall() does the searching and returns the results in a list. In the matched text there are redundant "\n" and "," characters, so I get rid of them with replace().
Anyway, this is just another method. I suggest you use bvdet's method if you are not familiar with regular expressions.
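To see the effect of re.DOTALL in isolation, here is a small sketch in Python 3 syntax. It uses a shortened copy of the input from the question as an in-memory string instead of a file, and the variable names are illustrative:

```python
import re

# Sample text copied from the question above.
data = (
    "$PTITLE = SET 10 = SET_110\n"
    "SET 10 = 1101 1106 1107 1108 1109 1110 1111,\n"
    "1112 1113 1114 1115 1116 1117 1118,\n"
    "1119 1120 1121 1122 1123 1124 1125\n"
    "$ END OF SET 110\n"
)

pattern = r"SET 10 = (\d+.*)\$ END OF SET 110"

# Without DOTALL, "." stops at the first newline, so nothing matches.
print(re.findall(pattern, data))

# With DOTALL, "." also matches newlines and the group spans all three lines.
block = re.findall(pattern, data, re.DOTALL)[0]
numbers = [int(n) for n in block.replace(",", " ").split()]
print(numbers[0], numbers[-1], len(numbers))
```

Note that the "$PTITLE = SET 10 = SET_110" line is skipped in both cases, because "SET_110" does not start with a digit.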
Can the above piece of code be made generic enough to read the following input data?
SET 10 = 1101 1106 1107 1108 1109 1110 1111,
1112 1113 1114 1115 1116 1117 1118,
1119 1120 1121 1122 1123 1124 1125
$ END OF SET 10
SET 11 = 11031 11036 11037 11038 11040 11050 11051,
11052 11053 11054 11055 11056 11057 11058,
$ END OF SET 11
SET 15 = 110131 110136 110137 110138 110410 110510 110511,
110512 110513 110514 110515 110516 110517 110518,
$ END OF SET 15
.................
Sure. One way:

    >>> import re
    >>> data = open("file").read()
    >>> pat = re.compile("SET \d+ = (\d+.*?)\$ END OF SET",re.M|re.DOTALL)
    >>> for result in pat.findall(data):
    ...     print result.replace("\n","").replace(","," ")
    ...
    1101 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125
    11031 11036 11037 11038 11040 11050 11051 11052 11053 11054 11055 11056 11057 11058
    110131 110136 110137 110138 110410 110510 110511 110512 110513 110514 110515 110516 110517 110518
    >>>
bvdet 2,851
Expert Mod 2GB
ghostdog has provided an excellent solution. Here is another, less elegant one:

    ### READ FILE DATA

    ''' File Data
    $
    $ SET 10
    $
    $ hjdsahclaladsalkjls
    $PTITLE = SET 10 = SET_110
    SET 10 = 1101 1106 1107 1108 1109 1110 1111,
    1112 1113 1114 1115 1116 1117 1118,
    1119 1120 1121 1122 1123 1124 1125
    $ END OF SET 110
    $
    $

    SET 11 = 11031 11036 11037 11038 11040 11050 11051,
    11052 11053 11054 11055 11056 11057 11058,
    $ END OF SET 11
    $
    $
    SET 15 = 110131 110136 110137 110138 110410 110510 110511,
    110512 110513 110514 110515 110516 110517 110518,
    $ END OF SET 15
    $
    $
    '''

    def file_data(s):
        outStr = ''
        in_set = False
        for line in s:
            if line.startswith('SET'):
                in_set = True
                outStr += line.strip('\n').strip(',')
            elif 'END OF SET' in line:
                in_set = False
                outStr += '\n'
            elif in_set:
                outStr += ' ' + line.strip('\n').strip(',')
        return outStr.strip()

    data = file_data(open('your_file').readlines())
    print data, '\n'
    dataDict = {}
    for line in data.strip().split('\n'):
        dataDict[line.split('=')[0].strip()] = map(int, line.split('=')[1].strip().split())
    for key in dataDict:
        print '%s = %s' % (key, dataDict[key])

    '''>>> SET 10 = 1101 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125
    SET 11 = 11031 11036 11037 11038 11040 11050 11051 11052 11053 11054 11055 11056 11057 11058
    SET 15 = 110131 110136 110137 110138 110410 110510 110511 110512 110513 110514 110515 110516 110517 110518

    SET 15 = [110131, 110136, 110137, 110138, 110410, 110510, 110511, 110512, 110513, 110514, 110515, 110516, 110517, 110518]
    SET 11 = [11031, 11036, 11037, 11038, 11040, 11050, 11051, 11052, 11053, 11054, 11055, 11056, 11057, 11058]
    SET 10 = [1101, 1106, 1107, 1108, 1109, 1110, 1111, 1112, 1113, 1114, 1115, 1116, 1117, 1118, 1119, 1120, 1121, 1122, 1123, 1124, 1125]
    >>>
    >>> sum(dataDict['SET 10'])
    23411
    >>>
    '''
bvdet 2,851
Expert Mod 2GB
    data = file_data(open('H:/TEMP/temsys/strdata.txt').readlines())

    dataDict = {}
    for line in data.strip().split('\n'):
        dataDict[line.split('=')[0].strip()] = [int(x) for x in line.split('=')[1].strip().split()]
    for key in dataDict:
        print '%s = %s' % (key, dataDict[key])
In the above snippet, 'map' has been replaced by a list comprehension.
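For readers on Python 3, the difference matters a little more: there map() returns a lazy iterator rather than a list, so it has to be wrapped in list() to get the same result as the list comprehension. A small sketch, using one line of the sample data and illustrative variable names:

```python
line = "SET 11 = 11031 11036 11037"
label, values = line.split("=", 1)

with_map = list(map(int, values.split()))      # map() is lazy in Python 3
with_comp = [int(x) for x in values.split()]   # list comprehension

print(label.strip(), with_map == with_comp, with_comp)
```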
Thanks for the reply. Sometimes the input data is given in this format:
''' File Data
$
$ SET 10
$
$ hjdsahclaladsalkjls
$PTITLE = SET 10 = SET_110
SET 10 = 1101 1106 1107 1108 1109 1110 1111,
1112 1113 1114 1115 1116 1117 1118,
1119 1120 1121 1122 1123 1124 1125
$ END OF SET 110
$
$
SET 11 = 11031 11036 11037 11038 11040 11050 11051,
11052 11053 11054 11055 11056 11057 11058,
$ END OF SET 11
$
$
SET 15 = 110131 110136 110137 110138 110410 110510 110511,
110512 110513 110514 110515 110516 110517 110518,
$ END OF SET 15
$
$
SET 1 = 1 THRU 897
$HMSET
SET 2 = 1 THRU 932
$HMSET
How can I handle the above input generically?
-PSB
What have you done so far?
bvdet 2,851
Expert Mod 2GB
    if line.startswith('SET'):
        if not re.findall("[^0-9 ,\n]", line.split('=')[1]):
            .....................................

Anything else?
The solution you posted really helped me. Now I am trying to implement it for the other lines which appear in the input file:
SET 1 = 1 THRU 897
$HMSET
SET 2 = 1 THRU 932
$HMSET
I am working on this. I will post what I have done so far tomorrow. Different developers have their own ways of reading file data, but I want optimized code with fewer lines, so I am looking to the forum.
I am not sure whether the approach I am following is tedious or roundabout.
If anybody has an idea for a better way, that will help me.
Thanks in advance
PSB
Both outputs should be "1 THRU 897" and "1 THRU 932", i.e. what is between SET and $HMSET, right? You don't want the SET ... and $HMSET ... parts?
I want the values 1 and 897 from "1 THRU 897" and 1 and 932 from "1 THRU 932".
So the output should be [1,897] and [1,932], in addition to the output from the previous result.
Not to be overly complicated with regexp, you can try this:

    import re
    data = open("file").read()
    pat = re.compile("SET \d+ = (\d+.*?)(?:\$ END OF SET| THRU (\d+.*?))",re.M|re.DOTALL)
    for result in pat.findall(data):
        print result ##do your manipulations here.
    """ def read_Sets_file_data(self,strSetsFile):

        fSets = open(strSetsFile,'r')
        strTemp = fSets.readlines()
        elementList = []

        outStr = ''
        bFlag = False
        startVal =0
        endVal = 0

        ### Yet to implement the "THRU" elements reading
        for line in strTemp:
            if line.startswith('SET'):
                bFlag = True
                outStr += line.strip('\n').strip(',')
                labelLst = line.strip().split(" ")
                for i in range(0,labelLst.__len__()):
                    if ( labelLst[i] == '=' and labelLst[i+1].isdigit()) :
                        startVal = labelLst[i+1]

                    if(labelLst[i].isalnum()):
                        if( labelLst[i] == "THRU"):
                            endVal = labelLst[i+1]

                    #print startVal,endVal
                    if( int(startVal) > 0 and int(endVal) > 0):
                        list1 = self.get_THRU_elements(startVal,endVal)
                        print list1
                        break

            elif 'END OF SET' in line:
                bFlag = False
                outStr += '\n'
            elif bFlag:
                outStr += ' ' + line.strip('\n').strip(',')

        data = outStr.strip()
        #print data"""

        dataDict = {}

        for line in data.strip().split('\n'):
            dataDict[line.split('=')[0].strip()] = map(int, line.split('=')[1].strip().split())

        #return (dataDict)
        #return {}

I have done the above. Please correct me if I have done it the wrong way. I have yet to complete the iteration over the SETS file; I am reading only one set so far, but I am looking for a generic solution.
bvdet 2,851
Expert Mod 2GB
This problem is similar to a function I wrote recently to extract data from XML files. Instead of going over your code, it's much easier for me to post the following code (hopefully it will do what you want):

    def file_data(s):
        outStr = ''
        in_set = False
        for line in s:
            if line.startswith('SET'):
                if 'THRU' in line:
                    in_set = False
                    outStr += line.replace('THRU ', '').strip('\n').strip(',')+'\n'
                else:
                    in_set = True
                    outStr += line.strip('\n').strip(',')
            elif 'END OF SET' in line:
                in_set = False
                outStr += '\n'
            elif in_set:
                outStr += ' ' + line.strip('\n').strip(',')
        dataDict = {}
        for line in outStr.strip().split('\n'):
            dataDict[line.split('=')[0].strip()] = [int(x) for x in line.split('=')[1].strip().split()]
        return dataDict

    dd = file_data(open('your_file').readlines())
    for key in dd:
        print '%s = %s' % (key, dd[key])
Sample.txt

$
$ SET 10
$
$ hjdsahclaladsalkjls
$PTITLE = SET 10 = SET_110
SET 10 = 1101 1106 1107 1108 1109 1110 1111,
1112 1113 1114 1115 1116 1117 1118,
1119 1120 1121 1122 1123 1124 1125
$ END OF SET 110
$
$

SET 11 = 11031 11036 11037 11038 11040 11050 11051,
11052 11053 11054 11055 11056 11057 11058,
$ END OF SET 11
$
$
SET 15 = 110131 110136 110137 110138 110410 110510 110511,
110512 110513 110514 110515 110516 110517 110518,
$ END OF SET 15
$
$
SET 1 = 1 THRU 897
$HMSET
SET 2 = 1 THRU 932

SET 102 = 1001323 THRU 1001331,1001343 THRU 1001349,
1001359 THRU 1001365,1001375 THRU 1001381,
1001391 THRU 1001397,1001407 THRU 1001413,
1001415 THRU 1001429,1001439 THRU 1001445,
1001455 THRU 1001461,1001471 THRU 1001477,
1001479 THRU 1001490,1001500 THRU 1001506,
1001516 THRU 1001522,1001532 THRU 1001538,
1001540 THRU 1001554,1001564 THRU 1001570,
1001580 THRU 1001586,1001596 THRU 1001602,
1001612 THRU 1001618,1001620 THRU 1001634,
1001644 THRU 1001650,1001660 THRU 1001666,
1009990,1009992,1009994,1009996,1009998,1010000,1010002,
1010004,1010006,1010009,1010010,1010012,1010014,
1010066 THRU 1010081

Could anybody help me make this more generic?
One more thing I forgot to mention in my previous discussion: for SET 1 = 1 THRU 897, when I come across the word THRU, I have to take the values before and after THRU, i.e. 1 and 897, and create a list of numbers for this range, say:

    dataList = []
    for i in range(1, 898):
        dataList.append(i)

O/P: the list of all the numbers
[1,2,.......896,897]
Thanks
PSB
say you have already gotten the values.
you can just use
this will create a list for you
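The snippet from this reply does not appear above. One possibility, assuming the two endpoints have already been extracted as strings, is range(); note that the upper bound needs + 1 so that the last value is included. This is a sketch in Python 3 syntax, and the names `start`, `end` and `data_list` are illustrative only:

```python
start, end = "1", "897"   # values taken from "1 THRU 897"

# range() excludes its upper bound, hence the + 1.
data_list = list(range(int(start), int(end) + 1))

print(data_list[:3], data_list[-1], len(data_list))
```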
Thanks for the solution.
But the code fails to get the set numbers for the file format I mentioned above when the keyword "THRU" appears between the numbers. I am looking for a more generic way of reading this sets file that handles the different set formats above.
Can anybody help me fix the above code so that it reads the different sets file formats mentioned above more generically?
-PSB
You have quite some experience with Python now, so I will just give you some ideas and you do the rest. Simple string manipulations will eventually get you what you want.
One idea for a sample line with THRU:

    >>> a = "1001323 THRU 1001331,1001343 THRU 1001349"
    >>> a.split(",")
    ['1001323 THRU 1001331', '1001343 THRU 1001349']
    >>> for items in a.split(","):
    ...     print items.split("THRU")
    ...
    ['1001323 ', ' 1001331']
    ['1001343 ', ' 1001349']

You said you want the numbers on the left and right of THRU, right? The above seems to get what you want. Of course, for other redundant words on the line, you can just use replace(), string slices, etc. to get rid of them.
bvdet 2,851
Expert Mod 2GB
I have another suggestion:

    import re

    line = '1001612 THRU 1001618,1001620 THRU 1001634, 1001644 THRU 1001650,1001660 THRU 1001666, 1009990,1009992,1009994,1009996,1009998,1010000,1010002'
    outStr = ''
    if 'THRU' in line:
        lineList = re.findall('\d+ THRU \d+|\d+', line)
        lst = []
        for item in lineList:
            if 'THRU' in item:
                tem = item.split(' THRU ')
                lst += range(int(tem[0]), int(tem[1])+1)
            else:
                lst.append(int(item))
        outStr += ' '.join([str(i) for i in lst if i != '']) + ' '
    else:
        outStr += ' ' + line.strip('\n').strip(',')

    '''
    >>> outStr
    1001612 1001613 1001614 1001615 1001616 1001617 1001618 1001620 1001621 1001622 1001623 1001624 1001625 1001626 1001627 1001628 1001629 1001630 1001631 1001632 1001633 1001634 1001644 1001645 1001646 1001647 1001648 1001649 1001650 1001660 1001661 1001662 1001663 1001664 1001665 1001666 1009990 1009992 1009994 1009996 1009998 1010000 1010002
    '''

It's not very pretty though. You should be able to do the rest from here.
Looking only at this set of input data provided by the OP (though the whole file may not be the same):

SET 1 = 1 THRU 897
$HMSET
SET 2 = 1 THRU 932
SET 102 = 1001323 THRU 1001331,1001343 THRU 1001349,
1001359 THRU 1001365,1001375 THRU 1001381,
1001391 THRU 1001397,1001407 THRU 1001413,
1001415 THRU 1001429,1001439 THRU 1001445,
1001455 THRU 1001461,1001471 THRU 1001477,
1001479 THRU 1001490,1001500 THRU 1001506,
1001516 THRU 1001522,1001532 THRU 1001538,
1001540 THRU 1001554,1001564 THRU 1001570,
1001580 THRU 1001586,1001596 THRU 1001602,
1001612 THRU 1001618,1001620 THRU 1001634,
1001644 THRU 1001650,1001660 THRU 1001666,
1009990,1009992,1009994,1009996,1009998,1010000,1010002,
1010004,1010006,1010009,1010010,1010012,1010014,
1010066 THRU 1010081

this little piece of code will get what he wants (if I didn't interpret wrongly):

    import re
    data = open("file").read()
    pat = re.compile("(\d+) THRU (\d+)",re.M|re.DOTALL)
    for items in pat.findall(data):
        print items

Output:

    # ./test.py
    ('1', '897')
    ('1', '932')
    ('1001323', '1001331')
    ('1001343', '1001349')
    ('1001359', '1001365')
    ('1001375', '1001381')
    ('1001391', '1001397')
    ('1001407', '1001413')
    ('1001415', '1001429')
    ('1001439', '1001445')
    ('1001455', '1001461')
    ('1001471', '1001477')
    ('1001479', '1001490')
    ('1001500', '1001506')
    ('1001516', '1001522')
    ('1001532', '1001538')
    ('1001540', '1001554')
    ('1001564', '1001570')
    ('1001580', '1001586')
    ('1001596', '1001602')
    ('1001612', '1001618')
    ('1001620', '1001634')
    ('1001644', '1001650')
    ('1001660', '1001666')
    ('1010066', '1010081')
bvdet 2,851
Expert Mod 2GB
Taking it a step farther:

    import re

    def getThruData(s):
        sList = re.findall('\d+ THRU \d+|\d+', s)
        for item in sList:
            if 'THRU' in item:
                tem = item.split(' THRU ')
                for num in range(int(tem[0]), int(tem[1])+1):
                    yield num
            else:
                yield int(item.strip())

    >>> s
    '1001359 THRU 1001365,1001375 THRU 1001381,\n1010004,1010006,1010009,1010010,1010012,1010014,1010066 THRU 1010081'
    >>> sList = [i for i in getThruData(s)]
    >>> sList
    [1001359, 1001360, 1001361, 1001362, 1001363, 1001364, 1001365, 1001375, 1001376, 1001377, 1001378, 1001379, 1001380, 1001381, 1010004, 1010006, 1010009, 1010010, 1010012, 1010014, 1010066, 1010067, 1010068, 1010069, 1010070, 1010071, 1010072, 1010073, 1010074, 1010075, 1010076, 1010077, 1010078, 1010079, 1010080, 1010081]
    >>>
....,1009990,1009992,1009994,1009996,1009998,1010000,1010002,
1010004,1010006,1010009,1010010,1010012,1010014,.....
I also need these numbers, which are not part of a THRU range, to be handled while reading the data.
Thanks
PSB
Up until now, do you actually have some code yet? Show us what you did that left you unable to handle these numbers.
bvdet 2,851
Expert Mod 2GB
My post 'Taking it a step farther' extracts all the numbers. If you had bothered to look at my post, you could see that. I thought it was pretty neat. Let me show you AGAIN:

    >>> s1 = '1009990,1009992,1009994,1009996,1009998,1010000,1010002,1010004,1010006,1010009,1010010,1010012,1010014,10010016 THRU 10010035'
    >>> for i in getThruData(s1):
    ...     print i
    ...
    1009990
    1009992
    1009994
    1009996
    1009998
    1010000
    1010002
    1010004
    1010006
    1010009
    1010010
    1010012
    1010014
    10010016
    10010017
    10010018
    10010019
    10010020
    10010021
    10010022
    10010023
    10010024
    10010025
    10010026
    10010027
    10010028
    10010029
    10010030
    10010031
    10010032
    10010033
    10010034
    10010035
    >>>
Please note that ALL the numbers are in the output. We have done most of the work. You don't expect us to write the total solution, do you?
Thanks to all for providing solutions for the different file formats.
No BV, I don't expect you to give the whole solution for the problem. I would just like to know the approach/concept.
Thanks BV. You are really a guru to us in the forum.
- PSB
bvdet 2,851
Expert Mod 2GB
The approach I used:
1. Read all lines into a list, initialize 'outStr', and iterate on the list.
2. Look for keyword 'SET' and set variable 'in_set' = True.
3. If 'THRU' is in line, split on '=', send the right side to getThruData(), put back together and append (by concatenation) to 'outStr' - otherwise just append to 'outStr'.
4. While 'in_set' is True, append 'line' to 'outStr' (send to getThruData() if 'THRU' is in 'line') until a condition is seen that indicates the end of the set. Set variable 'in_set' to False and append a newline to 'outStr'.
5. Repeat until the end of the file.
6. Create the dictionary from 'outStr', splitting on '\n'.
It seems simple when described like this. HTH :)
Sorry BV, I typed my message wrongly.
What I meant is that I understood the concept you explained in the earlier discussion.
-PSB
bvdet 2,851
Expert Mod 2GB
You probably solved this problem by now. Here's what I came up with in my spare time:
import re


def getThruData(s):
    # Yield every integer in s, expanding 'a THRU b' into a..b inclusive.
    sList = re.findall(r'\d+ THRU \d+|\d+', s)
    for item in sList:
        if 'THRU' in item:
            tem = item.split(' THRU ')
            for num in range(int(tem[0]), int(tem[1])+1):
                yield num
        else:
            yield int(item.strip())


def file_data(s):
    # Build one 'name=numbers' line per set in outStr, then convert to a dict.
    outStr = ''
    in_set = False
    for line in s:
        line = line.replace(',', ' ')
        if line.startswith('SET'):
            in_set = True
            if 'THRU' in line:
                lineList = line.strip().split('=')
                lst = [i for i in getThruData(lineList[1])]
                outStr += '%s=%s ' % (lineList[0], ' '.join([str(i) for i in lst]))
            else:
                outStr += line.strip('\n,')
        elif (line.startswith('$') or 'END OF SET' in line or line == '\n') and in_set:
            # End of the current set: finish its line in outStr.
            in_set = False
            outStr += '\n'
        elif in_set:
            if 'THRU' in line:
                lst = [i for i in getThruData(line)]
                # Leading space keeps these numbers apart from the previous line's last number.
                outStr += ' ' + ' '.join([str(i) for i in lst]) + ' '
            else:
                outStr += ' ' + line.strip('\n,')
    dataDict = {}
    for line in outStr.strip().split('\n'):
        dataDict[line.split('=')[0].strip()] = [int(x) for x in line.split('=')[1].strip().split()]
    return dataDict


# Close the file explicitly instead of relying on garbage collection.
f = open('H:/TEMP/temsys/strdata.txt')
try:
    dd = file_data(f.readlines())
finally:
    f.close()
for key in dd:
    print('%s = %s' % (key, dd[key]))
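For anyone who wants to try the approach without the original data file, here is a minimal Python 3 sketch of the same idea. The names expand_thru and parse_sets and the sample lines are hypothetical; the real file layout is only assumed from the markers the code above checks for ('SET', '$', 'END OF SET', blank line).

```python
import re

# Matches either a range ("10 THRU 14") or a single integer.
TOKEN = re.compile(r'(\d+)\s+THRU\s+(\d+)|(\d+)')


def expand_thru(text):
    """Yield every integer in text, expanding 'a THRU b' into a..b inclusive."""
    for start, stop, single in TOKEN.findall(text):
        if single:
            yield int(single)
        else:
            yield from range(int(start), int(stop) + 1)


def parse_sets(lines):
    """Collect the numbers of each 'SET <name> = ...' block into a dict."""
    sets = {}
    current = None  # name of the set being read, or None outside a set
    for line in lines:
        line = line.strip()
        if line.startswith('SET'):
            name, _, rest = line.partition('=')
            current = name.strip()
            sets[current] = list(expand_thru(rest))
        elif not line or line.startswith('$') or 'END OF SET' in line:
            current = None  # blank line, '$' line or end marker closes the set
        elif current is not None:
            sets[current].extend(expand_thru(line))  # continuation line
    return sets


# Hypothetical sample data, standing in for the lines of the real file.
sample = [
    'SET 1 = 1009990,1009992,\n',
    '1010004,1010006,10010016 THRU 10010019\n',
    '$ this line ends the set\n',
    'SET 2 = 5 THRU 8, 12\n',
    '\n',
]

result = parse_sets(sample)
for name, numbers in result.items():
    print(name, '=', numbers)
```

Appending to a list per set avoids building one long string and splitting it again, but the string-based version above works just as well for files of this size.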
Yes, from the user's point of view it is necessary to close the file,
because closing explicitly forces the buffer to be written to the
disk. Otherwise data can sometimes be lost, because write operations
act on the buffer, not directly on the disk.
I will send you a program for reading and writing after some time, because
typing the code takes a lot of time (sorry).
Bye.
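A small Python 3 sketch of the buffering point, and of the simple read/write pattern asked for at the start of the thread. The file name demo.txt and the temporary directory are only placeholders, and the size seen before close() depends on the buffer settings, so treat it as an illustration rather than a guarantee.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

f = open(path, 'w')
f.write('kevin\n')                          # goes into the in-memory buffer first
size_before_close = os.path.getsize(path)   # usually 0: nothing flushed yet
f.close()                                   # close() flushes the buffer to disk
size_after_close = os.path.getsize(path)

# Preferred form: the with block closes the file even if an error occurs.
with open(path, 'a') as f:
    f.write('ryan\n')

with open(path) as f:
    lines = f.readlines()

print(size_before_close, size_after_close, lines)
```

With the with statement there is no need to call close() yourself, so the data is flushed as soon as the block ends.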