473,394 Members | 1,749 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,394 software developers and data experts.

newb comment request

Hi,

Im a newb to dev and python... my first sefl assigned mission was to read a
pickled file containing a list with DB like data and convert this to
MySQL... So i wrote my first module which reads this pickled file and writes
an XML file with list of tables and fields (... next step will the module
who creates the tables according to details found in the XML file).

If anyone has some minutes to spare, suggestions and comments would be verry
much appreciated to help me make my next modules better.. and not starting
with bad habit :)

Alexandre

######################## <EIPyFormatToXML>
######################################
#pickle.loads a file containing database like data in a python list ->
#Outputs a XML file with "tables" details which will be used in a future
module
#to build MySQL tables. See the comment blocks at the end of the module
#for more details.

import sys
import pickle
import types

infile = open('cached-objects-Python-pickled-sample', 'rb')
_data = pickle.load(infile)
infile.close()

def ExtractTablesFromData(data):
"""Extracts all the table names from the Dumped items data file and
returns the list."""
tablesR = []
for tables in data:#For each tables found in 'data'
tablesR.append([tables[0]])#Appends the list with current table name
return tablesR

def ExtractFieldNamesFromData(data):
"""Extract all fields from data list (the calling function defines for
which table in 'data' argument)."""
fieldsR = []
for fields in data:
fieldsR.append([fields])
return fieldsR

def ExtractFieldValuesFromData(data, indexField):
"""Check each value of the field passed as argument to the function, it
returns [valueType, maxLength, maxValue, minValue, floatPrecision,
NoneValues(bool), sampleValue]."""
valueType, maxLength, maxValue, minValue, values , floatPrecision,
NoneValues = None, None, None, 999999999999, [], None, False
sampleValue = 'numeric value, check min and max values as sample'
for valuesD in data:#for each record...
value = valuesD[indexField]#focus value on required field
if type(value) is not types.NoneType:#if a value other than None is
found
valueType = type(value)#valueType is given the type of the
current value
else:#... if the value is None
NoneValues = True#None values exist for this field in the record
list
if valueType is str:#if type is string
minValue = None#set minValue to None (minValue and maxValue are
only for numeric types)
if len(value) > maxLength:#if current string lenght is bigger
than maxLength
maxLength = len(value)#Set maxLength value to current string
length
sampleValue = value#Sets sampleValue with the longest string
found
else:#... if not string type
if value > maxValue:#if current value bigger than maxValue
maxValue = value#Sets current value to maxValue
if value and value < minValue:#if value is not None AND smaller
than minValue
minValue = value#Sets new minValue with current value
if valueType is float and value != 0:#if value type is float and
not 0
precisionTemp = len(str(value - int(value)))-2
if precisionTemp > floatPrecision:#if the current length
after decimal point is bigger than previous
floatPrecision = precisionTemp#set current value to
precision
if valueType is float and floatPrecision == None:#if float could not be
determined because only 0.0 values were found
floatPrecision = 1#set precision to 1
if valueType is not float and floatPrecision != None:#if last value type
was not float but some float values were found
valueType = type(1.234)#set valueType to float
if valueType is str and maxLength == 0:#if value type found only ''
(empty) records
NoneValues = True#allow null values
if minValue == 999999999999:#if minValue was not set
minValue = None#then minValue is None
values[:] = [valueType, maxLength, maxValue, minValue, floatPrecision,
NoneValues, sampleValue]
return values

def AddFieldsPerTable():
"""Appends field list to each table."""
tables = ExtractTablesFromData(_data) #First extract list of tables
for i, table in enumerate(tables): #Then for each table in the list
fields = ExtractFieldNamesFromData(_data[i][1][0])#get field list
([i] as table index, [1][0] to reach field list)
tables[i].append(fields) #Appends the returned field list to current
table
return tables

def AddFieldsDetailsPerField():
"""Extend field list with details for each field."""
tables = AddFieldsPerTable()#First get table list
for iTable, table in enumerate(tables):#Then for each table
for iField, field in enumerate(table[1]):#...for each field in the
current table
values = ExtractFieldValuesFromData(_data[iTable][1][1],
iField)#Get field's details([iTable] as table index, [1][1] to reach records
list, iField to focus search on current field)
field.extend(values)#Extends the tables list with returned field
details
return tables

def AddNbOfRecordsPerTable():#Insert number of records per table.
"""Extend 'tables' details with number of records per table."""
tables = AddFieldsDetailsPerField()#get tables
for i, table in enumerate(tables):#for each table
nbOfRecords = len(_data[i][1][1])#get number of records ([i]=table
index, [1][1] = record list)
table.insert(1, nbOfRecords)#inserts the number of records in tables
list
return tables

def WriteFileTableFormat(fileName):#Creates the XML with 'tables' list
tables = AddNbOfRecordsPerTable()#get tables detailed list
f = open(fileName, 'w')
f.write("""<?xml version="1.0" encoding="ISO-8859-1"?>\n""")
f.write("<Root>\n")

for table in tables:
f.write("\t<table>\n")
f.write("\t\t<name>%s</name>\n" % table[0])
f.write("\t\t<nbOfRecords>%s</nbOfRecords>\n" % table[1])
for field in table[2][:]:
f.write("\t\t<field>\n")
f.write("\t\t\t<name>%s</name>\n" % field[0])
if str(field[1])[:7] == "<type '":
field[1] = str(field[1])[7:-2]
f.write("\t\t\t<pythonType>%s</pythonType>\n" % str(field[1]))
f.write("\t\t\t<maxLength>%s</maxLength>\n" % str(field[2]))
f.write("\t\t\t<maxValue>%s</maxValue>\n" % str(field[3]))
f.write("\t\t\t<minValue>%s</minValue>\n" % str(field[4]))
f.write("\t\t\t<floatPrecision>%s</floatPrecision>\n" %
str(field[5]))
f.write("\t\t\t<NoneValues>%s</NoneValues>\n" % str(field[6]))
f.write("\t\t\t<sampleValue>%s</sampleValue>\n" % str(field[7]))
f.write("\t\t\t<mysqlFieldType></mysqlFieldType>\n")
f.write("\t\t</field>\n")
f.write("\t</table>\n")

f.write("</Root>")
f.close

WriteFileTableFormat('EITablesFormat.xml')

############ <Help to understand '_data' structure>
#
# [['FirstTableName', (['FirstFieldName', 'nFieldName'],
[['FirstFieldFirstValue', 'nFieldFirstValue'],
# ['FirstFieldnValue', 'nFieldnValue']])], ['nTableName', (['etc..
# print _data[0][0] #[0]=FirstTable, [0]=TableName -> output :
'FirstTableName'
# print len(_data) #number of tables in 'data'
# print _data[0][1] #[0]=FirstTable, [1]=FieldList And Records
# print _data[0][1][0] #[0]=FirstTable, [1]=FieldList, [0]=FieldNames ->
output : ['FirstFieldName', 'nFieldName']
# print len(_data[0][1][0]) #number of fields in first table
# print _data[0][1][1] #[0]=FirstTable, [1]=FieldList, [1]=RecordList
# print len(_data[0][1][1]) #number of records in first table
# print _data[0][1][1][0][2] #[0]=firstTable, [1]=FieldList,
[1]=RecordList, [0] = First Record, [2]=Third Field Value
#
######################## </Help to understand '_data' structure>
############ <Final 'tables' variable format>
#
# The final 'tables' format used to build the XML should look like :
# ([tablename_1, nbOfRecords
# [
# [fieldname_1, pythonType, maxLength, maxValue, minValue,
floatPrecision, NoneValues, sampleValue],
# [fieldname_1, pythonType, maxLength, maxValue, minValue,
floatPrecision, NoneValues, sampleValue]
# ],
# [tablename_n,
# [
# [fieldname_1, ...]
# ]
# ])
#
######################## </Final 'tables' variable format>
#################################### </EIPyFormatToXML>
Jul 18 '05 #1
5 2008
Alexandre wrote:
Hi,

Im a newb to dev and python... my first sefl assigned mission was to read
a pickled file containing a list with DB like data and convert this to
MySQL... So i wrote my first module which reads this pickled file and
writes an XML file with list of tables and fields (... next step will the
module who creates the tables according to details found in the XML file).

If anyone has some minutes to spare, suggestions and comments would be
verry much appreciated to help me make my next modules better.. and not
starting with bad habit :)


I would suggest that you repost the script without the excessive comments.
Most programmers find Python very readable, your comments actually make it
harder to parse for the human eye.
Also, make it work if you can, or point to the actual errors that you cannot
fix yourself. Provide actual test data in your post instead of the
unpickling code, so that others can easily reproduce your errors.
Only then you should ask for improvements.

Random remarks:

Object oriented programming is about programming against interfaces, so
exessive type checking is a strong hint to design errors.

Avoid using global variables in your functions; rather pass them explicitly
as arguments.

Where is valuesD introduced?

"%s" % str(1.23)
is the same as
"%s" % 1.23

type(value) is not types.NoneType
is the same as
value is not None

Module level code is habitually wrapped like so:

if __name__ == "__main__":
infile = open('cached-objects-Python-pickled-sample', 'rb')
_data = pickle.load(infile)
infile.close()
# process _data...

That way, you can import the module as well as use it as a standalone
script.

I'm sure, there is more, but then again, clean up the comments, try to make
it work, and then repost.

Peter
Jul 18 '05 #2

"Peter Otten" <__*******@web.de> a écrit dans le message de news:bq*************@news.t-online.com...
I would suggest that you repost the script without the excessive comments.
Most programmers find Python very readable, your comments actually make it
harder to parse for the human eye.
Done, so your help will also make my next posts better :)
Also, make it work if you can, or point to the actual errors that you cannot
fix yourself. Provide actual test data in your post instead of the
unpickling code, so that others can easily reproduce your errors.
Only then you should ask for improvements.
Also available in my new version :)
Random remarks:

Object oriented programming is about programming against interfaces, so
exessive type checking is a strong hint to design errors.
I'm not sure i understand... well, let's put it that way : i'm sure i don't understand :)
I guess i'll have to read about OO design... this is my first program, although i think i understand what OO means, i'm
not abble to write OO yet :/
Avoid using global variables in your functions; rather pass them explicitly
as arguments.
I'll try that !
Where is valuesD introduced?
Not sure i understand the question... i used this variable name "ValuesD" (meaning Values from Data) not to conflict
with the other variable named "values" in the same function.
"%s" % str(1.23)
is the same as
"%s" % 1.23
Ok

type(value) is not types.NoneType
is the same as
value is not None
Shame on me :)
Module level code is habitually wrapped like so:

if __name__ == "__main__":
infile = open('cached-objects-Python-pickled-sample', 'rb')
_data = pickle.load(infile)
infile.close()
# process _data...

That way, you can import the module as well as use it as a standalone
script.
Oh ? but, i can use it as standalone script ?!
If i type myScriptName.py in my dos prompt the script is working !
I'm sure, there is more, but then again, clean up the comments, try to make
it work, and then repost.


I'm sure there's more :)
Thanks a lot for your comments Peter !
Best regards,
Alexandre
Jul 18 '05 #3
######################## <EIPyFormatToXML> ######################################
import sys
import types

_data = [['Table1',(['Field01','Field02','Field03',],
[['a string', 12345, 1.000123],
['a second string', None, 3406.3],
['', 64654564, 35]])],
['Table2', (['Field04', 'Field05'], [[None, -0.3]])],
['Table3',(['Field06', 'Field07', 'Field08'],
[['', 0, 0.001],
['', None, 646464.0],
['', 6546, 0.1],
['', -6444, 0.2],
['', 0, 0.3]])]]

def ExtractTablesFromData(data):
"""Extracts all the table names from the Dumped items data file and returns the list."""
tablesR = []
for tables in data:
tablesR.append([tables[0]])
return tablesR

def ExtractFieldNamesFromData(data):
"""Extract all fields from data list (the calling function defines for which table in 'data' argument)."""
fieldsR = []
for fields in data:
fieldsR.append([fields])
return fieldsR

def ExtractFieldValuesFromData(data, indexField):
"""Check each value of the field passed as argument to the function."""
values , floatPrecision, NoneValues = [], None, False
valueType, maxLength, maxValue, minValue = None, None, None, 999999999999
sampleValue = 'numeric value, check min and max values as sample'
for valuesD in data:
value = valuesD[indexField]
if type(value) is not types.NoneType:
valueType = type(value)
else:
NoneValues = True
if valueType is str:
minValue = None
if len(value) > maxLength:
maxLength = len(value)
sampleValue = value
else:
if value > maxValue:
maxValue = value
if value and value < minValue:
minValue = value
if valueType is float and value != 0:
precisionTemp = len(str(value - int(value)))-2
if precisionTemp > floatPrecision:
floatPrecision = precisionTemp
if valueType is float and floatPrecision == None:
floatPrecision = 1
if valueType is not float and floatPrecision != None:
valueType = type(1.234)
if valueType is str and maxLength == 0:
NoneValues = True
if minValue == 999999999999:
minValue = None
values[:] = [valueType, maxLength, maxValue, minValue, floatPrecision, NoneValues, sampleValue]
return values

def AddFieldsPerTable():
"""Appends field list to each table."""
tables = ExtractTablesFromData(_data)
for i, table in enumerate(tables):
fields = ExtractFieldNamesFromData(_data[i][1][0])
tables[i].append(fields)
return tables

def AddFieldsDetailsPerField():
"""Extend field list with details for each field."""
tables = AddFieldsPerTable()
for iTable, table in enumerate(tables):
for iField, field in enumerate(table[1]):
values = ExtractFieldValuesFromData(_data[iTable][1][1], iField)
field.extend(values)
return tables

def AddNbOfRecordsPerTable():
"""Extend 'tables' details with number of records per table."""
tables = AddFieldsDetailsPerField()
for i, table in enumerate(tables):
nbOfRecords = len(_data[i][1][1])
table.insert(1, nbOfRecords)
return tables

def WriteFileTableFormat(fileName):
tables = AddNbOfRecordsPerTable()
f = open(fileName, 'w')
f.write("""<?xml version="1.0" encoding="ISO-8859-1"?>\n""")
f.write("<Root>\n")

for table in tables:
f.write("\t<table>\n")
f.write("\t\t<name>%s</name>\n" % table[0])
f.write("\t\t<nbOfRecords>%s</nbOfRecords>\n" % table[1])
for field in table[2][:]:
f.write("\t\t<field>\n")
f.write("\t\t\t<name>%s</name>\n" % field[0])
if str(field[1])[:7] == "<type '":
field[1] = str(field[1])[7:-2]
f.write("\t\t\t<pythonType>%s</pythonType>\n" % str(field[1]))
f.write("\t\t\t<maxLength>%s</maxLength>\n" % str(field[2]))
f.write("\t\t\t<maxValue>%s</maxValue>\n" % str(field[3]))
f.write("\t\t\t<minValue>%s</minValue>\n" % str(field[4]))
f.write("\t\t\t<floatPrecision>%s</floatPrecision>\n" % str(field[5]))
f.write("\t\t\t<NoneValues>%s</NoneValues>\n" % str(field[6]))
f.write("\t\t\t<sampleValue>%s</sampleValue>\n" % str(field[7]))
f.write("\t\t\t<mysqlFieldType></mysqlFieldType>\n")
f.write("\t\t</field>\n")
f.write("\t</table>\n")

f.write("</Root>")
f.close

WriteFileTableFormat('EITablesFormat.xml')

#################################### </EIPyFormatToXML>

-> result xml

<?xml version="1.0" encoding="ISO-8859-1"?>

<Root>

<table>

<name>Table1</name>

<nbOfRecords>3</nbOfRecords>

<field>

<name>Field01</name>

<pythonType>str</pythonType>

<maxLength>15</maxLength>

<maxValue>None</maxValue>

<minValue>None</minValue>

<floatPrecision>None</floatPrecision>

<NoneValues>False</NoneValues>

<sampleValue>a second string</sampleValue>

<mysqlFieldType/>

</field>

<field>

<name>Field02</name>

<pythonType>int</pythonType>

<maxLength>None</maxLength>

<maxValue>64654564</maxValue>

<minValue>12345</minValue>

<floatPrecision>None</floatPrecision>

<NoneValues>True</NoneValues>

<sampleValue>numeric value, check min and max values as sample</sampleValue>

<mysqlFieldType/>

</field>

<field>

<name>Field03</name>

<pythonType>float</pythonType>

<maxLength>None</maxLength>

<maxValue>3406.3</maxValue>

<minValue>1.000123</minValue>

<floatPrecision>6</floatPrecision>

<NoneValues>False</NoneValues>

<sampleValue>numeric value, check min and max values as sample</sampleValue>

<mysqlFieldType/>

</field>

</table>

<table>

<name>Table2</name>

<nbOfRecords>1</nbOfRecords>

<field>

<name>Field04</name>

<pythonType>None</pythonType>

<maxLength>None</maxLength>

<maxValue>None</maxValue>

<minValue>None</minValue>

<floatPrecision>None</floatPrecision>

<NoneValues>True</NoneValues>

<sampleValue>numeric value, check min and max values as sample</sampleValue>

<mysqlFieldType/>

</field>

<field>

<name>Field05</name>

<pythonType>float</pythonType>

<maxLength>None</maxLength>

<maxValue>-0.3</maxValue>

<minValue>-0.3</minValue>

<floatPrecision>2</floatPrecision>

<NoneValues>False</NoneValues>

<sampleValue>numeric value, check min and max values as sample</sampleValue>

<mysqlFieldType/>

</field>

</table>

<table>

<name>Table3</name>

<nbOfRecords>5</nbOfRecords>

<field>

<name>Field06</name>

<pythonType>str</pythonType>

<maxLength>0</maxLength>

<maxValue>None</maxValue>

<minValue>None</minValue>

<floatPrecision>None</floatPrecision>

<NoneValues>True</NoneValues>

<sampleValue/>

<mysqlFieldType/>

</field>

<field>

<name>Field07</name>

<pythonType>int</pythonType>

<maxLength>None</maxLength>

<maxValue>6546</maxValue>

<minValue>-6444</minValue>

<floatPrecision>None</floatPrecision>

<NoneValues>True</NoneValues>

<sampleValue>numeric value, check min and max values as sample</sampleValue>

<mysqlFieldType/>

</field>

<field>

<name>Field08</name>

<pythonType>float</pythonType>

<maxLength>None</maxLength>

<maxValue>646464.0</maxValue>

<minValue>0.001</minValue>

<floatPrecision>3</floatPrecision>

<NoneValues>False</NoneValues>

<sampleValue>numeric value, check min and max values as sample</sampleValue>

<mysqlFieldType/>

</field>

</table>

</Root>
Jul 18 '05 #4
Object oriented programming is about programming against interfaces, so
exessive type checking is a strong hint to design errors.
I'm not sure i understand... well, let's put it that way : i'm sure i don't understand :)
I guess i'll have to read about OO design... this is my first program, although i think i understand what OO means,

i'm not abble to write OO yet :/


ok... a little shower always help, now i understand your remark :)
The thing is, this first module is retrieving '_data' from another app.
This other app is not written to share '_data' with other apps.

So my first module should be the only one which deals with types checking because those data will then be available to
my app from a DB (... once i've written the second module :)

Thx again and regards,
Alexandre
Jul 18 '05 #5
Alexandre wrote:

[Peter]
Object oriented programming is about programming against interfaces, so
exessive type checking is a strong hint to design errors.
[Alexandre]
I'm not sure i understand... well, let's put it that way : i'm sure i
don't understand :) I guess i'll have to read about OO design... this is
my first program, although i think i understand what OO means, i'm not
abble to write OO yet :/


Type checking is generally best avoided, e. g. if you test for

isinstance(mystream, file)

this may fail on mystream objects that have all the methods needed to
replace a file in subsequent code and you thus unnecessarily limit its
usage.
As your script is explicitly in the "type checking business", my remark was
a bit off, though.
Where is valuesD introduced?

Not sure i understand the question... i used this variable name "ValuesD"
(meaning Values from Data) not to conflict with the other variable named
"values" in the same function.


I spotted an error where there was none - nothing to understand here :-(
Module level code is habitually wrapped like so:

if __name__ == "__main__":
infile = open('cached-objects-Python-pickled-sample', 'rb')
_data = pickle.load(infile)
infile.close()
# process _data...

That way, you can import the module as well as use it as a standalone
script.


Oh ? but, i can use it as standalone script ?!
If i type myScriptName.py in my dos prompt the script is working !


Yes, but you can *only* use it as a standalone script. If you import it into
another module, it will try to read the pickled data from the hard-coded
file before you can do anything else.
Peter
Jul 18 '05 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
by: Knoppix User | last post by:
Hi folks I still have not been able to solve this one, which didn't manage to garner a response yesterday, so I'm rephrasing it here. If I define an attribute in the __init__ statement of a...
3
by: Walter | last post by:
But I'm stumped..... I've got a windows 2000 server and I am trying to set up PHPBB on it using a mysql database.. I am very inexperienced on this..... Ive installed mysql V4.0.20d and I can...
13
by: Christoph Brunner | last post by:
Hi, on the sun homepage i had submit to the bugparade a request for feature enhancement for the JAXB API. After a period of time sun called me to post my request to a newsgroup an get comments...
20
by: Chad Everett | last post by:
Hi all, I am new to the group. Trying to learn Python programming on my own. I am working through Michael Dawson's Book Python Programming for the absolute beginner. I am tring to write a...
2
by: Eric | last post by:
I have a textbox on a form that is populated from the database when the form loads. When I check textbox.Text when the user clicks my submit button, the value is always what it was when the form...
15
by: manstey | last post by:
Hi, I have a text file called a.txt: # comments I read it using this:
2
by: Carl | last post by:
I'm new to C#, and I have only limited programming experience. I've been doing the video tutorials at MS's website, and they're very helpful, but I decided to experiment with GDI+ and have gotten...
6
by: johnny | last post by:
How do I join two string variables? I want to do: download_dir + filename. download_dir=r'c:/download/' filename =r'log.txt' I want to get something like this: c:/download/log.txt
12
by: joaotsetsemoita | last post by:
Hello everyone, im completly new to vb.net and I was assigned to do a simple piece of software that just had to select from um db in a MS access data base and insert into a SQL server Database....
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.