Parsing String, Dictionary Lookups, Writing to Database Table

Rich Shepard

I need to learn how to process a byte stream from a form reader where each
pair of bytes has meaning according to lookup dictionaries, then use the
values to build an array of rows inserted into a sqlite3 database table.

Here's the context: The OMR card reader sends a stream of 69 bytes over
the serial line; the last byte is a carriage return ('\r') indicating the
end of record. Three pairs (in specific positions at the beginning of the
stream) represent blanks (no value); two other pairs represent character
strings; the values are determined from two dictionaries. The remaining 28
pairs represent values from a third dictionary.

What I'm doing: I think that I have the first part correct; that is, the
three blanks and two string values. The first three dictionaries are:

DATA_MAP_BLANK = {
chr(32)+chr(32): 0 # SP + SP
}
DATA_MAP_2 = {
chr(32)+chr(36): 'nat', # SP + $
chr(96)+chr(32): 'eco', # + SP
chr(36)+chr(32): 'soc' # $ + SP
}
DATA_MAP_5 = {
chr(32)+chr(36): 'pro', # SP + $
chr(96)+chr(32): 'neu', # + SP
chr(36)+chr(32): 'con' # $ + SP
}

I read the data into a string and split that into byte tokens, then start
building the row to be inserted into a database table:

line = ser.readline()
split_line = line.split()

# then pre-pend the record number to the front of the row
row.join(', ', vote_id)

# extract category choice (row 2, bytes 2 and 3); look up value in dictionary
cat = split_line(2:4)
row.join(', ', DATA_MAP_2(cat) # I'm not sure this is a correct lookup

# extract position (row 5, bytes 8 and 9P; look up value in dictionary
pos = split_line(8:10)
row.join(', ', DATA_MAP_5(pos))

Is the above the most 'correct' way of extracting specific byte pairs,
using them as dictionary keys to get values, then build a string of
comma-and-quote values for insertion in the database table?

Then, I've no idea how to get the rest of the data parsed for use as keys
in the last data mapping dictionary. I do not see a skip value for slicing,
other than in Numeric Python, and I'm not yet building an array of data.

Here's the last dictionary:

DATA_MAP_7 = {
chr(32)+chr(16): 1.000, # SP + DLE
chr(32)+chr(8): 2.000, # SP + BS
chr(32)+chr(4): 3.000, # SP + EOT
chr(32)+chr(2): 4.000, # SP + STX
chr(32)+chr(1): 5.000, # SP + SOH
chr(64)+chr(32): 6.000, # @ + SP
chr(16)+chr(32): 7.000, # DLE + SP
chr(8)+chr(32): 8.000, # BS + SP
chr(4)+chr(32): 9.000, # EOT + SP
chr(34)+chr(8): 0.500, # " + BS
chr(34)+chr(4): 0.333, # " + EOT
chr(34)+chr(2): 0.025, # " + STX
chr(34)+chr(1): 0.200, # " + SOH
chr(66)+chr(32): 0.167, # B + SP
chr(18)+chr(32): 0.143, # DC2 + SP
chr(10)+chr(32): 0.125, # LF + SP
chr(6)+chr(32): 0.111 # ACK + SP
}

I know how I'd do all this in C, but since I'm learning python I have not
found how best to accomplish this despite the books and online references
I've read.

TIA,

Rich

--
Richard B. Shepard, Ph.D. | The Environmental Permitting
Applied Ecosystem Services, Inc.(TM) | Accelerator
<http://www.appl-ecosys.com Voice: 503-667-4517 Fax: 503-667-8863

Sep 11 '06 #1

Subscribe Post Reply

2497

George Sakkis

Rich Shepard wrote:

I know how I'd do all this in C, but since I'm learning python I have not
found how best to accomplish this despite the books and online references
I've read.

Can you post one or more examples of expected input-output pairs ? From
your description it's not really clear what's going on. Also, please
post working code; the snippets you posted were out of context (what is
row?) and not always correct syntactically (split_line(2:4)).

George

Sep 11 '06 #2

mensanator

Rich Shepard wrote:

I need to learn how to process a byte stream from a form reader where each
pair of bytes has meaning according to lookup dictionaries, then use the
values to build an array of rows inserted into a sqlite3 database table.

Here's the context: The OMR card reader sends a stream of 69 bytes over
the serial line; the last byte is a carriage return ('\r') indicating the
end of record. Three pairs (in specific positions at the beginning of the
stream) represent blanks (no value); two other pairs represent character
strings; the values are determined from two dictionaries. The remaining 28
pairs represent values from a third dictionary.

What I'm doing: I think that I have the first part correct; that is, the
three blanks and two string values. The first three dictionaries are:

DATA_MAP_BLANK = {
chr(32)+chr(32): 0 # SP + SP
}
DATA_MAP_2 = {
chr(32)+chr(36): 'nat', # SP + $
chr(96)+chr(32): 'eco', # + SP
chr(36)+chr(32): 'soc' # $ + SP
}
DATA_MAP_5 = {
chr(32)+chr(36): 'pro', # SP + $
chr(96)+chr(32): 'neu', # + SP
chr(36)+chr(32): 'con' # $ + SP
}

I read the data into a string and split that into byte tokens, then start
building the row to be inserted into a database table:

line = ser.readline()
split_line = line.split()

# then pre-pend the record number to the front of the row
row.join(', ', vote_id)

# extract category choice (row 2, bytes 2 and 3); look up value in dictionary
cat = split_line(2:4)
row.join(', ', DATA_MAP_2(cat) # I'm not sure this is a correct lookup

# extract position (row 5, bytes 8 and 9P; look up value in dictionary
pos = split_line(8:10)
row.join(', ', DATA_MAP_5(pos))

Is the above the most 'correct' way of extracting specific byte pairs,
using them as dictionary keys to get values, then build a string of
comma-and-quote values for insertion in the database table?

No, if you're going to insert into sqlite3, you don't want a
csv string, you want a list of values (see example code
below).

>
Then, I've no idea how to get the rest of the data parsed for use as keys
in the last data mapping dictionary. I do not see a skip value for slicing,
other than in Numeric Python, and I'm not yet building an array of data.

Here's the last dictionary:

DATA_MAP_7 = {
chr(32)+chr(16): 1.000, # SP + DLE
chr(32)+chr(8): 2.000, # SP + BS
chr(32)+chr(4): 3.000, # SP + EOT
chr(32)+chr(2): 4.000, # SP + STX
chr(32)+chr(1): 5.000, # SP + SOH
chr(64)+chr(32): 6.000, # @ + SP
chr(16)+chr(32): 7.000, # DLE + SP
chr(8)+chr(32): 8.000, # BS + SP
chr(4)+chr(32): 9.000, # EOT + SP
chr(34)+chr(8): 0.500, # " + BS
chr(34)+chr(4): 0.333, # " + EOT
chr(34)+chr(2): 0.025, # " + STX
chr(34)+chr(1): 0.200, # " + SOH
chr(66)+chr(32): 0.167, # B + SP
chr(18)+chr(32): 0.143, # DC2 + SP
chr(10)+chr(32): 0.125, # LF + SP
chr(6)+chr(32): 0.111 # ACK + SP
}

I know how I'd do all this in C, but since I'm learning python I have not
found how best to accomplish this despite the books and online references
I've read.

As I'm now an expert in sqlite3, here's a sample program that
may do what you need. Note, I had to create some test data.
Make sure it's correct with regards to how your input stream
works.

Also, if vote_id is autoincrement, that's one less thing your
program has to do.

import sqlite3
import random

con = sqlite3.connect(":memory:")
cur = con.cursor()

cur.execute("""
CREATE TABLE test (vote_id integer primary key autoincrement,
category text,
position text,
f01 real,
f02 real,
f03 real,
f04 real,
f05 real,
f06 real,
f07 real,
f08 real,
f09 real,
f10 real,
f11 real,
f12 real,
f13 real,
f14 real,
f15 real,
f16 real,
f17 real,
f18 real,
f19 real,
f20 real,
f21 real,
f22 real,
f23 real,
f24 real,
f25 real,
f26 real,
f27 real,
f28 real,
f29 real)
""")

DATA_MAP_BLANK = {chr(32)+chr(32):0}

DATA_MAP_2 = {chr(32)+chr(36): 'nat', \
chr(96)+chr(32): 'eco', \
chr(36)+chr(32): 'soc'}

DATA_MAP_5 = {chr(32)+chr(36): 'pro', \
chr(96)+chr(32): 'neu', \
chr(36)+chr(32): 'con'}

DATA_MAP_7 = {chr(32)+chr(16): 1.000, \
chr(32)+chr(8): 2.000, \
chr(32)+chr(4): 3.000, \
chr(32)+chr(2): 4.000, \
chr(32)+chr(1): 5.000, \
chr(64)+chr(32): 6.000, \
chr(16)+chr(32): 7.000, \
chr(8)+chr(32): 8.000, \
chr(4)+chr(32): 9.000, \
chr(34)+chr(8): 0.500, \
chr(34)+chr(4): 0.333, \
chr(34)+chr(2): 0.025, \
chr(34)+chr(1): 0.200, \
chr(66)+chr(32): 0.167, \
chr(18)+chr(32): 0.143, \
chr(10)+chr(32): 0.125, \
chr(6)+chr(32): 0.111}

# create test data
DM2_key = DATA_MAP_2.keys()
DM5_key = DATA_MAP_5.keys()
DM7_key = DATA_MAP_7.keys()
DM7_key.extend(DM7_key)

for i in xrange(10):
random.shuffle(DM2_key)
random.shuffle(DM5_key)
random.shuffle(DM7_key)

test = ' ' + DM2_key[0] + ' ' + DM5_key[0]
for i in xrange(29):
test = test + DM7_key[i]

values = [DATA_MAP_2[test[2:4]],DATA_MAP_5[test[8:10]]]
for i in xrange(10,68,2):
values.append(DATA_MAP_7[test[i:i+2]])

cur.execute("""
INSERT INTO test ( category,
position,
f01,
f02,
f03,
f04,
f05,
f06,
f07,
f08,
f09,
f10,
f11,
f12,
f13,
f14,
f15,
f16,
f17,
f18,
f19,
f20,
f21,
f22,
f23,
f24,
f25,
f26,
f27,
f28,
f29)
VALUES
(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,? ,?,?,?,?,?,?)"""
,values)
# read back what was inserted to verify
cur.execute("""SELECT * FROM test""")
d = cur.fetchall()

for i in d:
print
print
print "Record: %d Category: %s Position: %s" % (i[0],i[1],i[2])
print '-'*40,
count = 0
for j in i[3:31]:
if (count % 6)==0: print '\n',
print '%0.3f ' % (j),
count += 1

## Sample run
##
## Record: 1 Category: soc Position: con
## ----------------------------------------
## 0.111 0.125 0.200 3.000 3.000 6.000
## 9.000 5.000 8.000 7.000 0.333 0.500
## 7.000 0.025 0.167 5.000 8.000 4.000
## 2.000 1.000 0.200 0.143 0.025 9.000
## 1.000 0.143 0.111 6.000
##
## Record: 2 Category: soc Position: neu
## ----------------------------------------
## 0.500 0.333 5.000 1.000 0.143 2.000
## 0.500 6.000 0.200 1.000 2.000 0.333
## 8.000 0.125 0.143 0.111 0.200 9.000
## 4.000 7.000 3.000 0.167 9.000 8.000
## 7.000 0.167 0.111 4.000
##
## Record: 3 Category: soc Position: pro
## ----------------------------------------
## 0.025 0.333 0.167 7.000 0.333 1.000
## 0.200 3.000 0.111 1.000 6.000 5.000
## 0.111 9.000 0.143 0.025 7.000 8.000
## 0.143 9.000 0.500 8.000 0.125 2.000
## 4.000 3.000 5.000 0.200
##
## Record: 4 Category: eco Position: neu
## ----------------------------------------
## 2.000 8.000 3.000 4.000 0.125 0.167
## 0.125 6.000 0.143 0.025 0.025 0.500
## 7.000 2.000 0.111 3.000 4.000 9.000
## 0.167 0.500 0.200 8.000 0.333 1.000
## 9.000 0.111 7.000 5.000
##
## Record: 5 Category: nat Position: con
## ----------------------------------------
## 9.000 0.333 1.000 7.000 9.000 0.111
## 0.200 0.333 5.000 0.500 0.500 5.000
## 2.000 8.000 2.000 0.125 4.000 4.000
## 0.025 0.200 0.125 0.111 3.000 6.000
## 7.000 8.000 0.167 0.167
##
## Record: 6 Category: eco Position: pro
## ----------------------------------------
## 2.000 0.200 0.167 5.000 3.000 4.000
## 0.333 8.000 0.200 0.025 5.000 0.143
## 9.000 0.167 0.125 0.111 7.000 0.025
## 0.111 6.000 0.333 1.000 0.500 0.500
## 0.125 0.143 1.000 3.000
##
## Record: 7 Category: soc Position: neu
## ----------------------------------------
## 0.333 1.000 0.500 0.167 0.111 5.000
## 9.000 0.143 6.000 0.125 4.000 6.000
## 3.000 0.143 0.125 7.000 0.200 4.000
## 2.000 0.025 0.167 7.000 8.000 3.000
## 9.000 0.025 5.000 0.111
##
## Record: 8 Category: eco Position: pro
## ----------------------------------------
## 6.000 2.000 0.125 3.000 4.000 0.025
## 0.167 5.000 0.025 0.143 0.111 0.200
## 2.000 1.000 4.000 0.167 7.000 0.111
## 5.000 0.500 6.000 8.000 1.000 0.333
## 0.125 9.000 0.200 9.000
##
## Record: 9 Category: soc Position: pro
## ----------------------------------------
## 3.000 0.500 0.143 7.000 5.000 6.000
## 4.000 0.333 0.025 1.000 0.200 3.000
## 9.000 8.000 6.000 0.500 0.111 1.000
## 0.333 0.167 0.025 5.000 8.000 9.000
## 0.125 0.111 0.200 2.000
##
## Record: 10 Category: soc Position: con
## ----------------------------------------
## 4.000 2.000 0.025 7.000 0.500 0.333
## 5.000 6.000 0.125 0.143 0.500 0.200
## 0.167 0.167 4.000 9.000 9.000 6.000
## 0.111 0.200 0.143 0.333 1.000 2.000
## 0.125 3.000 8.000 3.000

>
TIA,

Rich

--
Richard B. Shepard, Ph.D. | The Environmental Permitting
Applied Ecosystem Services, Inc.(TM) | Accelerator
<http://www.appl-ecosys.com Voice: 503-667-4517 Fax: 503-667-8863

Sep 12 '06 #3

Rich Shepard

On Mon, 11 Sep 2006, me********@aol.com wrote:

No, if you're going to insert into sqlite3, you don't want a csv string,
you want a list of values (see example code below).

Thank you very much. It makes solid sense and I can see the differences
and where I was not on track. I greatly appreciate the time and effort you
put into helping me.

Rich

--
Richard B. Shepard, Ph.D. | The Environmental Permitting
Applied Ecosystem Services, Inc.(TM) | Accelerator
<http://www.appl-ecosys.com Voice: 503-667-4517 Fax: 503-667-8863

Sep 12 '06 #4

Similar topics

dictionary lookup table?

by: John Mudd | last post by:

I must be missing something here. It's clearly faster to lookup an item directly in a dictionary than to scan through a list. So when I have a large lookup table I always load it in the form of a...

Python

Generic database dictionary access

by: Edward Diener | last post by:

Version 2.0 of the Python database API was written over 5 years ago, in 1999. While it has been used successfully by many implementations, there is no generic access into the data dictionary of...

Python

125

Pre-PEP: Dictionary accumulator methods

by: Raymond Hettinger | last post by:

I would like to get everyone's thoughts on two new dictionary methods: def count(self, value, qty=1): try: self += qty except KeyError: self = qty def appendlist(self, key, *values): try:

Python

Getting a function name from string

by: David Rasmussen | last post by:

If I have a string that contains the name of a function, can I call it? As in: def someFunction(): print "Hello" s = "someFunction" s() # I know this is wrong, but you get the idea... ...

Python

Alternative to lookups in tables - questions

by: Russell | last post by:

Hi, I've been reading a lot about not using lookups in tables lately. Thus far, what I've been able to understand from my reading is this: - Do not use lookup fields in tables because they...

Microsoft Access / VBA

how are dictionary literals handled by the interpreter?

by: akameswaran | last post by:

I wrote up a quick little set of tests, I was acutally comparing ways of doing "case" behavior just to get some performance information. Now two of my test cases had almost identical results which...

Python

Convert string to command..

by: Abandoned | last post by:

I want to convert a string to command.. For example i have a string: a="" I want to do this list.. How can i do ?

Python

issues simply parsing a whitespace-delimited textfile in pythonscript

by: Damon Getsman | last post by:

Okay so I'm writing a script in python right now as a dirty fix for a problem we're having at work.. Unfortunately this is the first really non-trivial script that I've had to work with in python...

Python

Relationships, Lookups and Access 2007 tables

by: netnewbie78 | last post by:

Hello All, I don't have a problem (but maybe I will after I explain). I have a question with regards to something I saw in Access 2007. But first, a little backstory: I'm writing a very small...

Microsoft Access / VBA

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Batch import of multiple excel files into the database

by: ryjfgjl | last post by:

If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...

Data Management

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

Windows Server