Anyone know of a MICR parser algorithm written in Python?

mkppk

MICR = The line of digits printed using magnetic ink at the bottom of
a check.

Does anyone know of a Python function that has been written to parse a
line of MICR data?
Or, some financial package that may contain such a thing?
Or, in general, where I should be looking when looking for a piece of
Python code that may have already been written by someone?

I'm working on a project that involves a check scanner that produces
the raw MICR line as text.

Now, that raw MICR needs to be parsed for the various pieces of info.
The problem with MICR is that there is no standard layout. There are
some general rules for item placement, but beyond that it is up to the
individual banks to define how they choose to position the
information.

I did find an old C program written by someone at IBM... But I've read
it and it is not code that would nicely convert to Python (maybe it's
all the Python I'm used to, but it seems very poorly written).

Here is the link to that C code: ftp://ftp.software.ibm.com/software/...0/4610micr.zip

I've even tried using boost to generate a Python module, but that
didn't go well, and in the end it is not going to be a solution for me
anyway. I really need access to the Python source.

Any help at all would be appreciated,

-mkp

Mar 24 '07 #1

Paul McGuire

Is there a spec somewhere for this data? Googling for "MICR data
format specification" and similar gives links to specifications for
the MICR character *fonts*, but not for the data content.

And you are right, reverse-engineering this code is more than a 10-
minute exercise. (However, the zip file *does* include a nice set of
test cases, which might be better than the C code as a starting point
for new code.)

-- Paul

Mar 24 '07 #2

mkppk

Well, the problem is that the "specification" is that "there is no
specification"; that's just the way the MICR data line has evolved in
the banking industry, unfortunately for us developers. That being
said, there are obviously enough banking companies out there with enough
example data to have intelligent parsers that handle all the
variations. And the C program appears to have all that built into it.

It's just that I would rather not reinvent the wheel (or read old C
code)..

So, the search continues..

Mar 24 '07 #3

Paul McGuire

On Mar 24, 6:52 pm, "mkppk" <barnaclej...@gmail.com> wrote:

> It's just that I would rather not reinvent the wheel (or read old C
> code)..

Wouldn't we all!

Here is the basic structure of a pyparsing solution. The parsing part
isn't so bad - the real problem is the awful ParseONUS routine in C.
Plus things are awkward since the C program parses right-to-left and
then reverses all of the found fields, and the parser I wrote works
left-to-right. Still, this grammar does most of the job. I've left
out my port of ParseONUS since it is *so* ugly, and not really part of
the pyparsing example.

-- Paul

import re

from pyparsing import (Optional, ParseException, Suppress, Word, nums,
                       oneOf, stringEnd)

# define values for optional fields
NoAmountGiven = ""
NoEPCGiven = ""
NoAuxOnusGiven = ""

# define delimiters
DOLLAR = Suppress("$")
T_ = Suppress("T")
A_ = Suppress("A")

# field definitions
amt = DOLLAR + Word(nums, exact=10) + DOLLAR
onus = Word("0123456789A- ")
transit = T_ + Word("0123456789-") + T_
epc = oneOf(list(nums))
aux_onus = A_ + Word("0123456789- ") + A_

# validation parse action
def validateTransitNumber(s, loc, t):
    transit = t[0]
    flds = transit.split("-")
    if len(flds) > 2:
        raise ParseException(s, loc, "too many dashes in transit number")
    if len(flds) == 2:
        if len(flds[0]) not in (3, 4):
            raise ParseException(s, loc,
                                 "invalid dash position in transit number")
    else:
        # compute checksum (indexing assumes a 9-digit transit number)
        ti = [int(c) for c in transit]
        ti.reverse()  # original algorithm worked with reversed data
        cksum = (3 * (ti[8] + ti[5] + ti[2]) + 7 * (ti[7] + ti[4] + ti[1])
                 + ti[6] + ti[3] + ti[0])
        if cksum % 10 != 0:
            raise ParseException(s, loc, "transit number failed checksum")
    return transit

# define overall MICR format, with results names
micrdata = (
    Optional(aux_onus, default=NoAuxOnusGiven).setResultsName("aux_onus")
    + Optional(epc, default=NoEPCGiven).setResultsName("epc")
    + transit.setParseAction(validateTransitNumber).setResultsName("transit")
    + onus.setResultsName("onus")
    + Optional(amt, default=NoAmountGiven).setResultsName("amt")
    + stringEnd
)

def parseONUS(tokens):
    tokens["csn"] = ""
    tokens["tpc"] = ""
    tokens["account"] = ""
    tokens["amt"] = tokens["amt"][0]
    onus = tokens.onus
    # remainder omitted out of respect for newsreaders...
    # suffice to say that unspeakable acts are performed on
    # onus and aux_onus fields to extract account and
    # check numbers

micrdata.setParseAction(parseONUS)

# read the test cases, skipping the first line of the file
with open("checks.csv") as f:
    testdata = f.readlines()[1:]
tests = [(flds[1], flds)
         for flds in (line.rstrip("\n").split(",") for line in testdata)]

def verifyResults(res, csv):
    def match(x, y):
        print("_" if x == y else "X", x, "=", y)

    # one name per CSV column; the repeated AS/TS names are throwaway columns
    (Ex, MICR, Bank, Stat, Amt, AS, TPC, TS, CSN, CS, ACCT, AS, EPC, ES,
     ONUS, OS, AUX, AS, Tran, TS) = csv
    match(res.amt, Amt)
    match(res.account, ACCT)
    match(res.csn, CSN)
    match(res.onus, ONUS)
    match(res.tpc, TPC)
    match(res.epc, EPC)
    match(res.transit, Tran)

for t, data in tests:
    print(t)
    try:
        res = micrdata.parseString(t)
        print(res.dump())
        if data[0] != "No":
            print("Passed expression that should have failed")
        verifyResults(res, data)
    except ParseException as pe:
        print("<parse failed> %s" % pe.msg)
        if data[0] != "Yes":
            print("Failed expression that should have passed")
    print()
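
For readers who only need the transit-number check, the checksum used in validateTransitNumber can be tried without pyparsing. The sketch below is not part of Paul's post: the name routing_checksum_ok is made up for illustration, and it assumes a 9-digit transit number with no dash. It applies the weights 3, 7, 1 from left to right, which gives the same sum as the reversed-index formula in the post.

```python
def routing_checksum_ok(transit):
    """Return True if a 9-digit transit number passes the 3-7-1 weighted checksum.

    Hypothetical helper for illustration; dashed transit numbers are
    rejected here rather than handled.
    """
    # Reject anything that is not exactly nine ASCII digits.
    if len(transit) != 9 or not set(transit) <= set("0123456789"):
        return False
    weights = (3, 7, 1) * 3  # 3,7,1 repeated across the nine digits
    total = sum(w * int(d) for w, d in zip(weights, transit))
    return total % 10 == 0


if __name__ == "__main__":
    # Synthetic numbers chosen only to exercise the arithmetic.
    print(routing_checksum_ok("123456780"))
    print(routing_checksum_ok("123456789"))
```

Returning a boolean instead of raising keeps the check usable outside a parser; inside a pyparsing parse action the same test could decide whether to raise ParseException.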

Mar 25 '07 #4

mkppk

Great, thanks for taking a look, Paul. I had never tried to use
pyparsing before. Yeah, the ONUS field is crazy; I don't know why there
is no standard for it.
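
Since the on-us layout is left to each bank, one cautious approach is to split only the delimited fields and keep the on-us text raw for bank-specific handling later. The following is a minimal standard-library sketch, assuming the same delimiters as the pyparsing grammar earlier in this thread (T around the transit number, A around the auxiliary on-us field, $ around a 10-digit amount). MICR_RE and split_micr are hypothetical names; the pattern does no checksum validation and, unlike pyparsing, does not skip whitespace between fields.

```python
import re

# Assumed layout, taken from the grammar in this thread and not from any standard:
#   [A<aux on-us>A] [<epc digit>] T<transit>T <on-us> [$<10-digit amount>$]
MICR_RE = re.compile(
    r"(?:A(?P<aux_onus>[0-9\- ]+)A)?"   # optional auxiliary on-us field
    r"(?P<epc>[0-9])?"                  # optional single-digit EPC
    r"T(?P<transit>[0-9\-]+)T"          # transit number between T delimiters
    r"(?P<onus>[0-9A\- ]+?)"            # raw on-us text, left unparsed
    r"(?:\$(?P<amt>[0-9]{10})\$)?"      # optional amount between $ delimiters
)


def split_micr(line):
    """Split a raw MICR line into its top-level fields.

    Returns a dict with empty strings for absent optional fields,
    or None if the line does not fit the assumed layout.
    """
    m = MICR_RE.fullmatch(line)
    if m is None:
        return None
    return {name: (value or "") for name, value in m.groupdict().items()}


if __name__ == "__main__":
    # Made-up sample line, not taken from the IBM test cases.
    print(split_micr("A000123AT123456780T9876543A$0000012345$"))
```

Keeping the on-us field as one opaque string means the bank-specific account and check-number rules can be added per institution without touching the top-level split.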

Mar 25 '07 #5
