Data Records Formats Testing Tool

P: n/a
(If these are the wrong groups please suggest the right one(s). Thanks.)

I need to come up with a way to test potentially thousands of data (files /
records / streams) to determine if they match one of about thirty defined
data formats. If a record partially matches one of the formats I need to
log why it failed.

The formats are byte-oriented. Byte 0 is the type, byte 1 is the subtype,
bytes 2-5 give the total record length, etc. There are two wrinkles.
First, some of the formats allow 1..n subrecords, like a person listing her
home phone, cell phone, fax number, ICQ #, the dog's cell phone, etc.
Second, some of the formats allow other formats to be wholly contained in
them, like an "inventory" format being made up of many separate items of
different "item" format types.
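
As a rough sketch of the fixed header described above (type, subtype, then a four-byte total length), the snippet below unpacks those six bytes and reports why a record fails. The byte order (big-endian), the unsigned length, and the idea that the total length counts the header itself are all assumptions; the post does not say. The function name `read_header` is made up for illustration.

```python
import struct

# Assumption: big-endian, unsigned 32-bit length in bytes 2-5, no padding.
HEADER = struct.Struct(">BBI")

def read_header(buf):
    """Return (type, subtype, total_length) or raise ValueError saying why."""
    if len(buf) < HEADER.size:
        raise ValueError(f"need {HEADER.size} header bytes, got {len(buf)}")
    rec_type, subtype, total_len = HEADER.unpack_from(buf, 0)
    # Assumption: the length field counts the whole record, header included.
    if total_len != len(buf):
        raise ValueError(f"header says {total_len} bytes, record has {len(buf)}")
    return rec_type, subtype, total_len

record = bytes([0x01, 0x02]) + (8).to_bytes(4, "big") + b"\xAA\xBB"
print(read_header(record))  # (1, 2, 8)
```

The `ValueError` text doubles as the "why it failed" log entry for a partial match.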

In the history of computers this *can't* be the first need for this kind of
program. ;-) New formats are approved periodically so hard-coding
everything in C# or VB.NET is a sub-optimal solution. ISTM it should be
possible to write the permissible format "rules" in (XML / ASN.1 / RegEx /
etc.), present the rules to a tried and true program, and smash data files
against the program all day long.
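
One way to picture the "rules as data" idea is a small table of field checks that a generic routine applies to each record, collecting a reason for every check that fails. This is only a sketch: the format name `phone_card`, the rule keys (`equals`, `one_of`, `record_length`), and the big-endian byte order are hypothetical, and in practice the table could be loaded from XML or a similar file instead of being written inline.

```python
# Hypothetical rule table: format names, field names, and keys are made up.
RULES = {
    "phone_card": [
        {"name": "type", "offset": 0, "size": 1, "equals": 0x10},
        {"name": "subtype", "offset": 1, "size": 1, "one_of": [1, 2, 3]},
        {"name": "total_length", "offset": 2, "size": 4, "record_length": True},
    ],
}

def check(buf, fields):
    """Return a list of human-readable reasons why buf fails these fields."""
    failures = []
    for f in fields:
        end = f["offset"] + f["size"]
        if end > len(buf):
            failures.append(f"{f['name']}: record too short ({len(buf)} bytes, need {end})")
            continue
        value = int.from_bytes(buf[f["offset"]:end], "big")  # assumption: big-endian
        if "equals" in f and value != f["equals"]:
            failures.append(f"{f['name']}: expected {f['equals']}, found {value}")
        if "one_of" in f and value not in f["one_of"]:
            failures.append(f"{f['name']}: {value} not in {f['one_of']}")
        if f.get("record_length") and value != len(buf):
            failures.append(f"{f['name']}: says {value}, record is {len(buf)} bytes")
    return failures

def classify(buf, rules=RULES):
    """Check buf against every format; return (matching names, failure report)."""
    report = {name: check(buf, fields) for name, fields in rules.items()}
    return [n for n, fails in report.items() if not fails], report

good = bytes([0x10, 2]) + (6).to_bytes(4, "big")
bad = bytes([0x10, 9]) + (6).to_bytes(4, "big")
print(classify(good)[0])  # ['phone_card']
print(classify(bad)[1])   # {'phone_card': ['subtype: 9 not in [1, 2, 3]']}
```

Adding a newly approved format then means adding an entry to the table, not changing the checking code.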

Suggestions? Windows preferred but not required.

Thanks.

-- Mark
Nov 16 '05 #1
2 Replies


P: n/a
Hi,

Convert the stream to a string and use a regular expression to
match the format. I am not sure how you will be able to tell whether the phone
number is a home number, a fax, or the dog's cell phone.
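
For what it is worth, a regular expression can also be applied to the raw bytes without converting to a string first. The sketch below uses Python's `re` module with a bytes pattern; the layout it matches (type 0x10, subtype 1 to 3, four length bytes, big-endian) is a hypothetical example, not one of the real formats.

```python
import re

# Hypothetical layout: type 0x10, subtype 1..3, then a 4-byte length field.
# re.DOTALL lets "." match every byte value, including 0x0A.
HEADER_RE = re.compile(rb"\x10[\x01-\x03](.{4})", re.DOTALL)

def header_length(buf):
    """Return the length field if the fixed header matches, else None."""
    m = HEADER_RE.match(buf)
    if m is None:
        return None
    return int.from_bytes(m.group(1), "big")  # assumption: big-endian

print(header_length(bytes([0x10, 2, 0, 0, 0, 6])))  # 6
print(header_length(bytes([0x10, 9, 0, 0, 0, 6])))  # None
```

A pattern like this handles fixed positions well, but it cannot by itself compare a length field with the number of bytes that follow, and a failed match says only that it failed, not which field was wrong.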

http://msdn.microsoft.com/library/de...xpressions.asp

Library of regular expressions.
http://www.regexlib.com/
Ken
Nov 16 '05 #2

P: n/a
Ken Tucker [MVP] wrote:
> Hi,
>
> Convert the stream to a string and use a regular expression
> to match the format.

Thanks, I'll look into this if we decide to write something. I don't know
much about regular expressions yet but I'm concerned about the calculated
offsets and regex complexity (and validation). See the phones example
below.

There are some advantages for this project to use a commercial or open
source product. A "drag & drop" interface like Visio would be ideal.

> Not sure how you will be able to tell if the
> phone number is a home number, fax, or dog's cell phone.

(My addition may be off...)
Byte 10 - Length of the phone text description; call it D = val(Byte 10)
Bytes 11 to 10+D - Phone text description
Byte 11+D - Length of the phone number; call it N = val(Byte 11+D)
Bytes 12+D to 11+D+N - Phone number
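
The calculated offsets get easier to trust if the record is walked with a cursor instead of working out each position by hand. This is a minimal sketch assuming each field is one length byte followed by that many bytes of ASCII text, starting at byte 10 as in the layout above; `read_phone` and `read_prefixed` are made-up names.

```python
def read_phone(buf, start=10):
    """Read one (description, number) pair; return both plus the next offset."""
    def read_prefixed(pos, what):
        # One length byte at pos, then that many payload bytes.
        if pos >= len(buf):
            raise ValueError(f"{what}: length byte at offset {pos} is past end of record")
        n = buf[pos]
        end = pos + 1 + n
        if end > len(buf):
            raise ValueError(f"{what}: needs {n} bytes from offset {pos + 1}, "
                             f"record is only {len(buf)} bytes")
        return buf[pos + 1:end], end

    desc, pos = read_prefixed(start, "description")
    number, pos = read_prefixed(pos, "phone number")
    # Assumption: both fields hold ASCII text.
    return desc.decode("ascii"), number.decode("ascii"), pos

record = bytes(10) + b"\x04home" + b"\x08555-1234"
print(read_phone(record))  # ('home', '555-1234', 24)
```

The returned offset is where the next subrecord would start, so 1..n phones can be read by calling the function in a loop until the offset reaches the record length.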

-- Mark

Nov 16 '05 #3
