469,946 Members | 1,787 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,946 developers. It's quick & easy.

Creating ACH-style text files from Datasets in C#

I was wondering what anyone might suggest as "best practice" patterns for
streaming out fixed formatted text files with C#? Let's say we get our data
in a dataset table and we need to iterate over thousands..potentially tens
of thousands of rows to create a properly formatted text file -- like an ACH
file, for example. In there you typically have a header, a body block of
detail rows and a trailer. (The trailers usually contain sums of the
details and counts, but for purposes of my question, we'll ignore that in
the comments below)

Concerning only the detail rows, what's the FASTEST, most effecient way to
get them out of the dataset, validate each dataset field and push into/onto
a fixed length line. For example, some checks might be that numbers must be
validated as numbers (They may be stored in the database as strings, but
must be treated as "numbers" and be left padded with zeros. If a field is
length 5 and my number is 23, my output must be 00023 for that block. Let's
say that I could possibly get text in which is invalid and the entire
Dataset row then skipped or at least handled by some other sub function.
Let's say if the number was too large, say it would evaluate to 6 digits,
that the rules would then truncate the left and leave the 5 right most.

Let's say with text it would always Left justfy and be padded with spaces to
the right if less than the specified length or truncated if too long.

Let's say currency comes in as 15.25 from the database, but must display in
a 10 character fixed lenght, left padded and no decimal, ie, 0000001525.

Date time formats must be MMDDYY.

I did just complete an example of this. It runs pretty fast (On my test, it
can output 45K rows in under a few seconds) -- but I ran into some questions
regarding string handling and getting the data formatted and just wondered
about some best practices/patterns for handling some of the situations above
and whether I might have done things better. (For example, for the dates, I
couldn't find a quick and easy one step process to format MMDDYY. I didn't
feel I could depend on the locale formatted short date, so I broke each part
out with a something like this:

int datePart= Convert.ToInt32(tempDate.Month);

(that gave me an integer, say 2 for Feb, which I then appened to a
streamwriter line like this:

textRowLine.AppendFormat("{0:D2}",datePart);

When it came to the year part, I pulled YEAR out as a string (gives you
2003, but I needed 03), so now a string I used
..Substring(sYear.Length -2,2); to pull it into an integer and then did the
..AppendFormat(... line like above.

In my case I used a streamwriter to create the filelines. I iterated the
dataset from the first record to the last. At the beginning of each record
loop, I created strings to hold the output of each dataset field (This would
be one of my effeciency questions). I also created a stringbuilder to
represent each "completed' fixed format line. For example, I pulled "check
number" from the dataset and inside a try{} block cast it to an int32. If I
don't fall into the catch, I know it was valid (not text). I used the
..substring method to pull/parse text.

Once I had appended in my stringbuilder line, I then wrote it to the
filestream object with the .WriteLine(textRowLine.ToString();

When it came to currency, I pulled the amount out as a decimal from the
dataset (it's datatype) and multipled by 100, then dropped the precision
with a cast to int and then padded into the stringbuilder with this type of
syntax:

int curAmt= Convert.ToInt32(Convert.ToDecimal(amount)*100);
textRowLine.AppendFormat("{0:D10}",curAmt);

What are some alternatives/improvements to capturing errors, manipulating
and padding strings with spaces, validating numeric and padding it with
zeros, and stripping out. Basically it needs to be solid as far as error
handling (can't crash or create corrupt file if it receives, say text in a
numeric place), but have good coding practices., lowest overhead, fastest
code.

For example, above, I did not use a strongly typed dataset, so I couldn't
rely on that to do any checking for types for me. I basically pulled each
field out as a string and validated from there. I would have thought an
easier, more straightforward way for dates to MMDDYY existed, but I couldn't
find one in the short timeframe I had to get this done. It's in retrospect,
I want to review what I did. Let me know. I'd appreciate any feedback
someone has.

Todd
Nov 15 '05 #1
2 5774
Todd,

Are you interested in making ACH files? If so, we've created a complete ACH
class (along with classes for batch and detail records) - we may be looking
to put this out for sale in the near future (it is still undergoing
testing) - but it is a complete set of objects to create NACHA files.

Just FYI - we setup classes to represent the ACH file, batch, and detail
records (abstracted for the differnet types of detail records) - batch and
detail records each having there own collections. Each class member was
represented with the correct datatype (ie: dates are DateTime and money is
decimal) then reformatted as nessacery on file output. Formatting dates is
really just myDate.ToString("YYmmDD") will get you 031211 (make sure to use
lower case m's)

Makes it really easy to create ACH files. Let me know if you might be
interested - it's still got some bugs, but the code is very clean -as for
speed I'll test it and post my results here.

Matt

"Todd_M" <t_**************@yahoo.com> wrote in message
news:vt************@corp.supernews.com...
I was wondering what anyone might suggest as "best practice" patterns for
streaming out fixed formatted text files with C#? Let's say we get our data in a dataset table and we need to iterate over thousands..potentially tens
of thousands of rows to create a properly formatted text file -- like an ACH file, for example. In there you typically have a header, a body block of
detail rows and a trailer. (The trailers usually contain sums of the
details and counts, but for purposes of my question, we'll ignore that in
the comments below)

Concerning only the detail rows, what's the FASTEST, most effecient way to
get them out of the dataset, validate each dataset field and push into/onto a fixed length line. For example, some checks might be that numbers must be validated as numbers (They may be stored in the database as strings, but
must be treated as "numbers" and be left padded with zeros. If a field is
length 5 and my number is 23, my output must be 00023 for that block. Let's say that I could possibly get text in which is invalid and the entire
Dataset row then skipped or at least handled by some other sub function.
Let's say if the number was too large, say it would evaluate to 6 digits,
that the rules would then truncate the left and leave the 5 right most.

Let's say with text it would always Left justfy and be padded with spaces to the right if less than the specified length or truncated if too long.

Let's say currency comes in as 15.25 from the database, but must display in a 10 character fixed lenght, left padded and no decimal, ie, 0000001525.

Date time formats must be MMDDYY.

I did just complete an example of this. It runs pretty fast (On my test, it can output 45K rows in under a few seconds) -- but I ran into some questions regarding string handling and getting the data formatted and just wondered
about some best practices/patterns for handling some of the situations above and whether I might have done things better. (For example, for the dates, I couldn't find a quick and easy one step process to format MMDDYY. I didn't
feel I could depend on the locale formatted short date, so I broke each part out with a something like this:

int datePart= Convert.ToInt32(tempDate.Month);

(that gave me an integer, say 2 for Feb, which I then appened to a
streamwriter line like this:

textRowLine.AppendFormat("{0:D2}",datePart);

When it came to the year part, I pulled YEAR out as a string (gives you
2003, but I needed 03), so now a string I used
.Substring(sYear.Length -2,2); to pull it into an integer and then did the
.AppendFormat(... line like above.

In my case I used a streamwriter to create the filelines. I iterated the
dataset from the first record to the last. At the beginning of each record
loop, I created strings to hold the output of each dataset field (This would be one of my effeciency questions). I also created a stringbuilder to
represent each "completed' fixed format line. For example, I pulled "check
number" from the dataset and inside a try{} block cast it to an int32. If I don't fall into the catch, I know it was valid (not text). I used the
.substring method to pull/parse text.

Once I had appended in my stringbuilder line, I then wrote it to the
filestream object with the .WriteLine(textRowLine.ToString();

When it came to currency, I pulled the amount out as a decimal from the
dataset (it's datatype) and multipled by 100, then dropped the precision
with a cast to int and then padded into the stringbuilder with this type of syntax:

int curAmt= Convert.ToInt32(Convert.ToDecimal(amount)*100);
textRowLine.AppendFormat("{0:D10}",curAmt);

What are some alternatives/improvements to capturing errors, manipulating
and padding strings with spaces, validating numeric and padding it with
zeros, and stripping out. Basically it needs to be solid as far as error
handling (can't crash or create corrupt file if it receives, say text in a
numeric place), but have good coding practices., lowest overhead, fastest
code.

For example, above, I did not use a strongly typed dataset, so I couldn't
rely on that to do any checking for types for me. I basically pulled each
field out as a string and validated from there. I would have thought an
easier, more straightforward way for dates to MMDDYY existed, but I couldn't find one in the short timeframe I had to get this done. It's in retrospect, I want to review what I did. Let me know. I'd appreciate any feedback
someone has.

Todd

Nov 15 '05 #2
No this is not actually an ACH file. It does resemble it though. Perhaps the
lower case "mm's" threw me on the format, I'll try that. Thanks for the tip.
The actual file was in what they call an ARP format, used by (among others,
I think) Wells Fargo. It's much simpler than the ACH batch headers and stuff
like that. Just has a single line header, a bunch of detail records
representing checks written and then a checksum trailer line.

Right now, the part of the application that does ACH is already completed
and uses a VB6 engine and the filescripting object to stream the file.

The speed on mine wasn't bad at all, but I was doing a lot with strings that
my gut just felt could be handled more efficiently. I was able to output
85,000 rows and it took less than a second and a half to stream them out
once the dataset returned. That's pretty fast.

Thanks again for the tips!

Todd

"Matt Retzer" <ma*******@splamzone.com> wrote in message
news:Oo**************@TK2MSFTNGP09.phx.gbl...
Todd,

Are you interested in making ACH files? If so, we've created a complete ACH class (along with classes for batch and detail records) - we may be looking to put this out for sale in the near future (it is still undergoing
testing) - but it is a complete set of objects to create NACHA files.

Just FYI - we setup classes to represent the ACH file, batch, and detail
records (abstracted for the differnet types of detail records) - batch and
detail records each having there own collections. Each class member was
represented with the correct datatype (ie: dates are DateTime and money is
decimal) then reformatted as nessacery on file output. Formatting dates is
really just myDate.ToString("YYmmDD") will get you 031211 (make sure to use lower case m's)

Makes it really easy to create ACH files. Let me know if you might be
interested - it's still got some bugs, but the code is very clean -as for
speed I'll test it and post my results here.

Matt

"Todd_M" <t_**************@yahoo.com> wrote in message
news:vt************@corp.supernews.com...
I was wondering what anyone might suggest as "best practice" patterns for streaming out fixed formatted text files with C#? Let's say we get our data
in a dataset table and we need to iterate over thousands..potentially tens of thousands of rows to create a properly formatted text file -- like an

ACH
file, for example. In there you typically have a header, a body block of
detail rows and a trailer. (The trailers usually contain sums of the
details and counts, but for purposes of my question, we'll ignore that in the comments below)

Concerning only the detail rows, what's the FASTEST, most effecient way to get them out of the dataset, validate each dataset field and push

into/onto
a fixed length line. For example, some checks might be that numbers must

be
validated as numbers (They may be stored in the database as strings, but
must be treated as "numbers" and be left padded with zeros. If a field is length 5 and my number is 23, my output must be 00023 for that block.

Let's
say that I could possibly get text in which is invalid and the entire
Dataset row then skipped or at least handled by some other sub function.
Let's say if the number was too large, say it would evaluate to 6 digits, that the rules would then truncate the left and leave the 5 right most.

Let's say with text it would always Left justfy and be padded with spaces to
the right if less than the specified length or truncated if too long.

Let's say currency comes in as 15.25 from the database, but must display in
a 10 character fixed lenght, left padded and no decimal, ie, 0000001525.

Date time formats must be MMDDYY.

I did just complete an example of this. It runs pretty fast (On my test,

it
can output 45K rows in under a few seconds) -- but I ran into some

questions
regarding string handling and getting the data formatted and just

wondered about some best practices/patterns for handling some of the situations

above
and whether I might have done things better. (For example, for the dates, I
couldn't find a quick and easy one step process to format MMDDYY. I
didn't feel I could depend on the locale formatted short date, so I broke each

part
out with a something like this:

int datePart= Convert.ToInt32(tempDate.Month);

(that gave me an integer, say 2 for Feb, which I then appened to a
streamwriter line like this:

textRowLine.AppendFormat("{0:D2}",datePart);

When it came to the year part, I pulled YEAR out as a string (gives you
2003, but I needed 03), so now a string I used
.Substring(sYear.Length -2,2); to pull it into an integer and then did the .AppendFormat(... line like above.

In my case I used a streamwriter to create the filelines. I iterated the
dataset from the first record to the last. At the beginning of each record loop, I created strings to hold the output of each dataset field (This

would
be one of my effeciency questions). I also created a stringbuilder to
represent each "completed' fixed format line. For example, I pulled "check number" from the dataset and inside a try{} block cast it to an int32. If I
don't fall into the catch, I know it was valid (not text). I used the
.substring method to pull/parse text.

Once I had appended in my stringbuilder line, I then wrote it to the
filestream object with the .WriteLine(textRowLine.ToString();

When it came to currency, I pulled the amount out as a decimal from the
dataset (it's datatype) and multipled by 100, then dropped the precision
with a cast to int and then padded into the stringbuilder with this type

of
syntax:

int curAmt= Convert.ToInt32(Convert.ToDecimal(amount)*100);
textRowLine.AppendFormat("{0:D10}",curAmt);

What are some alternatives/improvements to capturing errors,

manipulating and padding strings with spaces, validating numeric and padding it with
zeros, and stripping out. Basically it needs to be solid as far as error
handling (can't crash or create corrupt file if it receives, say text in a numeric place), but have good coding practices., lowest overhead, fastest code.

For example, above, I did not use a strongly typed dataset, so I couldn't rely on that to do any checking for types for me. I basically pulled each field out as a string and validated from there. I would have thought an
easier, more straightforward way for dates to MMDDYY existed, but I

couldn't
find one in the short timeframe I had to get this done. It's in

retrospect,
I want to review what I did. Let me know. I'd appreciate any feedback
someone has.

Todd


Nov 15 '05 #3

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

2 posts views Thread by Pawan | last post: by
2 posts views Thread by LIN | last post: by
7 posts views Thread by Gladiator | last post: by
9 posts views Thread by =?Utf-8?B?YmJn?= | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.