
Read Text File with Binary Header - C#

dm3281
Guest
 
Posts: n/a
#1: Jun 27 '08
Hello, I have a text report from a mainframe that I need to parse.

The report has a header of about 2580 bytes that contains binary information
(garbage for the most part), although a couple of areas in it hold ASCII text
that I need to extract. After those 2580 bytes, I can read the report like a
standard text file. It should have CR/LF at the end of each line.

What is the best way for me to read this report using C#? It looks like I
need to position myself in the file with something like Seek() and then read
the rest line by line with something like ReadLine().

I have a sample file here. The extension is .BIN to cause your browser to
prompt for the file download.

http://members.verizon.net/dm3281/misc/TEST.BIN

Any assistance or sample code would be appreciated.


dm3281
Guest
 
Posts: n/a
#2: Jun 27 '08

re: Read Text File with Binary Header - C#


This is what I have so far. It more or less works for ReportID and CustID,
but when I then call ReadLine on a StreamReader, it seems to re-read the
entire file and prints the garbage at the beginning.

using System;
using System.IO;

namespace sample
{
    public class test
    {
        static void Main()
        {
            FileStream fs = new FileStream(@"C:\TEMP\TEST.BIN", FileMode.Open);

            // get report name
            Console.Write("ReportID: ");
            fs.Seek(877, SeekOrigin.Begin);
            for (int i = 0; i < 43 && i < fs.Length; i++)
            {
                Console.Write((char) fs.ReadByte());
            }

            // get customer ID
            Console.WriteLine();
            Console.Write("CustID: ");
            fs.Seek(1178, SeekOrigin.Begin);
            for (int i = 0; i < 3 && i < fs.Length; i++)
            {
                Console.Write((char) fs.ReadByte());
            }

            StreamReader sr = new StreamReader(fs);

            // jump to start of report
            Console.WriteLine();
            sr.BaseStream.Seek(1171, SeekOrigin.Begin);
            //s.Seek(1171, SeekOrigin.Begin);

            string str = sr.ReadLine();
            while (str != null)
            {
                Console.WriteLine(str);
                str = sr.ReadLine();
            }
            sr.Close();
            fs.Close();
        }
    }
}

Mach58
Guest
 
Posts: n/a
#3: Jun 27 '08

re: Read Text File with Binary Header - C#


On Tue, 17 Jun 2008 23:37:56 -0400, "dm3281" <dm3281@nospam.net>
wrote:
Quote:
sr.BaseStream.Seek(1171,SeekOrigin.Begin);
Shouldn't this be 2581? You might also want a call to
sr.DiscardBufferedData() here. Seeking back to 1171 lands inside the
binary header, which is why you get garbage.
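
For what it's worth, here is a minimal sketch of that suggestion. It assumes the report text really does start at the offset you pass in (the 2581 above comes from this thread and has not been checked against the file). `ReportReader` and `ReadLinesFrom` are made-up names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class ReportReader
{
    // Repositions an existing StreamReader at `offset` in its underlying
    // stream and returns every text line from there to the end.
    // `offset` is an assumption supplied by the caller.
    public static List<string> ReadLinesFrom(StreamReader reader, long offset)
    {
        // Move the underlying stream, then throw away anything the reader
        // has already buffered so it does not hand back stale data.
        reader.BaseStream.Seek(offset, SeekOrigin.Begin);
        reader.DiscardBufferedData();

        List<string> lines = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lines.Add(line);
        }
        return lines;
    }
}
```

With the file from the original post this would be called roughly as `ReportReader.ReadLinesFrom(new StreamReader(fs, Encoding.ASCII), 2581)`. If the reader has not read anything yet, the DiscardBufferedData() call is harmless; it only matters once the reader has filled its buffer.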
Morten Wennevik [C# MVP]
Guest
 
Posts: n/a
#4: Jun 27 '08

re: Read Text File with Binary Header - C#


Hi,

As Mach58 pointed out, your report position is wrong, and since StreamReader
reads the data as text, you probably have some binary data making the
StreamReader return a null line prematurely. If you change the position you
should get the text. Alternatively, you could treat everything as a byte
array and extract the necessary text using Encoding.ASCII.

Below is another way to do the same as your method. It uses a StringBuilder
to assemble the output, mainly because I was testing in a Windows application
and wanted everything in a single string object before displaying it. It
copies all the binary data to a byte array and reads from the byte array
instead of a stream. Reading from a byte array isn't necessarily better or
worse than reading from a stream, but with a stream I would probably use
fs.Read and store the data in byte arrays instead of using a StreamReader.


// Namespaces this snippet needs
using System;
using System.IO;
using System.Text;

StringBuilder sb = new StringBuilder();

// Offsets into the file, in bytes
int reportIdPosition = 877;
int custIdPosition = 1178;
int reportPosition = 2581;

// Load the whole file, then carve the pieces out of the array
byte[] data = File.ReadAllBytes(@"C:\TEST.BIN");
byte[] reportId = new byte[43];
byte[] custId = new byte[3];
byte[] report = new byte[data.Length - reportPosition];

// get report name
Array.Copy(data, reportIdPosition, reportId, 0, reportId.Length);
sb.AppendLine("ReportID: " + Encoding.ASCII.GetString(reportId));

// get customer ID
Array.Copy(data, custIdPosition, custId, 0, custId.Length);
sb.AppendLine("CustID: " + Encoding.ASCII.GetString(custId));

// get report (everything after the header)
Array.Copy(data, reportPosition, report, 0, report.Length);
sb.AppendLine(Encoding.ASCII.GetString(report));

Console.WriteLine(sb.ToString());
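
For comparison, the stream-based variant mentioned above (fs.Read into a byte array) might look something like the sketch below. `HeaderFields` and `ReadAsciiField` are hypothetical names, and the offsets and lengths are whatever the caller believes the header layout to be.

```csharp
using System;
using System.IO;
using System.Text;

public static class HeaderFields
{
    // Reads exactly `length` bytes starting at `offset` and decodes them
    // as ASCII. Throws if the stream ends before the field is complete.
    public static string ReadAsciiField(Stream stream, long offset, int length)
    {
        byte[] buffer = new byte[length];
        stream.Seek(offset, SeekOrigin.Begin);

        int total = 0;
        while (total < length)
        {
            // Stream.Read may return fewer bytes than requested, so keep
            // reading until the buffer is full or the stream runs out.
            int read = stream.Read(buffer, total, length - total);
            if (read == 0)
            {
                throw new EndOfStreamException(
                    "Stream ended before the field was fully read.");
            }
            total += read;
        }
        return Encoding.ASCII.GetString(buffer);
    }
}
```

With an open FileStream `fs`, the two header fields from this thread would then be read as `HeaderFields.ReadAsciiField(fs, 877, 43)` and `HeaderFields.ReadAsciiField(fs, 1178, 3)`. This avoids loading the whole file when only the header fields are needed.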


--
Happy Coding!
Morten Wennevik [C# MVP]

DavidM
Guest
 
Posts: n/a
#5: Jun 27 '08

re: Read Text File with Binary Header - C#


Thanks, everyone, for the replies.

Morten, regarding your way or the approach I was taking...

How difficult would it be to then parse the report for various columns and
totals? Basically, I will need to scan the report looking for the BLOCKS USED
section and then pull out the amounts for the various block numbers.

Morten Wennevik [C# MVP]
Guest
 
Posts: n/a
#6: Jun 27 '08

re: Read Text File with Binary Header - C#


Hi David,

If you are looking for the fewest lines of code, it could be done like this:

string reportString = Encoding.ASCII.GetString(report);
string[] reportLines = reportString.Split(
    new string[] { Environment.NewLine }, StringSplitOptions.None);

// Must match the heading line in the report exactly (after trimming)
string searchPhrase = "* * * * * * * * * * B L O C K S U S E D * * * * * * * * * *";
int startIndex = Array.FindIndex<string>(reportLines, 0,
    delegate(string s) { return s.Trim() == searchPhrase; });
// The section ends at the first blank line after the heading
int endIndex = Array.FindIndex<string>(reportLines, startIndex,
    delegate(string s) { return s.Trim() == ""; });

// The line just before that blank line is taken as the totals line
string totalsLine = reportLines[endIndex - 1];
string[] totals = totalsLine.Split(new string[] { " " },
    StringSplitOptions.RemoveEmptyEntries);

string totalDebits = totals[1].Trim();
string totalCredits = totals[2].Trim();


You could manage with even less if there is always a SUSPECT BLOCKS section
right after the BLOCKS USED section:

string searchPhrase = "**** SUSPECT DUPLICATE BLOCKS ****";
int startIndex = Array.FindIndex<string>(reportLines, 0,
    delegate(string s) { return s.Trim() == searchPhrase; });

string totalsLine = reportLines[startIndex - 2];


In the end it all depends on the reliability of the report file. Identify
markers that will always be there and use those to find the sections you need.
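
If the markers might occasionally be missing, a slightly more defensive version could look like the sketch below. It is only an illustration: `ReportSections` and `FindLastLineOfSection` are hypothetical names, and it assumes, like the snippets above, that a section runs from its heading line to the next blank line.

```csharp
using System;

public static class ReportSections
{
    // Returns the last line of the section that starts at the line equal
    // to `marker` (after trimming) and ends at the next blank line or the
    // end of the report. Returns null if the marker is not found or the
    // section has no lines after the heading.
    public static string FindLastLineOfSection(string[] lines, string marker)
    {
        // Array.FindIndex returns -1 when nothing matches.
        int start = Array.FindIndex(lines, s => s.Trim() == marker);
        if (start < 0)
        {
            return null;
        }

        // Look for the first blank line after the heading.
        int end = Array.FindIndex(lines, start + 1, s => s.Trim().Length == 0);
        if (end < 0)
        {
            end = lines.Length; // no blank line: section runs to the end
        }

        int last = end - 1;
        return last > start ? lines[last] : null;
    }
}
```

The caller can then check for null before splitting the totals line into columns, instead of hitting an ArgumentOutOfRangeException or IndexOutOfRangeException when a heading is absent.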

--
Happy Coding!
Morten Wennevik [C# MVP]

Closed Thread