Read Text File with Binary Header - C#

P: n/a
Hello, I have a text report from a mainframe that I need to parse.

The report has about a 2580 byte header that contains binary information
(garbage for the most part); although there are a couple areas that have
ASCII text that I need to extract. At the end of the 2580 bytes, I can read
the report like a standard text file. It should have CR/LF at the end of
each line.

What is the best way for me to read this report using C#? It seems I need
to skip past the header with something like Seek() and then read the rest
with something like ReadLine().

I have a sample file here. The extension is .BIN to cause your browser to
prompt for the file download.

http://members.verizon.net/dm3281/misc/TEST.BIN

Any assistance or sample code would be appreciated.
Jun 27 '08 #1
5 Replies


P: n/a
This is what I have so far, and it more or less works for ReportID and CustID.
But when I then call ReadLine() on a StreamReader, it re-reads the entire
file and prints the garbage at the beginning.

using System;
using System.IO;

namespace sample
{
    public class test
    {
        static void Main()
        {
            FileStream fs = new FileStream(@"C:\TEMP\TEST.BIN",FileMode.Open);
            // get report name
            Console.Write("ReportID: ");
            fs.Seek(877, SeekOrigin.Begin);
            for (int i = 0; i < 43 && i < fs.Length; i++)
            {
                Console.Write((char) fs.ReadByte());
            }

            // get customer ID
            Console.WriteLine();
            Console.Write("CustID: ");
            fs.Seek(1178, SeekOrigin.Begin);
            for (int i = 0; i < 3 && i < fs.Length; i++)
            {
                Console.Write((char) fs.ReadByte());
            }
            StreamReader sr = new StreamReader(fs);

            // jump to start of report
            Console.WriteLine();
            sr.BaseStream.Seek(1171,SeekOrigin.Begin);
            //s.Seek(1171, SeekOrigin.Begin);

            string str = sr.ReadLine();
            while (str != null)
            {
                Console.WriteLine(str);
                str = sr.ReadLine();
            }
            sr.Close();
            fs.Close();
        }

    }
}

Jun 27 '08 #2

P: n/a
On Tue, 17 Jun 2008 23:37:56 -0400, "dm3281" <dm****@nospam.net>
wrote:
> // jump to start of report
> Console.WriteLine();
> sr.BaseStream.Seek(1171,SeekOrigin.Begin);

Shouldn't this be 2581? Going back to 1171 gets garbage. You might also
want a call to sr.DiscardBufferedData() here: if the reader has already
read anything, its internal buffer will no longer match the new stream
position.

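A minimal sketch of the seek-then-read approach suggested above. The class and method names (ReportReader, ReadLinesAfterHeader, Rewind) are hypothetical, and the header length is a parameter because the thread only gives it as "about 2580 bytes"; check the real offset against the file. The stream is positioned before the StreamReader is created, so the reader starts at the report text. Rewind shows the DiscardBufferedData() call for the case where a reader has already been used.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class ReportReader
{
    // Reads the text lines that follow a binary header of headerLength bytes.
    // headerLength is supplied by the caller (an assumption, not a known value).
    public static List<string> ReadLinesAfterHeader(Stream stream, long headerLength)
    {
        var lines = new List<string>();

        // Position the stream before creating the reader, so the reader's
        // first buffer fill starts at the report text.
        stream.Seek(headerLength, SeekOrigin.Begin);

        // Disposing the reader also closes the underlying stream.
        using (var reader = new StreamReader(stream, Encoding.ASCII))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }

    // Repositions a reader that has already been read from. Without
    // DiscardBufferedData() the reader would keep serving stale buffered data.
    public static void Rewind(StreamReader reader, long offset)
    {
        reader.BaseStream.Seek(offset, SeekOrigin.Begin);
        reader.DiscardBufferedData();
    }
}
```

A call might look like ReportReader.ReadLinesAfterHeader(File.OpenRead(path), headerLength), where path and headerLength come from your own setup.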

Jun 27 '08 #3

P: n/a
Hi,

As Mach58 pointed out, your report position is wrong, and since StreamReader
reads the data as text, some binary data is probably making it return a null
line prematurely. If you change the position you should get the text.
Alternatively, you could treat everything as a byte array and extract the
necessary text using Encoding.ASCII.

Below is another way to do the same thing as your method. It uses a
StringBuilder to assemble the output, mainly because I was using a Windows
application and wanted everything in a single string object before displaying
it. It reads the whole file into a byte array and reads from that instead of
from a stream. Reading from a byte array isn't necessarily better or worse
than reading from a stream, but with a stream I would probably use fs.Read
and store the data in byte arrays instead of using a StreamReader.

// Requires: using System; using System.IO; using System.Text;
StringBuilder sb = new StringBuilder();

// Zero-based offsets into the file
int reportIdPosition = 877;
int custIdPosition = 1178;
int reportPosition = 2581;

// Load the whole file, then size one buffer per field
byte[] data = File.ReadAllBytes(@"C:\TEST.BIN");
byte[] reportId = new byte[43];
byte[] custId = new byte[3];
byte[] report = new byte[data.Length - reportPosition];

// get report name
Array.Copy(data, reportIdPosition, reportId, 0, reportId.Length);
sb.AppendLine("ReportID: " + Encoding.ASCII.GetString(reportId));

// get customer ID
Array.Copy(data, custIdPosition, custId, 0, custId.Length);
sb.AppendLine("CustID: " + Encoding.ASCII.GetString(custId));

// get report (everything after the header)
Array.Copy(data, reportPosition, report, 0, report.Length);
sb.AppendLine(Encoding.ASCII.GetString(report));

Console.WriteLine(sb.ToString());
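
For comparison, here is a rough sketch of the stream-based variant mentioned above, using Read on the stream instead of loading the whole file. HeaderFields and ReadAsciiField are hypothetical names chosen for illustration. The loop is there because Stream.Read may return fewer bytes than requested.

```csharp
using System;
using System.IO;
using System.Text;

public static class HeaderFields
{
    // Reads exactly `length` bytes starting at `offset` and decodes them as ASCII.
    // Throws EndOfStreamException if the stream ends before the field does.
    public static string ReadAsciiField(Stream stream, long offset, int length)
    {
        byte[] buffer = new byte[length];
        stream.Seek(offset, SeekOrigin.Begin);

        int total = 0;
        while (total < length)
        {
            // Read may deliver a partial chunk, so keep going until full.
            int read = stream.Read(buffer, total, length - total);
            if (read == 0)
            {
                throw new EndOfStreamException("Field extends past end of stream.");
            }
            total += read;
        }
        return Encoding.ASCII.GetString(buffer);
    }
}
```

With the offsets quoted in this thread, the calls would be HeaderFields.ReadAsciiField(fs, 877, 43) for the report ID and HeaderFields.ReadAsciiField(fs, 1178, 3) for the customer ID.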
--
Happy Coding!
Morten Wennevik [C# MVP]
Jun 27 '08 #4

P: n/a
Thanks everyone for the replies.

Morten, regarding your way or the approach I was taking...

How difficult would it be to then parse the report for various columns and
totals? Basically, I will need to scan the report for the BLOCKS USED
section and then pull out the amounts for the various block numbers.
Jun 27 '08 #5

P: n/a
Hi David,

If you are looking for the fewest lines of code, it could be done with

string reportString = Encoding.ASCII.GetString(report);
string[] reportLines = reportString.Split(
    new string[] { Environment.NewLine }, StringSplitOptions.None);

// Marker line that opens the section (compared after trimming)
string searchPhrase = "* * * * * * * * * * B L O C K S U S E D * * * * * * * * * *";
// FindIndex returns -1 when nothing matches, so check before using the index
int startIndex = Array.FindIndex<string>(reportLines, 0,
    delegate(string s) { return s.Trim() == searchPhrase; });
// The section ends at the first blank line after the marker
int endIndex = Array.FindIndex<string>(reportLines, startIndex,
    delegate(string s) { return s.Trim() == ""; });

// The totals are on the last line of the section
string totalsLine = reportLines[endIndex - 1];
string[] totals = totalsLine.Split(new string[] { " " },
    StringSplitOptions.RemoveEmptyEntries);

string totalDebits = totals[1].Trim();
string totalCredits = totals[2].Trim();

You could manage with even less if there is always a SUSPECT DUPLICATE BLOCKS
section right after the BLOCKS USED section

string searchPhrase = "**** SUSPECT DUPLICATE BLOCKS ****";
int startIndex = Array.FindIndex<string>(reportLines, 0,
    delegate(string s) { return s.Trim() == searchPhrase; });

// The totals line sits two lines above the marker
string totalsLine = reportLines[startIndex - 2];

In the end it all depends on the reliability of the report file. Identify
markers that will always be there and use those to find the sections you need.
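
If the markers cannot be fully trusted, a defensive version of the same idea might look like the sketch below. ReportSections and FindSectionTotals are hypothetical names. The layout it assumes (totals on the last non-blank line before the first blank line after the marker) is taken from the snippet above and has not been checked against the actual report.

```csharp
using System;

public static class ReportSections
{
    // Returns the whitespace-separated columns of the last line of the section
    // that starts at the first line equal to `marker` (after trimming).
    // Returns null if the marker is not found, and an empty array if the
    // marker is immediately followed by a blank line or the end of the text.
    public static string[] FindSectionTotals(string reportText, string marker)
    {
        string[] lines = reportText.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);

        int start = Array.FindIndex(lines, s => s.Trim() == marker);
        if (start < 0)
        {
            return null; // marker missing
        }

        // The section ends at the first blank line after the marker,
        // or at the end of the report if there is no blank line.
        int end = Array.FindIndex(lines, start + 1, s => s.Trim().Length == 0);
        if (end < 0)
        {
            end = lines.Length;
        }

        if (end - 1 <= start)
        {
            return new string[0]; // section has no data lines
        }

        return lines[end - 1].Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

The caller can then test for null before indexing into the result, instead of letting a missing marker surface as an out-of-range exception.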

--
Happy Coding!
Morten Wennevik [C# MVP]
Jun 27 '08 #6
