First question: I have a flat file which unfortunately has columns
separated by nulls instead of spaces (a higher-up company created it this
way for us). Is there any way to do a ReadLine with this and not have it
affected by the nulls? Right now it causes truncated data at weird
places, but as soon as I manually change char(00) to char(20) in the file
with a hex editor, it reads perfectly. Which leads me to my second
question: if you can't do what I asked in the first question, is there a way
to go through the file, convert the nulls to spaces, save it back, and then
open it with a StreamReader to read it line by line? The lines are delimited
by CR+LF. Thanks!
Brian,
I would simply open it with a System.IO.FileStream, read a buffer full of
bytes, change the null bytes to AscW(" "c) then write the buffer to a second
FileStream.
Something like:
Imports System.IO

Dim input As New FileStream("input.txt", FileMode.Open)
Dim output As New FileStream("output.txt", FileMode.Create)

Dim buffer(1023) As Byte
Dim length As Integer
Do
    ' Read returns the number of bytes actually read; 0 means end of file
    length = input.Read(buffer, 0, buffer.Length)
    For index As Integer = 0 To length - 1
        If buffer(index) = 0 Then
            buffer(index) = AscW(" "c)   ' replace the null byte with a space
        End If
    Next
    output.Write(buffer, 0, length)
Loop Until length = 0
input.Close()
output.Close()
Note the above may not work correctly for encodings other than 8-bit ANSI
(in UTF-16, for example, zero bytes are part of ordinary characters);
however, it should be easy enough to adapt.
Hope this helps
Jay
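For a file that is not 8-bit ANSI, one way to adapt the idea is to replace null characters rather than null bytes, decoding and re-encoding with an explicit Encoding. This is only a minimal sketch, assuming the file is UTF-16 (Encoding.Unicode). The module name, the helper ReplaceNullChars, and the temporary sample files are hypothetical and not from the thread:

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic

Module ReplaceNullCharsSketch
    ' Hypothetical helper: copies inputPath to outputPath, turning every
    ' null character into a space. enc must match the file's real encoding.
    Sub ReplaceNullChars(ByVal inputPath As String, ByVal outputPath As String, ByVal enc As Encoding)
        Dim reader As New StreamReader(inputPath, enc)
        Dim writer As New StreamWriter(outputPath, False, enc)
        Dim buffer(1023) As Char
        Dim count As Integer = reader.Read(buffer, 0, buffer.Length)
        Do While count > 0
            For index As Integer = 0 To count - 1
                If buffer(index) = ChrW(0) Then
                    buffer(index) = " "c
                End If
            Next
            writer.Write(buffer, 0, count)
            count = reader.Read(buffer, 0, buffer.Length)
        Loop
        reader.Close()
        writer.Close()
    End Sub

    Sub Main()
        ' Made-up sample data written to temporary files (assumption).
        Dim source As String = Path.GetTempFileName()
        Dim target As String = Path.GetTempFileName()
        File.WriteAllText(source, "A" & ChrW(0) & "B", Encoding.Unicode)

        ReplaceNullChars(source, target, Encoding.Unicode)
        Console.WriteLine(File.ReadAllText(target, Encoding.Unicode))

        File.Delete(source)
        File.Delete(target)
    End Sub
End Module
```

Working on Char values means the replacement only touches real null characters, not the zero bytes that are part of ordinary UTF-16 text.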
"Brian Henry" <br**********@newsgroups.nospam> wrote in message
news:eD*************@TK2MSFTNGP15.phx.gbl... first question... I have a flat file which unfortinuatly has columns seperated by nulls instead of spaces (a higher up company created it this way for us) is there anyway to do a readline with this and not have it affected by the null? because it is right now causes truncated data at wierd places... but as soon as i manually with a hex editor change char(00) to char(20) in the files it reads prerfectly... which leads me to my 2nd question... if you cant do what i said in the 1st question, is there a way to go through the file, convert the nulls to spaces and save it back then open it as a stream reader to read it line by line? the lines are delimited by CR+LF's thanks!
forgot to say the test data is in the text file attached
"Brian Henry" <br**********@newsgroups.nospam> wrote in message
news:%2****************@TK2MSFTNGP15.phx.gbl...
Here is some test data where this works. It's meaningless data, but it shows the point. And here is the code I used to read it.
What it does that it shouldn't do is, instead of reading this whole line:
456456456456456456 100 156454541 02/20/1955FDFHGEL R OCONFGG MGK K40 09/04/2004 00 Y48/17/1982
it will read only this:
456456456456456456 100
then stop completely for that line, move to the next line, and do the same thing! The first line won't even read any data (it has no initial text, only null values until the 000). As soon as I replace those ASCII Char(00) nulls with spaces (ASCII Char(20)), it works perfectly! Any ways to get around this? Thanks
If IO.File.Exists(Me.txtFileName.Text.Trim) Then
    Me.lblStatus.Text = "File exists... Trying to open..."
    ' we have a file, try to open it now
    Dim fs As New IO.FileStream(Me.txtFileName.Text, IO.FileMode.Open, IO.FileAccess.Read)
    Dim sr As New IO.StreamReader(fs, True)
    Me.lblStatus.Text = "File opened..."
    Do Until sr.Peek = -1
        Debug.WriteLine(sr.ReadLine)
    Loop
    sr.Close()
End If
Hrm, how about read in the file to a string and do a split with the null
char being the delimiter? I haven't tried it and don't know if it will
work...but what the heck, worth a try eh?
Mythran
Mythran,
Brian (the original poster) stated:
> because it is right now causes truncated data at weird places...
Which leads me to believe something strange is going on with the encoding
when you attempt to read the file into a string (an encoding is required).
There may not be, or it may be something simple, but why risk it when I know
the FileStream won't cause problems, especially on the sample file he
provided...
Reading it as bytes with FileStream will not involve any Encoding objects. I
am reading pure bytes, changing pure bytes, writing pure bytes. Hence I
won't be taking time to track down any potential encoding problems implied
by Brian's statement.
Hope this helps
Jay
Yes, your way would work...maybe even better...
What I was stating is that he reads it all into a string instead of line by
line (regardless of how he reads it in....such as FileStream or whatever).
Then he just splits the results based on the Null...but my way may not work
anyways because the Split method may not accept DBNull.Value....but it may
take Chr(x). Hrm, probably would just do it your way if it was me :P
Mythran
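On the doubt about what Split accepts: DBNull.Value is a placeholder object for database nulls, not a character, while ChrW(0) is a Char like any other and can be passed to String.Split. A minimal sketch, assuming a made-up record whose fields really are separated by null characters (the module name and the field values are hypothetical, not taken from Brian's file):

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module SplitOnNullSketch
    Sub Main()
        ' Hypothetical record: three fields separated by the null character.
        Dim record As String = "456456" & ChrW(0) & "100" & ChrW(0) & "156454541"

        ' String.Split takes Char separators, so ChrW(0) is an ordinary delimiter.
        Dim fields() As String = record.Split(ChrW(0))

        ' Each field is written on its own line.
        For Each field As String In fields
            Console.WriteLine(field)
        Next
    End Sub
End Module
```

Note that consecutive null characters would produce empty entries in the result, so this only suits data where a single null marks each boundary.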
"Jay B. Harlow [MVP - Outlook]" <Ja************@msn.com> wrote in message
news:%2****************@TK2MSFTNGP14.phx.gbl... Mythran, Brian (the original poster) stated: because it is right now causes truncated data at wierd places...
Which leads me to believe something strange is going on with the encoding when you attempt to read the file into a string (an encoding is required). There may not be, or it may be something simple, why risk it, when I know the FileStream won't cause problems, especially on the sample file he provided...
Reading it as bytes with FileStream will not involve any Encoding objects. I am reading pure bytes, changing pure bytes, writing pure bytes. Hence I won't be taking time to track down any potential encoding problems implied with the Brian's statement.
Hope this helps Jay
"Mythran" <ki********@hotmail.comREMOVETRAIL> wrote in message news:et**************@TK2MSFTNGP09.phx.gbl... Hrm, how about read in the file to a string and do a split with the null char being the delimiter? I haven't tried it and don't know if it will work...but what the heck, worth a try eh?
Mythran
"Jay B. Harlow [MVP - Outlook]" <Ja************@msn.com> wrote in message news:O$**************@TK2MSFTNGP11.phx.gbl... Brian, I would simply open it with a System.IO.FileStream, read a buffer full of bytes, change the null bytes to AscW(" "c) then write the buffer to a second FileStream.
Something like:
Imports System.IO
Dim input As New FileStream("input.txt", FileMode.Open) Dim output As New FileStream("output.txt", FileMode.Create)
Dim buffer(1023) As Byte Dim length As Integer Do length = input.Read(buffer, 0, buffer.Length) For index As Integer = 0 To length - 1 If buffer(index) = 0 Then buffer(index) = AscW(" "c) End If Next output.Write(buffer, 0, length) Loop Until length < buffer.Length input.Close() output.Close()
Note the above may not work correctly for non Ansi 8-bit encodings, however it should be easy enough to adapt.
Hope this helps Jay
"Brian Henry" <br**********@newsgroups.nospam> wrote in message news:eD*************@TK2MSFTNGP15.phx.gbl... first question... I have a flat file which unfortinuatly has columns seperated by nulls instead of spaces (a higher up company created it this way for us) is there anyway to do a readline with this and not have it affected by the null? because it is right now causes truncated data at wierd places... but as soon as i manually with a hex editor change char(00) to char(20) in the files it reads prerfectly... which leads me to my 2nd question... if you cant do what i said in the 1st question, is there a way to go through the file, convert the nulls to spaces and save it back then open it as a stream reader to read it line by line? the lines are delimited by CR+LF's thanks!
Mythran,
Brian is referring to a Null Char (ChrW(0)).
If you look at the file he attached to one of his messages, the file is
fixed-length records. The first couple of records have 20 or 30 zero bytes
at the start of the line.
The ChrW(0) are not being used for record or field delimiters per se.
I would use the Split method if the file was using ChrW(0) for record or
field delimiters.
> What I was stating is that he reads it all into a string instead of line
> by line (regardless of how he reads it in....such as FileStream or whatever).
FileStream cannot read a file into a string, it only reads Bytes. To read a
file into a String, a System.Text.Encoding object is needed, as
System.Text.Encoding is used to convert the bytes in the file to Unicode,
which is what Strings are. StreamReader uses an Encoding object to convert
the bytes to a String; if you use a FileStream you would need to create your
own Encoding object and use it to convert the bytes into a String. As I
stated, Brian's original statement suggested that the Encoding object may
have been having problems with the 20 or 30 zero bytes...
Hope this helps
Jay
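The distinction between reading bytes and producing a String can be sketched as follows. This is a minimal example, assuming 8-bit text that can be decoded with Encoding.ASCII. The module name and the temporary sample file are hypothetical, and the real file's encoding may differ:

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module DecodeBytesSketch
    Sub Main()
        ' Made-up sample: "AB", two zero bytes, "CD", then CR+LF (assumption).
        Dim fileName As String = Path.GetTempFileName()
        File.WriteAllBytes(fileName, New Byte() {65, 66, 0, 0, 67, 68, 13, 10})

        ' FileStream only hands back bytes.
        Dim fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
        Dim buffer(CInt(fs.Length) - 1) As Byte
        Dim total As Integer = 0
        Do While total < buffer.Length
            Dim count As Integer = fs.Read(buffer, total, buffer.Length - total)
            If count = 0 Then Exit Do
            total += count
        Loop
        fs.Close()

        ' An Encoding object turns those bytes into a String.
        Dim decoded As String = Encoding.ASCII.GetString(buffer, 0, total)

        ' Prints 8: the two null characters are still part of the string.
        Console.WriteLine(decoded.Length)

        File.Delete(fileName)
    End Sub
End Module
```

StreamReader does the same decoding step internally, which is why it needs an encoding (given explicitly or detected) while a bare FileStream does not.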
"Mythran" <ki********@hotmail.comREMOVETRAIL> wrote in message
news:us**************@TK2MSFTNGP10.phx.gbl... Yes, your way would work...maybe even better...
What I was stating is that he reads it all into a string instead of line by line (regardless of how he reads it in....such as FileStream or whatever). Then he just splits the results based on the Null...but my way may not work anyways because the Split method may not accept DBNull.Value....but it may take Chr(x). Hrm, probably would just do it your way if it was me :P
Mythran
"Jay B. Harlow [MVP - Outlook]" <Ja************@msn.com> wrote in message news:%2****************@TK2MSFTNGP14.phx.gbl... Mythran, Brian (the original poster) stated:> because it is right now causes truncated data at wierd places...
Which leads me to believe something strange is going on with the encoding when you attempt to read the file into a string (an encoding is required). There may not be, or it may be something simple, why risk it, when I know the FileStream won't cause problems, especially on the sample file he provided...
Reading it as bytes with FileStream will not involve any Encoding objects. I am reading pure bytes, changing pure bytes, writing pure bytes. Hence I won't be taking time to track down any potential encoding problems implied with the Brian's statement.
Hope this helps Jay
"Mythran" <ki********@hotmail.comREMOVETRAIL> wrote in message news:et**************@TK2MSFTNGP09.phx.gbl... Hrm, how about read in the file to a string and do a split with the null char being the delimiter? I haven't tried it and don't know if it will work...but what the heck, worth a try eh?
Mythran
"Jay B. Harlow [MVP - Outlook]" <Ja************@msn.com> wrote in message news:O$**************@TK2MSFTNGP11.phx.gbl... Brian, I would simply open it with a System.IO.FileStream, read a buffer full of bytes, change the null bytes to AscW(" "c) then write the buffer to a second FileStream.
Something like:
Imports System.IO
Dim input As New FileStream("input.txt", FileMode.Open) Dim output As New FileStream("output.txt", FileMode.Create)
Dim buffer(1023) As Byte Dim length As Integer Do length = input.Read(buffer, 0, buffer.Length) For index As Integer = 0 To length - 1 If buffer(index) = 0 Then buffer(index) = AscW(" "c) End If Next output.Write(buffer, 0, length) Loop Until length < buffer.Length input.Close() output.Close()
Note the above may not work correctly for non Ansi 8-bit encodings, however it should be easy enough to adapt.
Hope this helps Jay
"Brian Henry" <br**********@newsgroups.nospam> wrote in message news:eD*************@TK2MSFTNGP15.phx.gbl... > first question... I have a flat file which unfortinuatly has columns > seperated by nulls instead of spaces (a higher up company created it > this way for us) is there anyway to do a readline with this and not > have it affected by the null? because it is right now causes truncated > data at wierd places... but as soon as i manually with a hex editor > change char(00) to char(20) in the files it reads prerfectly... which > leads me to my 2nd question... if you cant do what i said in the 1st > question, is there a way to go through the file, convert the nulls to > spaces and save it back then open it as a stream reader to read it > line by line? the lines are delimited by CR+LF's thanks! >
On Thu, 21 Oct 2004 14:48:52 -0400, Brian Henry wrote:
> What it does that it shouldnt do is instead of reading this whole line..
> then stop completely for that line and move to the next line and do the same
I don't think you have an error. I think the problem is with
Debug.WriteLine. It does not properly deal with zero bytes in the string.
If you add a watch and inspect the length of the string, does it show the
correct length for the string?
I used the code below on your file and even though Debug.WriteLine could
not display the string correctly, the length of the string was correct for
the line read in and the output file was identical to the input file. I
think you are experiencing a bug in Debug.WriteLine.
Imports System.IO

Dim fs As New FileStream("test.txt", FileMode.Open, FileAccess.Read)
Dim sr As New StreamReader(fs, True)
Dim sw As New StreamWriter("testout.txt")
Dim sBuf As String
Do Until sr.Peek() = -1
    sBuf = sr.ReadLine
    Debug.WriteLine(sBuf)                  ' display may be cut off at the first ChrW(0)
    Debug.WriteLine(vbCrLf)
    Debug.WriteLine(sBuf.Length.ToString)  ' length still counts the null characters
    sw.WriteLine(sBuf)
Loop
sr.Close()
fs.Close()
sw.Close()
--
Chris
dunawayc[AT]sbcglobal_lunchmeat_[DOT]net
To send me an E-mail, remove the "[", "]", underscores ,lunchmeat, and
replace certain words in my E-Mail address.
Ahh, didn't look at the file...but according to his first post, the columns
are separated by Null. Which I read as "ROWS" separated by NULL...so
scratch everything I've said in this thread...
Sometimes you just have to NOT listen to me at all :P
Mythran
Chris,
Ah! Yes, Debug.WriteLine doesn't like ChrW(0)...
Actually it's the Marshaling class that treats the ChrW(0) in the string as a
string terminator.
Good point...
Jay
Mythran,
> Sometimes you just have to NOT listen to me at all :P
I think that can be said about everybody, me included :-))
Especially now that Chris just reminded me that VS.NET itself has trouble
showing strings with ChrW(0) in them, which suggests that Brian's Encoding
"problem" is more related to the debugger than to the Encoding
object I was suspecting...
Jay
ah didn't even think of that...
Hi
If you still have any concern on this issue, please feel free to post here.
Best regards,
Peter Huang
Microsoft Online Partner Support
Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.