I need help. Please bear with this.
I have a program. It takes in files that are delimited.
The delimiters are declared in the file by looking at
fixed positions in the file (If you work with ANSI x12
files, you know what I mean). This normally isn't a
problem, but I'm getting a file that is using some odd
characters as delimiters.
Specifically, a Hex 'BA' is declared as a delimiter. I
read the file into memory using this...
=================================================
fs = New FileStream(InFile, System.IO.FileMode.Open,
System.IO.FileAccess.Read)
sr = New StreamReader(fs, System.Text.Encoding.UTF7)
InputString = sr.ReadToEnd
=================================================
At this point I close the file. In the string, I remove
any carriage return and line feed characters.
Then I write the string to a new file with this.
=================================================
fs2 = New FileStream(OutFile, System.IO.FileMode.Create,
System.IO.FileAccess.Write, IO.FileShare.Write)
sw = New StreamWriter(fs2, System.Text.Encoding.ASCII)
sw.Write(sANSIString)
=================================================
NOTE: Initially, I don't think my stream writer specified an
encoding.
Anyway here is the problem....
The resulting file ends up with a different value in the
places where the hex 'BA' used to be. I've played with
various combinations of encoding, for both reading and
writing, and I'm not able to
maintain the character. I need to maintain this!
In one case, the single-byte hex 'BA' is actually
replaced with two bytes, but everything else in the file
is as it should be. In another case, the character is
a "?". I don't remember what happens in other
situations, but in no case is the hex 'BA' maintained.
I don't really understand encoding, so that is only
compounding my frustration and confusion.
Any help is greatly appreciated. I could supply more
details, if necessary.
QM.
Not sure here, but here we go anyway....
Stop removing the CRs and LFs from the stream; the A in BA is actually an LF.
When you replace this, you're losing your ability to parse at that position.
cr = Carriage Return
lf = Line Feed
Bryan Martin Sp**@ahwayside.com
Oh and BTW it seems your BA is represented as....
Hex B = Dec 11 which corresponds to vertical tab mostly added by word
processors.
Hex A = Dec 10 which corresponds to line feed.
Bryan Martin Sp**@ahwayside.com
I wondered about doing that, and I did run the read/write
without removing the crlf.... same problems.
Hex 'BA' is Dec 186. Looks like the little Degree symbol.
Try setting the code page for the encoder, for example:
Encoding enc = Encoding.GetEncoding(1252);
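In VB.NET, the language of the original post, that suggestion might look like the sketch below. It assumes the file is single-byte text and that code page 1252 is available on the runtime (on .NET Core and later it has to be registered first through `CodePagesEncodingProvider`). The helper names `ReadAllWithCodePage` and `WriteAllWithCodePage` and the sample bytes are made up for illustration; they are not from the original program. The key point is that the same encoding is used for both reading and writing.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports Microsoft.VisualBasic

Module CodePageRoundTrip
    ' Hypothetical helper: reads the whole file as text using the given encoding.
    Function ReadAllWithCodePage(fileName As String, enc As Encoding) As String
        Using sr As New StreamReader(fileName, enc)
            Return sr.ReadToEnd()
        End Using
    End Function

    ' Hypothetical helper: overwrites the file with the text, using the given encoding.
    Sub WriteAllWithCodePage(fileName As String, text As String, enc As Encoding)
        Using sw As New StreamWriter(fileName, False, enc)
            sw.Write(text)
        End Using
    End Sub

    Sub Main()
        ' Assumption: code page 1252 is available (true on .NET Framework).
        Dim enc As Encoding = Encoding.GetEncoding(1252)
        Dim inFile As String = Path.GetTempFileName()
        Dim outFile As String = Path.GetTempFileName()

        ' Made-up sample data: "ISA", the 0xBA delimiter, "00", then CR LF.
        File.WriteAllBytes(inFile, New Byte() {&H49, &H53, &H41, &HBA, &H30, &H30, &HD, &HA})

        ' Read, strip CR and LF as in the original post, and write back.
        Dim text As String = ReadAllWithCodePage(inFile, enc)
        text = text.Replace(vbCr, "").Replace(vbLf, "")
        WriteAllWithCodePage(outFile, text, enc)

        ' Dump the output bytes in hex so the delimiter can be checked.
        For Each b As Byte In File.ReadAllBytes(outFile)
            Console.Write(b.ToString("X2") & " ")
        Next
        Console.WriteLine()

        File.Delete(inFile)
        File.Delete(outFile)
    End Sub
End Module
```

If the assumptions hold, the BA byte should appear unchanged in the hex dump, with only the CR and LF bytes gone.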
quincy <qm*******@yahoo.com> wrote:
> I need help. Please bear with this.
> I have a program. It takes in files that are delimited. The delimiters are declared in the file by looking at fixed positions in the file (If you work with ANSI x12 files, you know what I mean). This normally isn't a problem, but I'm getting a file that is using some odd characters as delimiters.
> Specifically, a Hex 'BA' is declared as a delimiter. I read the file into memory using this...
> =================================================
> fs = New FileStream(InFile, System.IO.FileMode.Open, System.IO.FileAccess.Read)
> sr = New StreamReader(fs, System.Text.Encoding.UTF7)
> InputString = sr.ReadToEnd
> =================================================
As I said before, if you use UTF7 any bytes which were 0xba won't be
decoded properly, because UTF7 doesn't have any character which is
encoded to hex 0xba.
Please read the full response I posted when you asked the same question
(without the encoding on the writing side) 5 days ago.
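The symptoms reported on the writing side (a "?" in one case, two bytes in another) can be reproduced with a short VB.NET sketch. It is only an illustration: ISO-8859-1 is used here as a stand-in single-byte encoding because it maps every byte value 0x00 to 0xFF to the character with the same number. That choice, like the names `EncodingSymptoms` and `ToHex`, is an assumption and not something from the original program. Note that hex BA is one byte with the value 11 * 16 + 10 = 186, not two separate characters.

```vbnet
Imports System
Imports System.Text

Module EncodingSymptoms
    ' Formats a byte array as space-separated hex, e.g. "C2 BA".
    Function ToHex(bytes As Byte()) As String
        Return BitConverter.ToString(bytes).Replace("-", " ")
    End Function

    Sub Main()
        ' Assumption: ISO-8859-1 stands in for the file's real single-byte encoding.
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        Dim original As Byte() = New Byte() {&HBA}

        ' Decoding turns the byte 0xBA into the single character U+00BA.
        Dim text As String = latin1.GetString(original)
        Console.WriteLine(Convert.ToInt32(text(0)))              ' 186

        ' ASCII has no character above 127, so the encoder substitutes "?".
        Console.WriteLine(ToHex(Encoding.ASCII.GetBytes(text)))  ' 3F

        ' UTF-8 needs two bytes for U+00BA.
        Console.WriteLine(ToHex(Encoding.UTF8.GetBytes(text)))   ' C2 BA

        ' Writing with the same single-byte encoding restores the original byte.
        Console.WriteLine(ToHex(latin1.GetBytes(text)))          ' BA
    End Sub
End Module
```

The sketch leaves UTF7 out on purpose; it only shows that the byte survives when the same single-byte encoding is used for both the reader and the writer.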
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet
If replying to the group, please do not mail me too