473,699 Members | 2,702 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Reading unicode text files

Wx
Hello.

I'm trying to read a textfile written by the NTBackup utility on
Windows 2003 SBS. The problem is that when i print the output, it
looks like this:

S t a t o : b a c k u p
O p e r a z i o n e : b a c k u p
D e s t i n a z i o n e b a c k u p a t t i v o : F i l e
N o m e s u p p o r t o : " l u m e v e . b k f c r e a t o i
l 2 1 / 0 5 / 2 0 0 7 a l l e 2 3 . 0 0 "

As you can see, there is a space prior to any charater. I know that
unicode characters uses two bytes, so... can be the problem related to
different charset?

If I try to read a new textfile, there are no problem.

This is the relevant portion of the code:
try {
ifstream infile(strLogFi le.c_str());

if (infile.is_open ()) {
string line;
while (getline(infile , line)) {
cout << line << endl;
}

infile.close();

} else {
cerr << "Impossibil e aprire " << strLogFile << endl;
return false;
}

Excuse me for my english.
Thanks

Wx

May 22 '07 #1
1 3199
* Wx:
>
I'm trying to read a textfile written by the NTBackup utility on
Windows 2003 SBS. The problem is that when i print the output, it
looks like this:

S t a t o : b a c k u p
O p e r a z i o n e : b a c k u p
D e s t i n a z i o n e b a c k u p a t t i v o : F i l e
N o m e s u p p o r t o : " l u m e v e . b k f c r e a t o i
l 2 1 / 0 5 / 2 0 0 7 a l l e 2 3 . 0 0 "

As you can see, there is a space prior to any charater. I know that
unicode characters uses two bytes, so... can be the problem related to
different charset?
Yes. The "spaces" are, at least before they end up in your program,
zero bytes.

If I try to read a new textfile, there are no problem.

This is the relevant portion of the code:
try {
ifstream infile(strLogFi le.c_str());
Well, it doesn't help you to use a wide character stream, because they
simply convert to/from external narrow character data.

What you can do is open the file in binary mode.

Then read the contents as binary data and treat as a sequence of wchar_t
values (e.g., you can just store them in a std::wstring).

Essentially this means implementing the machinery that the standard
library provides for narrow character streams. Or, you can buy an
existing implementation or find one on the net (I doubt you'll find
one). I think Dinkumware offers such an implementation.

Note that handling wchar_t in Windows leads you into compiler-specific
territory, since e.g. MingW g++ 3.4.4 doesn't support wide character
streams.
--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
May 22 '07 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

8
3650
by: Francis Girard | last post by:
Hi, For the first time in my programmer life, I have to take care of character encoding. I have a question about the BOM marks. If I understand well, into the UTF-8 unicode binary representation, some systems add at the beginning of the file a BOM mark (Windows?), some don't. (Linux?). Therefore, the exact same text encoded in the same UTF-8 will result in two different binary files, and of a slightly different length. Right ?
9
6761
by: jeff M via .NET 247 | last post by:
I'm still having problems reading EBCDIC files. Currently itlooks like the lower range (0 to 127) is working. I have triedthe following code pages 20284, 20924, 1140, 37, 500 and 20127.By working I get the correct answer by taking the decimal valueand using that as an index to an array that will map to thecorrect EBCDIC value in hex. By larger values, an example would be "AA" in EBCDIC hex wouldgive me the value of 63 in decimal (ASCII) when...
7
7754
by: Drew Berkemeyer | last post by:
Hello, I'm using the following code to read a text file in VB.NET. Dim sr As StreamReader = File.OpenText(strFilePath) Dim input As String = sr.ReadLine() While Not input Is Nothing strReturn += input + vbCrLf input = sr.Read
2
3248
by: nnimod | last post by:
Hi. I'm having trouble reading some unicode files. Basically, I have to parse certain files. Some of those files are being input in Japanese, Chinese etc. The easiest way, I figured, to distinguish between plain ASCII files I receive and the Unicode ones would be to check if the first two bytes read 0xFFFE. But nothing I do seems to be able to do that. I tried reading it in binary mode and reading two characters in:
7
5041
by: Dmitry Akselrod | last post by:
Hello everyone, I am attempting to extract some header information from typical Microsoft Outlook MSG files in VB.NET. I am not after a complete message or attachments that may be enclosed. I am particularly interested in the Message ID field. I have examined MSG files in notepad and hex editors. I can see that the Internet Headers are there and present. I can do a search for Message-ID and locate it without any problems in notepad....
9
2784
by: amyl | last post by:
I have an application that is reading data from text files to the first occurrence of a double crlf. I have this working using streamreader and using readline. However, I am hitting 500+ files at a time and this is a bit slow. I was thinking of testing this using FileStream or something similar, but I am not sure how to detect the occurrence of crlf crlf? Thoughts?
18
9062
by: John | last post by:
Hi, I'm a beginner is using C# and .net. I have big legacy files that stores various values (ints, bytes, strings) and want to read them into a C# programme so that I can store them in a database. The files are written by a late 1980's PC Pascal programme, for which I don't have the source code. I've managed to reverse engineer the file format. The strings are stored as Ascii in the file, with the first byte indicating the string...
18
620
by: Chameleon | last post by:
I am trying to #define this: #ifdef UNICODE_STRINGS #define UC16 L typedef wstring String; #else #define UC16 typedef string String; #endif ....
11
3586
by: Freddy Coal | last post by:
Hi, I'm trying to read a binary file of 2411 Bytes, I would like load all the file in a String. I make this function for make that: '-------------------------- Public Shared Function Read_bin(ByVal ruta As String) Dim cadena As String = "" Dim dato As Array If File.Exists(ruta) = True Then
24
3373
by: Donn Ingle | last post by:
Hello, I hope someone can illuminate this situation for me. Here's the nutshell: 1. On start I call locale.setlocale(locale.LC_ALL,''), the getlocale. 2. If this returns "C" or anything without 'utf8' in it, then things start to go downhill: 2a. The app assumes unicode objects internally. i.e. Whenever there is
0
8613
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
9172
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
1
8908
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
8880
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
6532
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
4374
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4626
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
2344
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2008
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.