Why Am I Getting an Inverted Question Mark?

  #1  
Old July 23rd, 2005, 03:00 AM
mary
Guest
 
Posts: n/a
When I read an HTML file starting with

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">

and then I write it into another file, say OUTPUT.txt, I get an
inverted question mark, "¿",
at the beginning of the OUTPUT.txt file. Why is that?
Thanks!

mary

PS. I use:

string line;
while (getline(in,line)) {
    out.write(line.c_str(),line.size());
    out.put('\n');
}
  #2  
Old July 23rd, 2005, 03:00 AM
Phil Staite
Guest
 
Posts: n/a

re: Why Am I Getting an Inverted Question Mark?


Seems odd. Maybe, just maybe there is an empty or blank line at the
beginning of your source file? In that case during the first iteration
of the while loop line would be empty. Now, it *should* be ok to call
write with a char count of 0 and have it do nothing... But maybe there
is a problem with your stream code? Try adding a simple test:

while (getline(in,line)) {
    if( ! line.empty() )
    {
        out.write(line.c_str(),line.size());
        out.put('\n');
    }
}
  #3  
Old July 23rd, 2005, 03:00 AM
mary
Guest
 
Posts: n/a

re: Why Am I Getting an Inverted Question Mark?


Phil,

Here is the code. It still happens with any input file, no matter what
the file starts with!
Thanks!

Mary

@@@@@@@@@@@@@@@@@@@@@@@

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

string line;
int main()
{
    // Open the input file; give up if it cannot be read.
    ifstream in("INPUT.txt",ios::in);
    if (!in) {
        cout << "Cannot Open the INPUT file.\n";
        return 1;
    }
    // Open (and truncate) the output file.
    ofstream out("OUTPUT.txt",ios::out);
    if (!out) {
        cout << "Cannot Open the OUTPUT file.\n";
        in.close();
        return 1;
    }
    // Copy every non-empty line, byte for byte, followed by a newline.
    while (getline(in,line)) {
        if( ! line.empty() ) {
            out.write(line.c_str(),line.size());
            out.put('\n');
        }
    }
    in.close();
    out.close();
    return 0;
}

@@@@@@@@@@@@@@@@@@@@@@@@
On Sun, 13 Mar 2005 21:26:05 -0700, Phil Staite <phil@nospam.com>
wrote:
>Seems odd. Maybe, just maybe there is an empty or blank line at the
>beginning of your source file? In that case during the first iteration
>of the while loop line would be empty. Now, it *should* be ok to call
>write with a char count of 0 and have it do nothing... But maybe there
>is a problem with your stream code? Try adding a simple test:
>
>while (getline(in,line)) {
> if( ! line.empty() )
> {
> out.write(line.c_str(),line.size());
> out.put('\n');
> }
>}

  #4  
Old July 23rd, 2005, 03:00 AM
Phlip
Guest
 
Posts: n/a

re: Why Am I Getting an Inverted Question Mark?


mary wrote:
> When I read an HTML file starting with
>
> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
>
> and then I write it into another file, say OUTPUT.txt, I get an
> inverted question mark, "¿",
> at the beginning of the OUTPUT.txt file. Why is that?

Are you saving the file with Notepad.exe?

That program prefixes files that it perceives as Unicode (even UTF-8) with a
Byte Order Mark. If you use an editor to open your file in hex (or "binary")
mode, you might see the BOM, FEFF or FFEF, at the beginning.

Your output system does not interpret the codes as UTF-8, so it probably
uses ISO Latin-1. That has no glyph for FF or EF, so you get a "missing
glyph" symbol as ¿.

This could all be wrong, but the details are off-topic, so nobody is allowed
to contradict me.

--
Phlip
http://industrialxp.org/community/bi...UserInterfaces
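
For readers without a hex editor, the check suggested above can be approximated in a few lines of C++. This is only a sketch: `leading_bytes_hex` is a hypothetical helper name, not something from the thread or the standard library. It returns the first few bytes of a stream as hex so they can be compared against the known BOM byte sequences. As a side note, in ISO Latin-1 the byte BF is the character "¿" (and EF and BB are "ï" and "»"), which would be consistent with the symptom described.

```cpp
#include <cstddef>
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the thread): returns up to `count`
// leading bytes of `in` as space-separated uppercase hex, e.g. "EF BB BF".
std::string leading_bytes_hex(std::istream& in, std::size_t count)
{
    std::ostringstream hex;
    hex << std::hex << std::uppercase << std::setfill('0');
    char c;
    for (std::size_t i = 0; i < count && in.get(c); ++i) {
        if (i != 0) {
            hex << ' ';
        }
        // Convert via unsigned char so bytes >= 0x80 are not sign-extended.
        hex << std::setw(2)
            << static_cast<unsigned>(static_cast<unsigned char>(c));
    }
    return hex.str();
}
```

To try it on a real file, open the file with `std::ifstream in("INPUT.txt", std::ios::binary);` and print `leading_bytes_hex(in, 3)`. Binary mode is used here so the bytes are read without any newline translation.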


  #5  
Old July 23rd, 2005, 03:00 AM
Kurt Stutsman
Guest
 
Posts: n/a

re: Why Am I Getting an Inverted Question Mark?


mary wrote:
> out.write(line.c_str(),line.size());
> out.put('\n');

I don't see anything wrong with your code, but the above lines could be
simplified to:
out << line << '\n';
  #6  
Old July 23rd, 2005, 03:00 AM
Sven Axelsson
Guest
 
Posts: n/a

re: Why Am I Getting an Inverted Question Mark?


On Mon, 14 Mar 2005 06:16:12 GMT, Phlip wrote:
> mary wrote:
>
>> When I read an HTML file starting with
>>
>> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
>>
>> and then I write it into another file, say OUTPUT.txt, I get an
>> inverted question mark, "¿",
>> at the beginning of the OUTPUT.txt file. Why is that?
>
> Are you saving the file with Notepad.exe?
>
> That program prefixes files that it perceives as Unicode (even UTF-8) with a
> Byte Order Mark. If you use an editor to open your file in hex (or "binary")
> mode, you might see the BOM, FEFF or FFEF, at the beginning.
>
> Your output system does not interpret the codes as UTF-8, so it probably
> uses ISO Latin-1. That has no glyph for FF or EF, so you get a "missing
> glyph" symbol as ¿.
>
> This could all be wrong, but the details are off-topic, so nobody is allowed
> to contradict me.

Well, your reasoning is correct, but not your facts. A Unicode file may
start with FEFF or FFFE (not FFEF) to indicate endianness. A UTF-8 file,
however, starts with EFBBBF if it has a BOM at all. But, no doubt, the
BOM is what the OP is seeing.

--
Sven Axelsson, Sweden